Suffix Array - Space Efficiency

Space Efficiency

Suffix arrays were introduced by Manber & Myers (1990) in order to improve over the space requirements of suffix trees: Suffix arrays store integers. Assuming an integer requires bytes, a suffix array requires bytes in total. This is significantly less than the bytes which are required by a careful suffix tree implementation.

However, in certain applications, the space requirements of suffix arrays may still be prohibitive. Analyzed in bits, a suffix array requires space, whereas the original text over an alphabet of size does only require bits. For a human genome with and the suffix array would therefore occupy about 16 times more memory than the genome itself.

Such discrepancies motivated a trend towards compressed suffix arrays and BWT-based compressed full-text indices such as the FM-index. These data structures require only space within the size of the text or even less.

Read more about this topic:  Suffix Array

Famous quotes containing the words space and/or efficiency:

    ... the space left to freedom is very small. ... ends are inherent in human nature and the same for all.
    Hannah Arendt (1906–1975)

    Nothing comes to pass in nature, which can be set down to a flaw therein; for nature is always the same and everywhere one and the same in her efficiency and power of action; that is, nature’s laws and ordinances whereby all things come to pass and change from one form to another, are everywhere and always; so that there should be one and the same method of understanding the nature of all things whatsoever, namely, through nature’s universal laws and rules.
    Baruch (Benedict)