Suffix Array - Space Efficiency

Space Efficiency

Suffix arrays were introduced by Manber & Myers (1990) in order to improve over the space requirements of suffix trees: Suffix arrays store integers. Assuming an integer requires bytes, a suffix array requires bytes in total. This is significantly less than the bytes which are required by a careful suffix tree implementation.

However, in certain applications, the space requirements of suffix arrays may still be prohibitive. Analyzed in bits, a suffix array requires space, whereas the original text over an alphabet of size does only require bits. For a human genome with and the suffix array would therefore occupy about 16 times more memory than the genome itself.

Such discrepancies motivated a trend towards compressed suffix arrays and BWT-based compressed full-text indices such as the FM-index. These data structures require only space within the size of the text or even less.

Read more about this topic:  Suffix Array

Famous quotes containing the words space and/or efficiency:

    No being exists or can exist which is not related to space in some way. God is everywhere, created minds are somewhere, and body is in the space that it occupies; and whatever is neither everywhere nor anywhere does not exist. And hence it follows that space is an effect arising from the first existence of being, because when any being is postulated, space is postulated.
    Isaac Newton (1642–1727)

    I’ll take fifty percent efficiency to get one hundred percent loyalty.
    Samuel Goldwyn (1882–1974)