Suffix Array - Space Efficiency

Space Efficiency

Suffix arrays were introduced by Manber & Myers (1990) in order to improve over the space requirements of suffix trees: Suffix arrays store integers. Assuming an integer requires bytes, a suffix array requires bytes in total. This is significantly less than the bytes which are required by a careful suffix tree implementation.

However, in certain applications, the space requirements of suffix arrays may still be prohibitive. Analyzed in bits, a suffix array requires space, whereas the original text over an alphabet of size does only require bits. For a human genome with and the suffix array would therefore occupy about 16 times more memory than the genome itself.

Such discrepancies motivated a trend towards compressed suffix arrays and BWT-based compressed full-text indices such as the FM-index. These data structures require only space within the size of the text or even less.

Read more about this topic:  Suffix Array

Famous quotes containing the words space and/or efficiency:

    If we remembered everything, we should on most occasions be as ill off as if we remembered nothing. It would take us as long to recall a space of time as it took the original time to elapse, and we should never get ahead with our thinking. All recollected times undergo, accordingly, what M. Ribot calls foreshortening; and this foreshortening is due to the omission of an enormous number of facts which filled them.
    William James (1842–1910)

    I’ll take fifty percent efficiency to get one hundred percent loyalty.
    Samuel Goldwyn (1882–1974)