Suffix Array - Space Efficiency

Space Efficiency

Suffix arrays were introduced by Manber & Myers (1990) to reduce the space requirements of suffix trees: a suffix array for a text of length n stores n integers. Assuming an integer requires 4 bytes, a suffix array requires 4n bytes in total. This is significantly less than the 20n bytes required by a careful suffix tree implementation.
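The "n integers" claim can be checked with a minimal sketch. It uses a naive sort-based construction, not the Manber & Myers algorithm, and it assumes 4-byte integers and a "$" sentinel that sorts before every other character. The names build_suffix_array and BYTES_PER_INT are illustrative only.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order.

    Naive construction for illustration only: sorting by the suffix strings
    themselves is far slower than dedicated suffix array algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


text = "banana$"  # "$" is an assumed end-of-text sentinel
suffix_array = build_suffix_array(text)

BYTES_PER_INT = 4  # assumption: each entry is stored as a 32-bit integer
size_bytes = BYTES_PER_INT * len(suffix_array)

print(suffix_array)  # one integer per text position
print(size_bytes)    # 4n bytes for n = len(text)
```

For the 7-character text above, the array holds 7 integers and would occupy 28 bytes under the 4-byte assumption.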

However, in certain applications, the space requirements of suffix arrays may still be prohibitive. Analyzed in bits, a suffix array requires O(n log n) bits, whereas the original text over an alphabet of size σ requires only O(n log σ) bits. For a human genome with σ = 4 and n = 3.4 × 10^9, the suffix array would therefore occupy about 16 times more memory than the genome itself.
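The factor of 16 can be reproduced with a short calculation. This is a sketch under stated assumptions: a four-letter alphabet (σ = 4), a text length of roughly 3.4 × 10^9 symbols, and fixed-width entries of ceil(log2 n) bits in the suffix array. The helper bits_per_entry is a hypothetical name introduced for the example.

```python
import math


def bits_per_entry(num_values):
    """Bits needed to tell num_values distinct values apart (fixed width)."""
    return max(1, math.ceil(math.log2(num_values)))


n = 3_400_000_000  # assumed text length (approximate human genome size)
sigma = 4          # assumed alphabet size: A, C, G, T

sa_bits = n * bits_per_entry(n)        # n entries of ceil(log2 n) bits each
text_bits = n * bits_per_entry(sigma)  # n symbols of ceil(log2 sigma) bits each

ratio = sa_bits / text_bits
print(bits_per_entry(n), bits_per_entry(sigma), ratio)
print(sa_bits / 8 / 1e9, "GB for the suffix array")
print(text_bits / 8 / 1e9, "GB for the text")
```

Under these assumptions each suffix array entry needs 32 bits while each text symbol needs 2 bits, which gives the ratio of 16.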

Such discrepancies motivated a trend towards compressed suffix arrays and BWT-based compressed full-text indices such as the FM-index. These data structures need space close to the size of the text itself, or even less.
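To illustrate why BWT-based indices can stay close to the size of the text, the sketch below derives the Burrows-Wheeler transform from a suffix array: the result has exactly one symbol per text position, so it needs no more space than the text. This is only the transform, not an FM-index, and it assumes a "$" sentinel that sorts first and a naive construction. The function names are illustrative.

```python
def build_suffix_array(text):
    """Naive suffix array: suffix start positions in sorted suffix order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_suffix_array(text, suffix_array):
    """Take the character preceding each sorted suffix.

    For the suffix starting at position 0 the index -1 wraps around to the
    last character, which is the sentinel.
    """
    return "".join(text[i - 1] for i in suffix_array)


text = "banana$"  # "$" is an assumed end-of-text sentinel
bwt = bwt_from_suffix_array(text, build_suffix_array(text))
print(bwt)
```

The transformed string is a permutation of the input, so it can be stored with the same number of bits per symbol as the text.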
