Suffix Array - Space Efficiency

Space Efficiency

Suffix arrays were introduced by Manber & Myers (1990) in order to improve over the space requirements of suffix trees: Suffix arrays store integers. Assuming an integer requires bytes, a suffix array requires bytes in total. This is significantly less than the bytes which are required by a careful suffix tree implementation.

However, in certain applications, the space requirements of suffix arrays may still be prohibitive. Analyzed in bits, a suffix array requires space, whereas the original text over an alphabet of size does only require bits. For a human genome with and the suffix array would therefore occupy about 16 times more memory than the genome itself.

Such discrepancies motivated a trend towards compressed suffix arrays and BWT-based compressed full-text indices such as the FM-index. These data structures require only space within the size of the text or even less.

Read more about this topic:  Suffix Array

Famous quotes containing the words space and/or efficiency:

    I would have broke mine eye-strings, cracked them, but
    To look upon him, till the diminution
    Of space had pointed him sharp as my needle;
    Nay, followed him till he had melted from
    The smallness of a gnat to air, and then
    Have turned mine eye and wept.
    William Shakespeare (1564–1616)

    Nothing comes to pass in nature, which can be set down to a flaw therein; for nature is always the same and everywhere one and the same in her efficiency and power of action; that is, nature’s laws and ordinances whereby all things come to pass and change from one form to another, are everywhere and always; so that there should be one and the same method of understanding the nature of all things whatsoever, namely, through nature’s universal laws and rules.
    Baruch (Benedict)