Suffix Array - Space Efficiency

Space Efficiency

Suffix arrays were introduced by Manber & Myers (1990) to improve on the space requirements of suffix trees: a suffix array stores n integers, one for each suffix of a text of length n. Assuming an integer requires 4 bytes, a suffix array requires 4n bytes in total. This is significantly less than the 20n bytes required by a careful suffix tree implementation.
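As a rough illustration of why the byte count scales linearly with the text, the following sketch builds a suffix array naively and estimates its size. The function names and the 4-byte integer width are assumptions made for this example, and the sort-based construction is chosen for brevity only; it is not the Manber & Myers algorithm.

```python
def build_suffix_array(text: str) -> list[int]:
    """Naive construction: sort suffix start positions by the suffix text.

    Illustrative only; this is far slower than dedicated algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def suffix_array_bytes(n: int, bytes_per_int: int = 4) -> int:
    """Size estimate: one fixed-width integer per suffix (assumed 4 bytes)."""
    return n * bytes_per_int


text = "banana"
sa = build_suffix_array(text)

print(sa)                             # one start position per suffix
print(suffix_array_bytes(len(text)))  # estimated bytes for the array alone
```

The array holds exactly one entry per character of the text, so the estimate grows in direct proportion to the text length.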

However, in certain applications, the space requirements of suffix arrays may still be prohibitive. Analyzed in bits, a suffix array requires O(n log n) space, whereas the original text over an alphabet of size σ requires only O(n log σ) bits. For a human genome with σ = 4 and n = 3.4 × 10^9, the suffix array would therefore occupy about 16 times more memory than the genome itself.
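The factor of 16 can be checked with a few lines of arithmetic. The sketch below assumes each suffix array entry is stored in the minimum whole number of bits needed to address any text position, and each character in the minimum whole number of bits needed to distinguish the alphabet symbols. The genome length of 3.4 × 10^9 and the alphabet size of 4 are the values this example assumes; the helper names are hypothetical.

```python
def bits_to_distinguish(k: int) -> int:
    """Whole bits needed to tell k distinct values apart (ceil(log2 k))."""
    return (k - 1).bit_length()


def suffix_array_bits(n: int) -> int:
    """n entries, each able to address any of the n text positions."""
    return n * bits_to_distinguish(n)


def text_bits(n: int, sigma: int) -> int:
    """n characters, each drawn from an alphabet of size sigma."""
    return n * bits_to_distinguish(sigma)


n = 3_400_000_000  # assumed genome length in bases
sigma = 4          # assumed alphabet size: A, C, G, T

ratio = suffix_array_bits(n) / text_bits(n, sigma)
print(bits_to_distinguish(n), bits_to_distinguish(sigma), ratio)
```

Under these assumptions each array entry needs 32 bits while each base needs 2 bits, which is where the ratio comes from.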

Such discrepancies motivated a trend towards compressed suffix arrays and BWT-based compressed full-text indices such as the FM-index. These data structures require space close to the size of the text itself, or even less.
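To show how BWT-based indices relate to the suffix array, the following sketch derives the Burrows-Wheeler transform of a text directly from its suffix array. It is a toy under stated assumptions: the text ends with a unique sentinel "$" that sorts before every other character, the suffix array is built naively, and no compression is performed. It illustrates only the relationship between the two structures, not an actual FM-index.

```python
def build_suffix_array(text: str) -> list[int]:
    """Naive construction by sorting suffixes; for illustration only."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_suffix_array(text: str, sa: list[int]) -> str:
    """BWT character i is the character preceding suffix sa[i].

    For sa[i] == 0 the index -1 wraps around to the final sentinel.
    """
    return "".join(text[i - 1] for i in sa)


text = "banana$"  # "$" is an assumed sentinel, smaller than all letters
sa = build_suffix_array(text)
bwt = bwt_from_suffix_array(text, sa)
print(bwt)
```

The transform is a permutation of the text's characters, so it occupies the same number of bits as the text; the suffix array used to compute it here is only an intermediate step in this sketch.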
