Bitap Algorithm - Exact Searching

Exact Searching

The bitap algorithm for exact string searching, in full generality, looks like this when implemented in C:

#include #include typedef char BIT; /* needs only to hold the values 0 and 1 */ const char *bitap_search(const char *text, const char *pattern) { const char *result = NULL; int m = strlen(pattern); BIT *R; int i, k; if (pattern == '\0') return text; /* Initialize the bit array R */ R = malloc((m+1) * sizeof *R); R = 1; for (k=1; k <= m; ++k) R = 0; for (i=0; text != '\0'; ++i) { /* Update the bit array. */ for (k=m; k >= 1; --k) R = R && (text == pattern); if (R) { result = (text+i - m) + 1; break; } } free(R); return result; }

Bitap distinguishes itself from other well-known string searching algorithms in its natural mapping onto simple bitwise operations, as in the following modification of the above program. Notice that in this implementation, counterintuitively, each bit with value zero indicates a match, and each bit with value 1 indicates a non-match. The same algorithm can be written with the intuitive semantics for 0 and 1, but in that case we must introduce another instruction into the inner loop to set R |= 1. In this implementation, we take advantage of the fact that left-shifting a value shifts in zeros on the right, which is precisely the behavior we need.

Notice also that we require CHAR_MAX additional bitmasks in order to convert the (text == pattern) condition in the general implementation into bitwise operations. Therefore, the bitap algorithm performs better when applied to inputs over smaller alphabets.

#include #include const char *bitap_bitwise_search(const char *text, const char *pattern) { int m = strlen(pattern); unsigned long R; unsigned long pattern_mask; int i; if (pattern == '\0') return text; if (m > 31) return "The pattern is too long!"; /* Initialize the bit array R */ R = ~1; /* Initialize the pattern bitmasks */ for (i=0; i <= CHAR_MAX; ++i) pattern_mask = ~0; for (i=0; i < m; ++i) pattern_mask] &= ~(1UL << i); for (i=0; text != '\0'; ++i) { /* Update the bit array */ R |= pattern_mask]; R <<= 1; if (0 == (R & (1UL << m))) return (text + i - m) + 1; } return NULL; }

Read more about this topic:  Bitap Algorithm

Famous quotes containing the words exact and/or searching:

    His eye had become minutely exact as to the book and its position. Then he resolved that he would not look at the book again, would not turn a glance on it unless it might be when he had made up his mind to reveal its contents.
    Anthony Trollope (1815–1882)

    Abode where lost bodies roam each searching for its lost one.
    Samuel Beckett (1906–1989)