The Boyer Moore algorithm is the fastest string searching
algorithm.
The Boyer Moore algorithm is the fastest string searching
algorithm.
It compares the pattern with the actual string from right to
left. Most other algorithms
compare from left to right. If the character that is
compared with the rightmost pattern
symbol does not occur in the pattern at all, then the
pattern can be shifted by m positions
behind this text symbol.
The following example illustrates this situation.
Example:
0 1 2 3 4 5 6 7 8 9 ...
a b b a d a b a c b a
| |
b a b a c |
<------ |
|
b a b a c
The comparison of "d" with "c" at
position 4 does not match. "d" does not occur in the
pattern. Therefore, the pattern cannot match at any of the
positions 0,1,2,3,4, since all
corresponding windows contain a "d". The pattern
can be shifted to position 5. The best
case for the Boyer-Moore algorithm happens if, at each
search attempt the first compared
character does not occur in the pattern. Then the algorithm
requires only O(n/m)
comparisons .
Bad character heuristics
This method is called bad character heuristics. It can also
be applied if the bad character
(the character that causes a mismatch), occurs somewhere
else in the pattern. Then the
pattern can be shifted so that it is aligned to this text
symbol. The next example illustrates
this situation.
Example:
0 1 2 3 4 5 6 7 8 9 ...
a b b a b a b a c b a
|
b a b a c
<----
|b a b a c
Comparison between "b" and "c" causes a
mismatch. The character "b" occurs in the
pattern at positions 0 and 2. The pattern can be shifted so
that the rightmost "b" in the
pattern is aligned to "b".
No comments:
Post a Comment