how to optimize Knuth Morris Pratt String matching algorithm
if we found a mismatch can we guess that the next character in the text is a mismatch or not without comparing,
text: abacabba
pattern: abba
now when it compares the 2nd 'b' in pattern with 2nd 'a' in text there is a mismatch is there 开发者_StackOverflow中文版any way to avoid 'c' succeeding 'a' because it would be a mismatch too. and how KMP runs on text and pattern mentioned below
text: aaaaaaa
pattern: aa
If you want to skip comparisons in the text to improve efficiency then you should go for Boyer Moore Algorithm which compares from right to left.It can have a best case runtime of Ω(n/m) and a worst case of O(n). Ω(n/m) comes as a result of predicting and skipping futile comparisons.
The Knuth Morris Pratt algorithm does left to right comparisons in which there is no way to know next characters and the KMP Prefix Table also conveys information only about the pattern.
You might even consider Boyer Moore Horspool algorithm which has better asymptotic behaviour than Boyer Moore (when pattern sizes are large).
You can check out detailed simulation graphs of all the major string searching algorithms here.
I think the algorithm of Horspool is what you are looking for. Because when there is a char matched that is not in the pattern, it will jump immediately over that part. And the rest of the pattern won't be compared with that char anymore.
精彩评论