Pattern Matching Python
I'm currently stuck on trying to make a naive algorithm which given a piece of a pattern e.g aabba search for it in a text e.g abbbbaababaabbaaabbaa one letter at a time. It will compare a with the text if that is right then compares the next letter and if that's wrong the whole pattern will shift one and compare a with b etc
we were give code example
print "Input text: ",
text = raw_input()
print "Input pattern: ",
pattern = raw_input()
index = text.find(pattern)
while index > -1:
print index
index = text.find(pattern, index+1)
but the find() function in p开发者_StackOverflow中文版ython is too fast(I need a non optimized sort of algorithm, using while and for loops statements I guess).
Any help appreciated, Thanks
I guess here's what you need, the following code does character-by-character comparison. You may also replace the calls to find
by iterations over text
which includes checks whether the first character of text
matches the first character of pattern
:
def my_find(text, pattern):
'''Find the start index of a pattern string in a text.
Return -1 if not found, and assume that pattern is not empty'''
found = False
current_start_index = text.find(pattern[0])
index_text = current_start_index
index_pattern = 0
while not found and index_text + len(pattern) - 1 < len(text) and \
current_start_index != -1:
index_text += 1
index_pattern += 1
while index_text < len(text) and \
index_pattern < len(pattern) and \
text[index_text] == pattern[index_pattern]:
if index_pattern == len(pattern) - 1:
found = True
break
else:
index_text += 1
index_pattern += 1
if not found:
current_start_index = text.find(pattern[0],current_start_index + 1)
index_text = current_start_index
if found:
return current_start_index
else:
-1
def my_find(haystack, needle):
n_len = len(needle)
start = 0
while start <= (len(haystack)-n_len+1):
if haystack[start:start+n_len-1] == needle:
return True
start += 1
This is, as far as I can understand, your algorithm. Not tested, will test and let you know if it works.
Tested and seems to work.
Sounds like you are learning about regular expressions, here's a snippet that may help you get started.
myFileName = "abbababaaa"
patternToMatch = "ababa"
i = 0
j = 0
while (i < len(myFileName)):
if (patternToMatch[i:i] == myFileName[j:j]):
i++
j++
else:
i = 0
if len(patternToMatch) == i:
# matched a pattern
精彩评论