Regex - Matching Abbreviations of a Word
I was thinking of providing the following regex as an answer to this question, but I can't seem to write the regular expression I was looking for:
w?o?r?d?p?r?e?s?s?
This should match an ordered abbreviation of the word wordpress, but it can also match nothing at all.
How can I modify the above regex so that it matches at least 4 characters, still in order? Like:
- word
- wrdp
- press
- wordp
- wpress
- wordpress
I'd like to know what the best way to do this is... =)
You could use a lookahead assertion:
^(?=.{4})w?o?r?d?p?r?e?s?s?$
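To see the lookahead in action, here is a minimal Python sketch that runs the same pattern against the examples from the question. Python is only my choice for illustration, and the helper name is_abbreviation is made up; the pattern itself is unchanged.

```python
import re

# Same pattern as above: the lookahead (?=.{4}) demands at least 4 characters,
# then every letter of "wordpress" is optional but must appear in order.
PATTERN = re.compile(r"^(?=.{4})w?o?r?d?p?r?e?s?s?$")

def is_abbreviation(candidate):
    """Hypothetical helper: True for an ordered abbreviation of 4+ chars."""
    return PATTERN.match(candidate) is not None

for candidate in ["word", "wrdp", "press", "wpress", "wordpress", "wrd", "", "drow"]:
    print(repr(candidate), is_abbreviation(candidate))
```

The first five candidates should be accepted; "wrd" and the empty string fail the lookahead, and "drow" fails because its letters are out of order.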
What about PHP's similarity functions?
- levenshtein
- similar_text
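For a feel of what an edit-distance check would report, here is a rough Python sketch of the classic Levenshtein distance. This is a hand-written illustration under my own assumptions (unit cost for every edit), not PHP's levenshtein() itself.

```python
def levenshtein(a, b):
    """Edit distance (insert/delete/substitute, each cost 1), row-by-row DP."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]

print(levenshtein("wrdp", "wordpress"))
print(levenshtein("wordpresz", "wordpress"))
```

Note that a small distance is not the same thing as being an ordered abbreviation: "wordpresz" is a single edit away from "wordpress" but is not an abbreviation of it, while the valid abbreviation "wrdp" is five edits away. So a similarity score alone probably doesn't answer the question as asked.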
if ( strlen($string) >= 4 && preg_match('#^w?o?r?d?p?r?e?s?s?$#', $string) ) {
    // abbreviation ok
}
This won't even run the regexp unless the string is at least 4 chars long.
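The same length-check-first idea can be generalised to any word by building the pattern from its letters. A minimal Python sketch, where the function names and the min_length parameter are my own inventions for illustration:

```python
import re

def abbreviation_pattern(word):
    # Make every character optional, in order: "word" -> "w?o?r?d?".
    # re.escape keeps characters like "." or "+" literal.
    return re.compile("".join(re.escape(ch) + "?" for ch in word))

def is_abbreviation_of(candidate, word="wordpress", min_length=4):
    # Cheap length test first, so the regex only runs on long-enough input.
    if len(candidate) < min_length:
        return False
    return abbreviation_pattern(word).fullmatch(candidate) is not None

print(is_abbreviation_of("wrdp"))
print(is_abbreviation_of("wrd"))
print(is_abbreviation_of("pyth", word="python"))
```

Using fullmatch here plays the role of the ^ and $ anchors in the PHP version.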
I know this is not a regex, just for fun...
#!/usr/bin/python
FULLWORD = "wordprocess"

def check_word(word):
    # i walks the candidate, j walks FULLWORD; i advances only on a match.
    i, j = 0, 0
    while i < len(word) and j < len(FULLWORD):
        if word[i] == FULLWORD[j]:
            i += 1
        j += 1
    # Fail if a character was left unmatched or the word is under 4 chars.
    if i < len(word) or len(word) < 4:
        return "%s: FAIL" % word
    return "%s: SUCC" % word

print(check_word("wd"))
print(check_word("wdps"))
print(check_word("wsdp"))
print(check_word("wordprocessr"))
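The same in-order scan can be written more compactly with an iterator, if you don't mind a slightly clever idiom. This is only a sketch: the function name is hypothetical, and I've assumed "wordpress" (the word from the question) as the default.

```python
def is_ordered_abbreviation(candidate, full_word="wordpress", min_length=4):
    # "ch in remaining" consumes the iterator up to and including the first
    # match, so each later letter must be found after the previous one.
    remaining = iter(full_word)
    return len(candidate) >= min_length and all(ch in remaining for ch in candidate)

for candidate in ["wd", "wdps", "wsdp", "wordpress"]:
    print(candidate, is_ordered_abbreviation(candidate))
```

Because the iterator is shared across all the membership tests, no index bookkeeping is needed, which avoids the off-by-one checks at the end of a hand-written loop.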