Find subsequences of strings within strings
I want to write a function that checks a string for occurrences of other strings within it.
However, the substrings being checked may be interrupted within the main string by other letters. For instance:
a = 'abcde'
b = 'ace'
c = 'acb'
The function in question should report b as being in a, but not c.
I've already tried set(a).intersection(set(b)), and my problem with that is that it also reports c as being in a, because sets ignore the order of the letters.
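To illustrate the problem with the set approach, here is a minimal sketch using the same three strings. The names b_chars_in_a and c_chars_in_a are made up for this example only.

```python
a = 'abcde'
b = 'ace'
c = 'acb'

# A set keeps neither order nor duplicates, so both candidates
# pass the same "all letters present" test.
b_chars_in_a = set(a).intersection(set(b)) == set(b)
c_chars_in_a = set(a).intersection(set(c)) == set(c)
print(b_chars_in_a, c_chars_in_a)  # True True
```

Both checks succeed, which is why an order-aware test is needed.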
You can turn your expected sequence into a regex:
import re

def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    # re.escape keeps characters such as "." or "*" in s1 literal.
    pat = ".*".join(re.escape(ch) for ch in s1)
    if re.search(pat, s2):
        return True
    return False

# or, more compactly:
def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    return bool(re.search(".*".join(map(re.escape, s1)), s2))

a = 'abcde'
b = 'ace'
c = 'acb'
assert sequence_in(b, a)
assert not sequence_in(c, a)
"ace" gets turned into the regex "a.*c.*e", which finds those three characters in sequence, with possible intervening characters.
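How about something like this:
def issubstr(substr, mystr, start_index=0):
    try:
        for letter in substr:
            # Search only to the right of the previous match.
            start_index = mystr.index(letter, start_index) + 1
        return True
    except ValueError:  # str.index raises ValueError when the letter is missing
        return False
Or, using str.find, which returns -1 instead of raising:
def issubstr(substr, mystr, start_index=0):
    for letter in substr:
        start_index = mystr.find(letter, start_index) + 1
        if start_index == 0:  # find returned -1: letter not found
            return False
    return True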
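To make the left-to-right scan visible, the same search can record where each letter was found. This is only a sketch; match_positions is a hypothetical name, and returning None for "no match" is an assumption made for the example.

```python
def match_positions(substr, mystr):
    """Return the index in mystr of each matched letter, or None if no match."""
    positions = []
    start_index = 0
    for letter in substr:
        found = mystr.find(letter, start_index)
        if found == -1:
            return None
        positions.append(found)
        start_index = found + 1  # continue searching after this match
    return positions

print(match_positions('ace', 'abcde'))  # [0, 2, 4]
print(match_positions('acb', 'abcde'))  # None
```

For 'acb' the search for 'b' starts after the 'c' at index 2, so the earlier 'b' at index 1 is never considered.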
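def issubstr(s1, s2):
    # Keep only the characters of s2 that occur in s1, then compare.
    # Note: extra copies of those characters in s2 (e.g. 'abcdea') give False.
    return "".join(x for x in s2 if x in s1) == s1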
>>> issubstr('ace', 'abcde')
True
>>> issubstr('acb', 'abcde')
False
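If the longer string may contain repeated letters, one alternative is to walk a single iterator over it, so that each letter must be found after the previous one. This is a sketch of that idea, not the code from the answer above; is_subsequence is a name chosen here for illustration.

```python
def is_subsequence(s1, s2):
    it = iter(s2)
    # `ch in it` advances the iterator until it finds ch, so order is enforced.
    return all(ch in it for ch in s1)

print(is_subsequence('ace', 'abcde'))   # True
print(is_subsequence('acb', 'abcde'))   # False
print(is_subsequence('ace', 'abcdea'))  # True
```

Because the iterator is consumed as it goes, a trailing extra 'a' in the longer string does not affect the result.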