Is there a way to know if a list of elements is on a larger list without using 'in' keyword?
I want to do this. I have two python lists, one larger than the other and I want to know is there is a way to check if the elements of the smaller开发者_开发问答 list are in the big list in the exact same order for example:
small_list = [4,2,5]
big_list = [1,2,5,7,2,4,2,5,67,8,5,13,45]
I tried using the in keyword but It did not worked :'(
def in_list(small, big):
l_sml = len(small)
l_big = len(big)
return any((big[i:i+l_sml]==small for i in xrange(l_big-l_sml+1)))
print in_list([4,2,1], [1,2,3,4,2,1,0,5]) # True
print in_list([1,2,3], [1,2,4]) # False
Hmm, maybe it's overkill, but you can use the SequenceMatcher class from difflib:
from difflib import SequenceMatcher
small_list = [4,2,5]
big_list = [1,2,5,7,2,4,2,5,67,8,5,13,45]
print SequenceMatcher(None, small_list, big_list).get_matching_blocks()
difflib documentation
Rather non-optimized, demonstrates the general strategy simply:
tuple(small_list) in zip(big_list[:], big_list[1:], big_list[2:])
The funky zip thing does this:
>>> zip(big_list[:], big_list[1:], big_list[2:])
[(1, 2, 5), (2, 5, 7), (5, 7, 2), (7, 2, 4), (2, 4, 2), (4, 2, 5), (2, 5, 67), (5, 67, 8), (67, 8, 5), (8, 5, 13), (5, 13, 45)]
A more optimized version:
from itertools import izip, islice
tuple(small_list) in izip(big_list, islice(big_list, 1, None), islice(big_list, 2, None))
To handle small_list length of any size:
from itertools import izip, islice
tuple(small_list) in izip(*(islice(big_list, i, None) for i in xrange(len(small_list))))
This problem is trickier than it seems. Unless I'm mistaken, it's a special case of the longest common substring problem.
For the general case (arbitrarily large lists), I would use some kind of finite state automaton, akin to a regular expression. I believe the result could then be calculated in O(mn) time.
That's because small_list in big_list
checks whether an element in big_list is equal to small_list. What you want to do instead is see if a slice of big_list is the same as small_list.
def isSubList(slice, L):
n = len(slice)
for i in range(0, len(L) - n):
if slice == L[i:i+n]:
return True
return False
isSubList(small_list, big_list)
Edit: Leaving the answer here but I failed to note the requirement that they be in the same order. This does not meet that requirement
Quick and dirty answer. Based it off of the answer for Python - Intersection of two lists
small_list == filter( lambda x: x in big_list, small_list)
If you know a reasonable bound of your numbers, you can convert them to a Python type whose 'in' operator does this automatically. The two I know are str
and unicode
.
Then you ask the strings if the smaller is in the larger, this does a substring comparison:
>>> small_list = [4,2,5]
>>> big_list = [1,2,5,7,2,4,2,5,67,8,5,13,45]
>>>
>>> def encode(lst):
return u"".join(unichr(c) for c in lst)
>>> encode(small_list) in encode(big_list)
True
(You can "encode" to str
if all numbers are in 0 <= x <= 255
, you can "encode" to unicode
if all are in 0 <= x <= sys.maxunicode
).
There's no built in operator doing that particular comparison. I suggest a list comprehension or a quick for loop.
you could use sets
from sets import Set
small_set = set(small_list)
big_set = set(big_list)
small_set <= big_set
<= is the subset operator
If you want to use the "in" keyword to do what you want, you can override contains using one of the solutions mentioned in the answers here:
class mylist(list):
def __contains__(self, lst):
return ':'.join(map(str, lst)) in ':'.join(map(str, self))
small_list = mylist([4,2,5])
big_list = mylist([1,2,5,7,2,4,2,5,67,8,5,13,45])
print small_list in big_list
Edit: Addresses Jeffrey's comment.
精彩评论