开发者

Is there a way to know if a list of elements is on a larger list without using 'in' keyword?

I want to do this. I have two python lists, one larger than the other and I want to know is there is a way to check if the elements of the smaller开发者_开发问答 list are in the big list in the exact same order for example:

small_list = [4,2,5]
big_list = [1,2,5,7,2,4,2,5,67,8,5,13,45]

I tried using the in keyword but It did not worked :'(


def in_list(small, big):
    l_sml = len(small)
    l_big = len(big)
    return any((big[i:i+l_sml]==small for i in xrange(l_big-l_sml+1)))

print in_list([4,2,1], [1,2,3,4,2,1,0,5]) # True
print in_list([1,2,3], [1,2,4])           # False


Hmm, maybe it's overkill, but you can use the SequenceMatcher class from difflib:

from difflib import SequenceMatcher 
small_list = [4,2,5]
big_list = [1,2,5,7,2,4,2,5,67,8,5,13,45]
print SequenceMatcher(None, small_list, big_list).get_matching_blocks()

difflib documentation


Rather non-optimized, demonstrates the general strategy simply:

tuple(small_list) in zip(big_list[:], big_list[1:], big_list[2:])

The funky zip thing does this:

>>> zip(big_list[:], big_list[1:], big_list[2:])
[(1, 2, 5), (2, 5, 7), (5, 7, 2), (7, 2, 4), (2, 4, 2), (4, 2, 5), (2, 5, 67), (5, 67, 8), (67, 8, 5), (8, 5, 13), (5, 13, 45)]

A more optimized version:

from itertools import izip, islice
tuple(small_list) in izip(big_list, islice(big_list, 1, None), islice(big_list, 2, None))

To handle small_list length of any size:

from itertools import izip, islice
tuple(small_list) in izip(*(islice(big_list, i, None) for i in xrange(len(small_list))))


This problem is trickier than it seems. Unless I'm mistaken, it's a special case of the longest common substring problem.

For the general case (arbitrarily large lists), I would use some kind of finite state automaton, akin to a regular expression. I believe the result could then be calculated in O(mn) time.


That's because small_list in big_list checks whether an element in big_list is equal to small_list. What you want to do instead is see if a slice of big_list is the same as small_list.

def isSubList(slice, L):
    n = len(slice)
    for i in range(0, len(L) - n):
        if slice == L[i:i+n]:
            return True
    return False

isSubList(small_list, big_list)


Edit: Leaving the answer here but I failed to note the requirement that they be in the same order. This does not meet that requirement

Quick and dirty answer. Based it off of the answer for Python - Intersection of two lists

small_list == filter( lambda x: x in big_list, small_list)


If you know a reasonable bound of your numbers, you can convert them to a Python type whose 'in' operator does this automatically. The two I know are str and unicode.

Then you ask the strings if the smaller is in the larger, this does a substring comparison:

>>> small_list = [4,2,5]
>>> big_list = [1,2,5,7,2,4,2,5,67,8,5,13,45]
>>>
>>> def encode(lst):
      return u"".join(unichr(c) for c in lst)

>>> encode(small_list) in encode(big_list)
True

(You can "encode" to str if all numbers are in 0 <= x <= 255, you can "encode" to unicode if all are in 0 <= x <= sys.maxunicode ).


There's no built in operator doing that particular comparison. I suggest a list comprehension or a quick for loop.


you could use sets

from sets import Set
small_set = set(small_list)
big_set = set(big_list)
small_set <= big_set

<= is the subset operator


If you want to use the "in" keyword to do what you want, you can override contains using one of the solutions mentioned in the answers here:

class mylist(list):
    def __contains__(self, lst):
        return ':'.join(map(str, lst)) in ':'.join(map(str, self))

small_list = mylist([4,2,5])
big_list = mylist([1,2,5,7,2,4,2,5,67,8,5,13,45])

print small_list in big_list

Edit: Addresses Jeffrey's comment.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜