What is a more efficient way in Python to return list elements which are not in a second list?

Is there a faster way to do this in Python?

[f for f in list_1 if not f in list_2]

list_1 and list_2 both consist of about 120,000 strings. It takes about 4 minutes to generate the new list.


If you put list_2 into a set, it should make the containment checking a lot quicker:

s = set(list_2)
[f for f in list_1 if f not in s]

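This is because x in list is an O(n) check that scans the list element by element, while x in set is a hash lookup that takes constant time on average. With two lists of about 120,000 strings, the original version can need on the order of 120,000 × 120,000 ≈ 14 billion comparisons in the worst case.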
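
To see the difference for yourself, here is a rough timing sketch. The helper names (diff_with_list, diff_with_set) and the sample data are made up for illustration, and the sizes are kept small so the slow version finishes quickly; absolute timings will vary by machine.

```python
import timeit

# Hypothetical sample data: 2,000 strings each, half of them shared.
list_1 = [str(i) for i in range(2000)]
list_2 = [str(i) for i in range(1000, 3000)]

def diff_with_list(a, b):
    # Each "f not in b" scans b from the start: O(len(b)) per element.
    return [f for f in a if f not in b]

def diff_with_set(a, b):
    # Build the set once, then each lookup is O(1) on average.
    s = set(b)
    return [f for f in a if f not in s]

slow = timeit.timeit(lambda: diff_with_list(list_1, list_2), number=3)
fast = timeit.timeit(lambda: diff_with_set(list_1, list_2), number=3)
print("list lookup: %.4f s, set lookup: %.4f s" % (slow, fast))
```

Both functions return the same list; only the lookup cost differs, and the gap grows quickly as the lists get longer.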

Another way is to use set-difference:

list(set(list_1).difference(set(list_2)))

However, this probably won't be faster than the first way. It also eliminates duplicates from list_1 and does not preserve the original order, which you may not want.
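
A small sketch of that trade-off, using made-up sample values: the comprehension keeps duplicates and the order of list_1, while the set difference returns each surviving element once, in arbitrary order.

```python
# Hypothetical sample data with a duplicate ("b") in list_1.
list_1 = ["b", "a", "b", "c", "d"]
list_2 = ["c"]

s = set(list_2)
kept = [f for f in list_1 if f not in s]

# difference() accepts any iterable, so list_2 need not be converted first.
unique = list(set(list_1).difference(list_2))

print(kept)            # ['b', 'a', 'b', 'd']
print(sorted(unique))  # ['a', 'b', 'd'] (sorted only to get a stable order)
```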


Depending on what you want to do with the new list, lazy evaluation with itertools.ifilter() may be sufficient, so you don't spend time building the whole new list up front. (In Python 3, itertools.ifilter() no longer exists; the built-in filter() is lazy and does the same job.) Either way, convert list_2 to a set first so that each lookup is O(1):

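try:
    from itertools import ifilter  # Python 2
except ImportError:
    ifilter = filter  # Python 3: the built-in filter() is already lazy

set_2 = set(list_2)

for f in ifilter(lambda x: x not in set_2, list_1):
    pass  # do something with f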
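
In Python 3 the same lazy behaviour can be had from a generator expression, without itertools. A minimal sketch, with made-up sample data and a hypothetical helper name iter_missing:

```python
def iter_missing(list_1, list_2):
    """Lazily yield the items of list_1 that do not occur in list_2."""
    set_2 = set(list_2)  # built once, before iteration starts
    return (f for f in list_1 if f not in set_2)

for f in iter_missing(["a", "b", "c", "a"], ["b"]):
    print(f)  # prints a, c, a (one per line)
```

Nothing is filtered until the loop asks for the next item, so this suits cases where you may stop early or never need the full list in memory.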