
How to diff two very big arrays?

I have 2 very big arrays.

Is this code going to be very slow to run?


results1 = [1,2,3..]
results2 = [1,2,3,4 ... ]


for result1 in results1:
    if result1 not in results2:
        print(result1)


Use a set:

hashed = set(results2)

....

    if result1 not in hashed:

Note that this needs a lot of memory if your array is really huge.
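For completeness, here is a minimal sketch of the set-based version as a whole function. The function name difference and the sample data are only illustrative assumptions, and it assumes the elements are hashable. Building the set takes one pass over results2, after which each lookup is constant time on average, instead of a full scan of results2 for every element of results1.

```python
def difference(results1, results2):
    """Return the items of results1 that do not occur in results2.

    Order and duplicates of results1 are preserved, like in the original loop.
    """
    hashed = set(results2)  # built once; membership tests are O(1) on average
    return [result1 for result1 in results1 if result1 not in hashed]


# Illustrative data, not taken from the question
results1 = [1, 2, 3, 5, 5, 8]
results2 = [1, 2, 3, 4]
print(difference(results1, results2))  # [5, 5, 8]
```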

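Alternatively, sort both arrays and use two indexes. If both elements are the same, increment both indexes. If they are unequal, increment the index of the array which contains the smaller element. Whenever the smaller element comes from the first array, it is missing from the second array and belongs to the difference.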
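A rough sketch of the two-index approach follows, assuming the elements can be ordered. The name sorted_difference is hypothetical. One deliberate deviation from the description above: on a match only the first index is advanced, so repeated values in the first array are treated the same way as in the original loop from the question.

```python
def sorted_difference(results1, results2):
    """Return the sorted items of results1 that do not occur in results2."""
    a = sorted(results1)
    b = sorted(results2)
    i = j = 0
    missing = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Present in both. Advance only i so a repeated value in a
            # is still compared against the same b[j].
            i += 1
        elif a[i] < b[j]:
            # b is sorted, so a[i] cannot appear later in b.
            missing.append(a[i])
            i += 1
        else:
            # b[j] is smaller than everything left in a; skip it.
            j += 1
    # Anything left in a is larger than every element of b.
    missing.extend(a[i:])
    return missing


print(sorted_difference([9, 1, 5, 2, 1], [4, 3, 2, 1]))  # [5, 9]
```

The sorted copies still cost memory. Sorting in place with list.sort() would avoid that at the price of reordering the inputs.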


Try this one

l1 = [4,2,4,5,2,1,3,3,34,54,3445,4]

l2 = [5,7,4,5,8,5,2,4,56]

diff_l = list(set(l1)-set(l2))

For more set operations, see the Python documentation on sets.

But this may not be helpful or perform well for huge data.


I'm not sure whether you want the plain difference (elements in a but not in b) or the symmetric difference (elements present in exactly one of the two lists), but fortunately both can be done with regular set operations after converting the lists to sets.

But first, a warning: converting a list to a set removes duplicate elements, because a set cannot contain duplicates.
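If the duplicates matter, one possible alternative (an assumption on my part, not something the question asks for) is collections.Counter, which subtracts element counts instead of discarding them. The helper name multiset_difference is hypothetical.

```python
from collections import Counter


def multiset_difference(a, b):
    """Return the sorted items of a, removing one copy per occurrence in b."""
    remaining = Counter(a) - Counter(b)  # keeps only positive counts
    return sorted(remaining.elements())


l1 = [4, 2, 4, 5, 2, 1]
l2 = [4, 5]
print(sorted(set(l1) - set(l2)))    # [1, 2]        duplicates are gone
print(multiset_difference(l1, l2))  # [1, 2, 2, 4]  duplicates are counted
```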

So let's declare our data:

>>> a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> b = [12, 11, 10, 9, 8, 7, 6]

To get the plain difference, i.e. elements that are present in a but not in b:

>>> set(a) - set(b)
set([0, 1, 2, 3, 4, 5])

To get the symmetric difference (i.e. elements that are present in only one of the arrays, not in both):

>>> set(a) ^ set(b)
set([0, 1, 2, 3, 4, 5, 10, 11, 12])

And as an added bonus, elements that are present in both:

>>> set(a) & set(b)
set([8, 9, 6, 7])