Common elements comparison between 2 lists

2022-12-31 12:53

def common_elements(list1, list2):
    """
    Return a list containing the elements which are in both list1 and list2

    >>> common_elements([1,2,3,4,5,6], [3,5,7,9])
    [3, 5]
    >>> common_elements(['this','this','n','that'],['this','not','that','that'])
    ['this', 'that']
    """
    for element in list1:
        if element in list2:
            return list(element)

Got that so far, but can't seem to get it to work!

Any ideas?

Use Python's set intersection:

>>> list1 = [1,2,3,4,5,6]
>>> list2 = [3, 5, 7, 9]
>>> list(set(list1).intersection(list2))
[3, 5]

The solutions suggested by S.Mark and SilentGhost generally tell you how it should be done in a Pythonic way, but I thought you might also benefit from knowing why your solution doesn't work. The problem is that as soon as you find the first common element in the two lists, you return that single element only. Your solution could be fixed by creating a result list and collecting the common elements in that list:

def common_elements(list1, list2):
    result = []
    for element in list1:
        if element in list2:
            result.append(element)
    return result

An even shorter version using list comprehensions:

def common_elements(list1, list2):
    return [element for element in list1 if element in list2]

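However, this is a very inefficient way of doing this: every `element in list2` test scans list2 from the start, so the total work grows with len(list1) * len(list2). Python's built-in set types are far more efficient because membership tests on a set are hash-based and take constant time on average.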
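The list comprehension above rescans list2 for every element of list1. One way to keep the same readable shape while avoiding those scans is to build a set from list2 once and test membership against it. This is only a minimal sketch: it assumes the elements are hashable, and the name common_elements_fast is made up for illustration.

```python
def common_elements_fast(list1, list2):
    # Build the lookup set once; each "in" test is then O(1) on average.
    lookup = set(list2)
    # Keeps the order of list1, including any repeated elements in it.
    return [element for element in list1 if element in lookup]


print(common_elements_fast([1, 2, 3, 4, 5, 6], [3, 5, 7, 9]))  # [3, 5]
```

Like the plain list comprehension, this returns repeated elements of list1 more than once.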

You can also use sets and get the commonalities in one line: subtract the set containing the differences from one of the sets.

A = [1,2,3,4]
B = [2,4,7,8]
commonalities = set(A) - (set(A) - set(B))

You can solve this using numpy:

import numpy as np

list1 = [1, 2, 3, 4, 5, 6]
list2 = [3, 5, 7, 9]

common_elements = np.intersect1d(list1, list2)
print(common_elements)

common_elements will be the numpy array: [3 5].

Use set intersection, set(list1) & set(list2):

>>> def common_elements(list1, list2):
...     return list(set(list1) & set(list2))
...
>>> common_elements([1,2,3,4,5,6], [3,5,7,9])
[3, 5]
>>> common_elements(['this','this','n','that'],['this','not','that','that'])
['this', 'that']

Note that the order of the result list may differ from the order of the original list.
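If the order matters, one possible approach is to filter list1 against a set built from list2 and then drop repeats while keeping first-seen order. The sketch below assumes hashable elements and Python 3.7+ (where dicts keep insertion order); the name common_elements_ordered is hypothetical.

```python
def common_elements_ordered(list1, list2):
    lookup = set(list2)
    # dict.fromkeys drops repeats while keeping the order of first appearance.
    return list(dict.fromkeys(e for e in list1 if e in lookup))


print(common_elements_ordered(['this', 'this', 'n', 'that'],
                              ['this', 'not', 'that', 'that']))
# ['this', 'that']
```

Unlike the plain set version, this gives the same output order on every run, so it reproduces the doctest output in the question deterministically.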

Sets are another way to solve this:

>>> a = [3,2,4]
>>> b = [2,3,5]
>>> set(a) & set(b)
{2, 3}

I compared the methods mentioned in the other answers, using Python 3.6.3. This is the code I used:

import time
import random
from decimal import Decimal


def method1():
    common_elements = [x for x in li1_temp if x in li2_temp]
    print(len(common_elements))


def method2():
    common_elements = (x for x in li1_temp if x in li2_temp)
    print(len(list(common_elements)))


def method3():
    common_elements = set(li1_temp) & set(li2_temp)
    print(len(common_elements))


def method4():
    common_elements = set(li1_temp).intersection(li2_temp)
    print(len(common_elements))


if __name__ == "__main__":
    li1 = []
    li2 = []
    for i in range(100000):
        li1.append(random.randint(0, 10000))
        li2.append(random.randint(0, 10000))

    li1_temp = list(set(li1))
    li2_temp = list(set(li2))

    methods = [method1, method2, method3, method4]
    for m in methods:
        start = time.perf_counter()
        m()
        end = time.perf_counter()
        print(Decimal((end - start)))

If you run this code, you can see that the list comprehension and the generator perform nearly the same (provided you actually iterate over the generator rather than just create it, which is why I forced it to compute its length). The set-based methods are much faster, and the intersection method is slightly faster still. The results of each method on my computer, in seconds, are listed below:

method1: 0.8150673999999999974619413478649221360683441
method2: 0.8329545000000001531148541289439890533685684
method3: 0.0016547000000000089414697868051007390022277
method4: 0.0010262999999999244948867271887138485908508
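Single-run timings like these can be noisy. As a rough alternative, the standard timeit module can repeat each candidate and report the best run. The sketch below is not the benchmark above: the input sizes are deliberately smaller, and the names li1, li2 and candidates are chosen just for this example.

```python
import random
import timeit

random.seed(0)  # fixed seed so repeated runs use the same data
# Two lists of unique integers, built from random draws.
li1 = list({random.randint(0, 2000) for _ in range(5000)})
li2 = list({random.randint(0, 2000) for _ in range(5000)})

candidates = {
    "list comprehension": lambda: [x for x in li1 if x in li2],
    "set & set": lambda: set(li1) & set(li2),
    "set.intersection": lambda: set(li1).intersection(li2),
}

for name, func in candidates.items():
    # Run each candidate three times (one call per run) and keep the best time.
    best = min(timeit.repeat(func, repeat=3, number=1))
    print(f"{name}: {best:.6f} s")
```

The absolute numbers depend on the machine and Python version, so only the relative ordering is worth comparing.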

The previous answers all find the unique common elements, but they do not account for repeated items in the lists. If you want each common element to appear as many times as it occurs in both lists, you can use the following one-liner:

l2, common = l2[:], [ e for e in l1 if e in l2 and (l2.pop(l2.index(e)) or True)]

The or True part is only necessary if you expect any elements to evaluate to False.

1) Method 1: store list1 in a dictionary, then iterate over each element of list2

def findarrayhash(a,b):
    h1={k:1 for k in a}
    for val in b:
        if val in h1:
            print("common found",val)
            del h1[val]
        else:
            print("different found",val)
    # Whatever is left in h1 appeared only in a (dict.iterkeys() is Python 2 only).
    for key in h1:
        print("different found", key)

Finding Common and Different elements:

2) Method 2: using sets

def findarrayset(a,b):
    common = set(a)&set(b)
    diff=set(a)^set(b)
    print(list(common))
    print(list(diff))

There are solutions here that run in O(l1+l2) but don't count repeated items, and slow solutions (at least O(l1*l2), and probably more expensive) that do consider repeated items.

So I figured I should add an O(l1*log(l1) + l2*log(l2)) solution. This is particularly useful if the lists are already sorted.

def common_elems_with_repeats(first_list, second_list):
    first_list = sorted(first_list)
    second_list = sorted(second_list)
    marker_first = 0
    marker_second = 0
    common = []
    while marker_first < len(first_list) and marker_second < len(second_list):
        if first_list[marker_first] == second_list[marker_second]:
            common.append(first_list[marker_first])
            marker_first += 1
            marker_second += 1
        elif first_list[marker_first] > second_list[marker_second]:
            marker_second += 1
        else:
            marker_first += 1
    return common

Another, faster solution would be to build an item->count map from list1 and then iterate through list2, updating the map and counting duplicates. It wouldn't require sorting. It would need a bit of extra memory, but it's technically O(l1+l2).
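The count-map idea described above might look something like the following. This is a sketch rather than a reference implementation: it assumes hashable elements, uses collections.Counter as the item->count map, and the function name common_with_repeats is invented for the example.

```python
from collections import Counter


def common_with_repeats(list1, list2):
    # Map each item of list1 to the number of times it occurs.
    remaining = Counter(list1)
    common = []
    for item in list2:
        # Consume one occurrence from the map for every match found.
        if remaining[item] > 0:
            remaining[item] -= 1
            common.append(item)
    return common


print(common_with_repeats([1, 1, 2, 3], [1, 1, 1, 3, 5]))  # [1, 1, 3]
```

The result follows the order of list2. If order does not matter, `list((Counter(list1) & Counter(list2)).elements())` yields the same elements with the same multiplicities.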

If list1 and list2 are unsorted:

Using intersection:

print((set(list1)).intersection(set(list2)))

Combining the lists and checking whether an element occurs more than once (this assumes neither list contains duplicates of its own):

combined_list = list1 + list2
set([num for num in combined_list if combined_list.count(num) > 1])

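Similar to the above, but without using set:

seen = []
for num in combined_list:
    # Report each repeated value once; the list is not modified while iterating,
    # because removing items mid-loop makes the loop skip elements.
    if combined_list.count(num) > 1 and num not in seen:
        print(num)
        seen.append(num)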

For sorted lists, an O(n) solution that uses no special Python built-ins:

p1 = 0
p2 = 0
result = []
while p1 < len(list1) and p2 < len(list2):
    if list1[p1] == list2[p2]:
        result.append(list1[p1])
        p1 += 1
        p2 += 1
    elif list1[p1] > list2[p2]:
        p2 += 1
    else:
        p1 += 1
print(result)

I have worked out a full solution for deep intersection:

def common_items_dict(d1, d2, use_set_for_list_commons=True, use_set_for_dict_key_commons=True, append_empty=False):
    result = {}
    if use_set_for_dict_key_commons:
        shared_keys=list(set(d1.keys()).intersection(d2.keys())) # faster, order not preserved
    else:
        shared_keys=common_items_list(d1.keys(), d2.keys(), use_set_for_list_commons=False)

    for k in  shared_keys:
        v1 = d1[k]
        v2 = d2[k]
        if isinstance(v1, dict) and isinstance(v2, dict):
            result_dict=common_items_dict(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
            if len(result_dict)>0 or append_empty:
                result[k] = result_dict 
        elif isinstance(v1, list) and isinstance(v2, list):
            result_list=common_items_list(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
            if len(result_list)>0 or append_empty:
                result[k] = result_list 
        elif v1 == v2:
            result[k] = v1
    return result

def common_items_list(d1, d2, use_set_for_list_commons=True, use_set_for_dict_key_commons=True, append_empty=False):
    if use_set_for_list_commons: 
        result_list= list(set(d2).intersection(d1)) # faster, order not preserved, support only simple data types in list values
        return result_list

    result = []
    for v1 in d1: 
        for v2 in d2:
            if isinstance(v1, dict) and isinstance(v2, dict):
                result_dict=common_items_dict(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
                if len(result_dict)>0 or append_empty:
                    result.append(result_dict)
            elif isinstance(v1, list) and isinstance(v2, list):
                result_list=common_items_list(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
                if len(result_list)>0 or append_empty:
                    result.append(result_list)
            elif v1 == v2:
                result.append(v1)
    return result


def deep_commons(v1,v2, use_set_for_list_commons=True, use_set_for_dict_key_commons=True, append_empty=False):
    """
    deep_commons
     returns intersection of items of dict and list combinations recursively

    this function is a starter function, 
    i.e. if you know that the initial input is always dict then you can use common_items_dict directly
    or if it is a list you can use common_items_list directly

    v1 - dict/list/simple_value
    v2 - dict/list/simple_value
    use_set_for_dict_key_commons - bool - using set is faster, dict key order is not preserved 
    use_set_for_list_commons - bool - using set is faster, list values order not preserved, support only simple data types in list values
    append_empty - bool - if there is a common key but no common items in the value of that key, True keeps the key with an empty list or dict

    """

    if isinstance(v1, dict) and isinstance(v2, dict):
        return common_items_dict(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
    elif isinstance(v1, list) and isinstance(v2, list):
        return common_items_list(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
    elif v1 == v2:
        return v1
    else:
        return None


needed_services={'group1':['item1','item2'],'group3':['item1','item2']}
needed_services2={'group1':['item1','item2'],'group3':['item1','item2']}

result=deep_commons(needed_services,needed_services2)

print(result)

list1=[123,324523,5432,311,23]
list2=[2343254,34234,234,322123,123,234,23]
common=[]
def common_elements(list1,list2):
    for x in range(0,len(list1)):
        if list1[x] in list2:
            common.append(list1[x])
            
common_elements(list1,list2)
print(common)

Use a generator:

common = (x for x in list1 if x in list2)

The advantage here is that creating the generator takes constant time (it is nearly instant) even with huge lists or other huge iterables, because no comparisons happen until you iterate over it.

For example,

list1 =  list(range(0,10000000))
list2=list(range(1000,20000000))
common = (x for x in list1 if x in list2)

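Answers that build the full result up front by scanning a list for every membership test will take a very long time with these values for list1 and list2. Note that the generator only defers that cost: each `x in list2` check is still a linear scan once you start iterating.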

You can then iterate the answer with

for i in common: print(i)
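A variation that keeps the lazy iteration but avoids the linear scan in `x in list2` is to build a set from the second iterable first. The sketch below is written under that assumption. It trades the instant start-up for one pass over the second iterable plus the memory for the set. The name iter_common is hypothetical, and the inputs are smaller than in the example above.

```python
from itertools import islice


def iter_common(iterable1, iterable2):
    # Built eagerly: one pass over iterable2, elements must be hashable.
    lookup = set(iterable2)
    # The matches themselves are still produced lazily, one at a time.
    return (x for x in iterable1 if x in lookup)


common = iter_common(range(0, 100000), range(1000, 200000))
print(list(islice(common, 5)))  # [1000, 1001, 1002, 1003, 1004]
```

Because the result is a generator, it can be consumed only once. Wrap it in list() if the matches are needed again.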
