Compare Dictionary Values for each key in the same Dictionary in Python
Update:
Hello again. My question is, how can I compare values of an dictionary for equality. More Informationen about my Dictionary:
- keys are session numbers
values of each key are nested lists -> f.e.
[[1,0],[2,0],[3,1]]
the length of values for each key arent the same, so it could be that session number 1 have more values then session number 2
- here an example dictionary:
order_session = {1:[[100,0],[22,1],[23,2]],10:[100,0],[232,0],[10,2],[11,2]],22:[[5,2],[23,2],....], ... }
My Goal:
Step 1: to compare the values of session number 1 with the values of the whole other session numbers in the dictionary for equality
Step 2: take the next session number and compare the values with the other values of the other session numbers, and so on - finally we have each session number value compared
Step 3: save the result into a list f.e. output = [[100,0],[23,2], ... ] or output = [(100,0),(23,2), ... ]
- if you can see a value-pair 开发者_运维技巧[100,0] of session 1 and 10 are the same. also the value-pair [23,2] of session 1 and 22 are the same.
Thanks for helping me out.
Update 2
Thank you for all your help and tips to change the nested list of lists into list of tuples, which are quite better to handle it.
I prefer Boaz Yaniv solution ;) I also like the use of collections.Counter() ... unlucky that I use 2.6.4 (Counter works at 2.7) maybe I change to 2.7 sometimes.
If your dictionary is long, you'd want to use sets, for better performance (looking up already-encountered values in lists is going to be quite slow):
def get_repeated_values(sessions):
known = set()
already_repeated = set()
for lst in sessions.itervalues():
session_set = set(tuple(x) for x in lst)
repeated = (known & session_set) - already_repeated
already_repeated |= repeated
known |= session_set
for val in repeated:
yield val
sessions = {1:[[100,0],[22,1],[23,2]],10:[[100,0],[232,0],[10,2],[11,2]],22:[[5,2],[23,2]]}
for x in get_repeated_values(sessions):
print x
I also suggest (again, for performance reasons) to nest tuples inside your lists instead of lists, if you're not going to change them on-the-fly. The code I posted here will work either way, but it would be faster if the values are already tuples.
There's probably a nicer and more optimal way to do this, but I'd work my way from here:
seen = []
output = []
for val in order_session.values():
for vp in val:
if vp in seen:
if not vp in output:
output.append(vp)
else:
seen.append(vp)
print(output)
Basically, what this does is to look through all the values, and if the value has been seen before, but not output before, it is appended to the output.
Note that this works with the actual values of the value pairs - if you have objects of various kinds that result in pointers, my algorithm might fail (I haven't tested it, so I'm not sure). Python re-uses the same object reference for "low" integers; that is, if you run the statements a = 5
and b = 5
after each other, a
and b
will point to the same integer object. However, if you set them to, say, 10^5, they will not. But I don't know where the limit is, so I'm not sure if this applies to your code.
>>> from collections import Counter
>>> D = {1:[[100,0],[22,1],[23,2]],
... 10:[[100,0],[232,0],[10,2],[11,2]],
... 22:[[5,2],[23,2]]}
>>> [k for k,v in Counter(tuple(j) for i in D.values() for j in i).items() if v>1]
[(23, 2), (100, 0)]
If you really really need a list of lists
>>> [list(k) for k,v in Counter(tuple(j) for i in D.values() for j in i).items() if v>1]
[[23, 2], [100, 0]]
order_session = {1:[[100,0],[22,1],[23,2]],10:[[100,0],[232,0],[10,2],[11,2]],22:[[5,2],[23,2],[80,21]],}
output = []
for pair in sum(order_session.values(), []):
if sum(order_session.values(), []).count(pair) > 1 and pair not in output:
output.append(pair)
print output
...
[[100, 0], [23, 2]]
精彩评论