Python - When duplicate items are identified, which object does a set or frozenset take?
I have a user-defined class MyClass that has a __hash__ and __eq__ implementation that ensures that, for example:
>>> a = MyClass([100, 99, 98, 97])
>>> b = MyClass([99, 98, 97, 100])
>>> a.__hash__() == b.__hash__()
True
>>> a == b
True
Question: if I do the following:
>>> x = [a, b]
>>> set(x)
can I count on set keeping a? Is the set __init__ iterating through x in order? Or do I need to worry about it taking b randomly?
Thanks,
Mike
Hash-based containers like these use both __hash__ and __eq__.
If __hash__ and __eq__ both say two objects are the same, then the first one encountered in the iterable is kept. When the set gets to the next one, it checks whether it already holds an equal element, decides that it does, and discards the newcomer.
>>> class Same(object):
...     def __init__(self, value):
...         self.value = value
...     def __hash__(self):
...         return 42
...     def __eq__(self, other):
...         return True
...     def __repr__(self):
...         return 'Same(%r)' % self.value
>>> set([Same(2), Same(1)])
set([Same(2)])
>>> set([Same(1), Same(2)])
set([Same(1)])
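The repr output above cannot show which object survived when two duplicates print alike, so here is a minimal Python 3 sketch that checks identity with `is` instead. It redefines the same Same class so that it runs on its own. Note that "the first one wins" is what CPython does in practice; I am assuming that behaviour here rather than quoting a language guarantee.

```python
class Same:
    """Every instance hashes alike and compares equal (illustration only)."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 42

    def __eq__(self, other):
        return True

    def __repr__(self):
        return f"Same({self.value!r})"


a, b = Same(1), Same(2)

# The constructor consumes the list left to right: the first object is
# inserted, the second is recognised as a duplicate and dropped.
kept_ab = next(iter(set([a, b])))
kept_ba = next(iter(frozenset([b, a])))

# Adding a duplicate to an existing set leaves the original member in place.
s = {a}
s.add(b)
kept_after_add = next(iter(s))

print(kept_ab is a, kept_ba is b, kept_after_add is a)
```

So for the question's x = [a, b], set(x) should end up holding a, not b.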
With a dict, it becomes more interesting:
>>> {Same(1): 1, Same(2): 2}
{Same(1): 2}
>>> {Same(1): 2, Same(2): 1}
{Same(1): 1}
>>> {Same(2): 1, Same(2): 2}
{Same(2): 2}
>>> {Same(2): 2, Same(2): 1}
{Same(2): 1}
You should be able to guess what's happening here. It stores the first item; the second key has the same hash and compares equal, so the original key object stays in place, but the new value is stored against it. The value is always overwritten, whether or not the old and new values match:
>>> {Same(1): Same(2), Same(3): Same(4)}
{Same(1): Same(4)}
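To make the "first key, last value" behaviour explicit, here is a small Python 3 sketch that checks the stored key by identity. The names k1 and k2 and the string values are made up for illustration, and keeping the original key object is CPython behaviour that I am assuming here.

```python
class Same:
    """Every instance hashes alike and compares equal (illustration only)."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 42

    def __eq__(self, other):
        return True

    def __repr__(self):
        return f"Same({self.value!r})"


k1, k2 = Same(1), Same(2)

# Both keys collide and compare equal, so the dict ends up with one entry.
d = {k1: "first", k2: "second"}

stored_key = next(iter(d))   # the key object the dict actually holds
stored_value = d[stored_key]

print(len(d), stored_key is k1, stored_value)
```

The same thing happens with plain assignment: d[k2] = "third" would replace the value again while k1 stays as the key.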
I hope that helps.
set (and dict) take into account not only the equality of the hashes, but also the equality of the objects themselves.
I believe that set() relies on both __hash__ and __eq__, so you need to override both if you want custom duplicate detection. You could have hash(a) == hash(b) but still have a != b, assuming you defined __eq__ in such a fashion; in that case both objects are kept.
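A quick Python 3 sketch of that last case, using a hypothetical Collide class whose hash is deliberately constant while equality still depends on the payload:

```python
class Collide:
    """Hypothetical class: constant hash, equality based on the payload."""

    def __init__(self, payload):
        self.payload = payload

    def __hash__(self):
        return 1  # deliberately the same for every instance

    def __eq__(self, other):
        return isinstance(other, Collide) and self.payload == other.payload


p, q = Collide("p"), Collide("q")

same_hash = hash(p) == hash(q)   # the hashes collide
different = p != q               # but the objects are not equal
both_kept = len({p, q})          # so the set keeps both

print(same_hash, different, both_kept)
```

A matching hash only tells the set where to look; __eq__ has the final say on whether something is a duplicate.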