How is membership testing different for a list and a set?
I'm having trouble figuring out why the first of these assertions passes and the second raises an error.
subject_list = [Subject("A"), Subject("B"), Subject("C")]
subject_set = set()
subject_set.add(Subject("A"))
subject_set.add(Subject("B"))
subject_set.add(Subject("C"))
self.assertIn(Subject("A"), subject_list)
self.assertIn(Subject("A"), subject_set)
Here is the error:
Traceback (most recent call last):
File "C:\Users\...\testSubject.py", line 34, in testIn
self.assertIn(Subject("A"), subject_set)
AssertionError: <Subject: A> not found in set([<Subject: B>, <Subject: C>, <Subject: A>])
The test for equality in the Subject class is simply self.name == other.name, and in another unit test I verify that Subject("A") == Subject("A"). I really can't figure out why the subject is in the list and not in the set. Ideally I'd like the subject to be in both.
The expression
Subject("A") in subject_list
compares Subject("A") to each entry in subject_list using the Subject.__eq__() method. If this method is not overridden, it defaults to returning False unless the two operands are the same object. Without an __eq__() method on Subject, the above expression would always be False, since Subject("A") is a new instance that cannot already be in the list.
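A minimal sketch of the list case, assuming two hypothetical classes (WithEq and WithoutEq are illustrative names, not from the question). It only shows that list membership depends on __eq__ and never consults the hash:

```python
class WithEq:
    """Hypothetical class that compares by name."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, WithEq) and self.name == other.name


class WithoutEq:
    """Hypothetical class that keeps the default identity-based comparison."""
    def __init__(self, name):
        self.name = name


# A fresh instance is found in the list only when __eq__ compares by value.
found_with_eq = WithEq("A") in [WithEq("A"), WithEq("B")]
found_without_eq = WithoutEq("A") in [WithoutEq("A"), WithoutEq("B")]

print(found_with_eq, found_without_eq)
```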
The expression
Subject("A") in subject_set
on the other hand uses Subject.__hash__() first to find the right bucket, and calls Subject.__eq__() only after that. If you did not define Subject.__hash__() in a way compatible with Subject.__eq__(), the lookup will fail.
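To make the hash-then-equality order concrete, here is a small sketch. BrokenSubject is a hypothetical stand-in for the class in the question, and its id-based __hash__ is an assumption chosen to mimic a hash that ignores the name:

```python
class BrokenSubject:
    """Hypothetical stand-in: __eq__ compares names, __hash__ does not."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, BrokenSubject) and self.name == other.name

    def __hash__(self):
        # Assumption: an identity-based hash, unrelated to the name.
        return id(self)


stored = BrokenSubject("A")
probe = BrokenSubject("A")

in_list = probe in [stored]   # uses equality only
in_set = probe in {stored}    # compares hashes first, then equality

print(probe == stored, in_list, in_set)
```

The two objects compare equal, but their hashes differ, so the set lookup gives up before __eq__() is ever called.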
Membership in a set also depends on the object's hash, so you must implement the __hash__() method on the class appropriately.
Either you don't have a __hash__() method in your Subject class, or it is dodgy. Try this:
def __hash__(self):
    return hash(self.name)
The docs are here.
To use these in a set, you have to make sure Subject is properly hashable. If you do not define __hash__ yourself, the default hash is derived from the object's id, which is different for different instances. __hash__ should be defined such that equal objects have equal hashes.
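Putting it together, a sketch of what the class might look like. The question does not show the real Subject, so the constructor and __repr__ here are assumptions reconstructed from the snippets and the traceback. Note that in Python 3, a class that defines __eq__ without __hash__ becomes unhashable, so adding it to a set raises TypeError rather than silently missing.

```python
class Subject:
    """Assumed shape of the class from the question."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Subject):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        # Hash the same attribute that __eq__ compares,
        # so equal objects always have equal hashes.
        return hash(self.name)

    def __repr__(self):
        return "<Subject: %s>" % self.name


a1, a2 = Subject("A"), Subject("A")
hashes_agree = hash(a1) == hash(a2)

subject_list = [Subject("A"), Subject("B"), Subject("C")]
subject_set = set(subject_list)
in_both = Subject("A") in subject_list and Subject("A") in subject_set

print(hashes_agree, in_both)
```

Since the hash depends on name, avoid changing name while the object is stored in a set or used as a dict key, or later lookups may miss it.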