
How is membership testing different for a list and a set?

I'm having trouble figuring out why the first of these assertions passes and the second raises an error.

subject_list = [Subject("A"), Subject("B"), Subject("C")]
subject_set = set()
subject_set.add(Subject("A"))
subject_set.add(Subject("B"))
subject_set.add(Subject("C"))

self.assertIn(Subject("A"), subject_list)
self.assertIn(Subject("A"), subject_set)

Here is the error:

Traceback (most recent call last):
  File "C:\Users\...\testSubject.py", line 34, in testIn
    self.assertIn(Subject("A"), subject_set)
AssertionError: <Subject: A> not found in set([<Subject: B>, <Subject: C>, <Subject: A>])

The test for equality in the Subject class is simply self.name == other.name, and in another UnitTest I verify that Subject("A") == Subject("A"). I really can't figure out why the subject is in the list and not in the set. Ideally I'd like the subject to be in both.


The expression

Subject("A") in subject_list

will compare Subject("A") to each entry in subject_list using the Subject.__eq__() method. If this method is not overridden, it defaults to returning False unless the two operands are the same object. So if Subject lacked an __eq__() method, the above expression would always return False, since Subject("A") is a new instance that cannot already be in the list.
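As a rough illustration of the list case, here is a minimal sketch. The classes NoEq and WithEq are hypothetical stand-ins, not the asker's Subject class; the only difference between them is whether __eq__() is defined.

```python
class NoEq(object):
    """Hypothetical class that keeps the default, identity-based equality."""
    def __init__(self, name):
        self.name = name


class WithEq(object):
    """Hypothetical class that compares by name, like the asker's Subject."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, WithEq):
            return NotImplemented
        return self.name == other.name


same = NoEq("A")

# A fresh instance is never equal to a different instance by default.
print(NoEq("A") in [NoEq("A")])

# The very same object is found, because identity implies a match.
print(same in [same])

# With __eq__ defined, a fresh instance matches an equal list entry.
print(WithEq("A") in [WithEq("A")])
```

List membership never looks at the hash, so WithEq works in a list even though it defines no __hash__() method.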

The expression

Subject("A") in subject_set

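by contrast, will use Subject.__hash__() first to find the right bucket, and only then use Subject.__eq__() on the candidates found there. If you did not define Subject.__hash__() in a way compatible with Subject.__eq__(), that is, so that equal objects return equal hashes, the lookup will most likely go to the wrong bucket and report the object as missing.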
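The order of those two calls can be observed directly. The sketch below assumes Python 3 and uses a hypothetical TracedSubject class that logs each call to __hash__() and __eq__(); the logging is only there for demonstration.

```python
calls = []  # records which special methods the set lookup invokes


class TracedSubject(object):
    """Hypothetical Subject-like class that logs hash and equality calls."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        calls.append("__eq__")
        return isinstance(other, TracedSubject) and self.name == other.name

    def __hash__(self):
        calls.append("__hash__")
        return hash(self.name)


subject_set = {TracedSubject("A")}  # adding to the set hashes the element
del calls[:]                        # start recording from a clean slate

# Look up a fresh, equal instance: the probe is hashed first, and __eq__
# is only called on the stored entry whose hash matches.
found = TracedSubject("A") in subject_set
print(found, calls)
```

If __hash__() returned something unrelated to name, the second step would never reach the matching entry, which is the failure described in the question.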


Membership in a set also depends on the object's hash, and as such you must implement the __hash__() method on the class appropriately.


Either you don't have a __hash__() method in your Subject class, or it is dodgy. Try this:

def __hash__(self):
    return hash(self.name)

See the documentation for object.__hash__() in the Python data model reference.


To use these in a set, you have to make sure Subject is properly hashable. If you do not define __hash__ yourself, the default hash is derived from the object's id, and that is different for different instances. __hash__ should be defined such that equal objects have equal hashes.
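Putting the answers together, a Subject class along these lines should pass both membership tests. This is a sketch under assumptions: the question does not show the real class, so the constructor and __repr__() here are guesses based on the traceback, and the isinstance check in __eq__() is an addition. Note also that in Python 3 a class that defines __eq__() without __hash__() becomes unhashable, so adding it to a set raises TypeError instead of silently failing the lookup.

```python
class Subject(object):
    """Assumed shape of the asker's class; only name-based equality is known."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Subject):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        # Hash the same attribute that __eq__ compares, so that
        # equal subjects always produce equal hashes.
        return hash(self.name)

    def __repr__(self):
        return "<Subject: %s>" % self.name


subject_list = [Subject("A"), Subject("B"), Subject("C")]
subject_set = set(subject_list)

print(Subject("A") in subject_list)
print(Subject("A") in subject_set)
```

Because the hash depends on name, name should not be changed while a Subject is stored in a set or used as a dict key; otherwise the object ends up filed under a stale hash.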
