Python: `key not in my_dict` but `key in my_dict.keys()`
I have a weird situation. I have a dict, self.containing_dict. Using the debug probe, I can see the dict's contents, and self is one of its keys. But look at this:
>>> self in self.containing_dict
False
>>> self in self.containing_dict.keys()
True
>>> self.containing_dict.has_key(self)
False
What's going on?
(I will note that this is in a piece of code which gets executed on a weakref callback.)
Update: I was asked to show the __hash__ implementation of self. Here it is:
def __hash__(self):
    return hash(
        (
            tuple(sorted(tuple(self.args))),
            self.star_args,
            tuple(sorted(tuple(self.star_kwargs)))
        )
    )

args = property(lambda self: dict(self.args_refs))
star_args = property(
    lambda self:
        tuple((star_arg_ref() for star_arg_ref in self.star_args_refs))
)
star_kwargs = property(lambda self: dict(self.star_kwargs_refs))
The problem you describe can only be caused by self implementing __eq__ (or __cmp__) without a matching __hash__. If you didn't implement a __hash__ method, you should do so. Normally you can't use objects that define __eq__ but not __hash__ as dict keys, but if the class inherits a __hash__, that may slip by.
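To illustrate the first case, here is a minimal sketch using a hypothetical Point class (not from the question). It assumes Python 3, where defining __eq__ without __hash__ sets __hash__ to None, so the object is rejected as a key. Under Python 2, which the has_key call in the question suggests, a new-style class would instead inherit the identity-based default hash, and that is the case that can slip by.

```python
class Point(object):
    """Hypothetical class: defines __eq__ but no __hash__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


def try_as_key(obj):
    """Return True if obj can be used as a dict key, False if it is unhashable."""
    try:
        {obj: None}
        return True
    except TypeError:
        return False


# On Python 3 this prints False: Point instances are unhashable.
print(try_as_key(Point(1, 2)))
```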
If you do implement __hash__, you have to make sure it behaves correctly: the result must not change over the lifetime of the object (or at least for as long as the object is in use as a dict key or set item), and it must be consistent with __eq__. An object's hash value must be the same as that of every object it is equal to (according to its __eq__ or __cmp__). An object's hash value may differ from that of objects it's not equal to, but it doesn't have to. These requirements also mean that the result of __eq__ cannot change over the lifetime of the object, which is why mutable objects usually can't be used as dict keys.
If your __hash__ and __eq__ are not matched up, Python won't be able to find the object in dicts and sets, but it will still show up in dict.keys() and list(set), which is what you're describing here. The usual way to implement a __hash__ method is to return the hash() of whatever attributes you use in your __eq__ or __cmp__ method.
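A mismatch of this kind can be reproduced with a small sketch. The Loose class below is hypothetical and deliberately broken: it compares by value but hashes by identity. The sketch uses list(d) for the linear scan because on Python 3 keys() returns a view that also does a hash lookup, whereas Python 2's keys() returned a plain list.

```python
class Loose(object):
    """Hypothetical class: equality by value, hash by identity (inconsistent)."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Loose) and self.value == other.value

    def __hash__(self):
        # Deliberately inconsistent with __eq__:
        # two equal objects get different hashes.
        return id(self)


stored = Loose(1)
probe = Loose(1)   # equal to `stored`, but a distinct object
d = {stored: "payload"}

in_dict = probe in d             # hash-based lookup
in_key_list = probe in list(d)   # linear scan comparing with ==

print(in_dict, in_key_list)  # False True
```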
Judging from your __hash__ method, the class stores references to its arguments and computes its hash from them. The problem is that those arguments are shared with the code that constructed the object. If that code changes an argument, the hash changes too, and you won't be able to find the object in any dictionary it was already in.
The arguments need not be anything complicated, just a simple list will do.
In [13]: class Spam(object) :
   ....:     def __init__(self, arg) :
   ....:         self.arg = arg
   ....:     def __hash__(self) :
   ....:         return hash(tuple(self.arg,))
In [18]: l = range(5)
In [19]: spam = Spam(l)
In [20]: hash(spam)
Out[20]: -3958796579502723947
If I change the list that I passed as an argument, the hash will change.
In [21]: l += [10]
In [22]: hash(spam)
Out[22]: -6439366262097674983
Since dictionary keys are organized by hash, when I do x in d, the first thing Python does is compute the hash of x and look in the dictionary for an entry with that hash value. The problem is that when an object's hash changes after it has been put in the dictionary, Python looks under the new hash value and does not find the key there. Searching the list of keys instead forces Python to check each key by equality, bypassing the hash lookup.
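The failed lookup itself can be shown with a variant of the Spam example. This is only a sketch: SumSpam is a hypothetical class whose hash is simply the sum of its list, chosen so the outcome is predictable on CPython. With an arbitrary hash, the stale lookup can occasionally still succeed if the old and new hash happen to probe the same slot. As assumptions, it targets Python 3 and therefore uses list(d) for the key-by-key scan.

```python
class SumSpam(object):
    """Hypothetical variant of Spam with a deliberately simple hash."""

    def __init__(self, arg):
        self.arg = arg  # keeps a reference to the caller's list, no copy

    def __hash__(self):
        return sum(self.arg)


l = [1]
spam = SumSpam(l)
d = {spam: "value"}      # stored under hash 1
hash_before = hash(spam)

l += [1]                 # mutate the shared list in place; hash becomes 2

hash_changed = hash(spam) != hash_before
found_by_hash = spam in d        # looks under the new hash
found_by_scan = spam in list(d)  # compares against each key

print(hash_changed, found_by_hash, found_by_scan)  # True False True
```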
Most likely you have a custom hash and comparison defined for whatever class self is an instance of, and you mutated self after you added it to the dictionary.
If you use a mutable object as a dictionary key, then after you mutate it you may no longer be able to look it up in the dictionary, but it will still appear in the keys() result.
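One possible way to avoid this, sketched under assumptions, is to have the object take an immutable snapshot of its arguments at construction and derive both __eq__ and __hash__ from that snapshot. FrozenSpam is a hypothetical class, not the asker's code, and whether a snapshot is acceptable for the weakref-based class in the question depends on its design.

```python
class FrozenSpam(object):
    """Hypothetical variant of Spam that copies its argument at construction."""

    def __init__(self, arg):
        # Private immutable snapshot; later changes to `arg` do not affect it.
        self._arg = tuple(arg)

    def __eq__(self, other):
        if not isinstance(other, FrozenSpam):
            return NotImplemented
        return self._arg == other._arg

    def __hash__(self):
        # Hash exactly the data that __eq__ compares.
        return hash(self._arg)


l = list(range(5))
spam = FrozenSpam(l)
d = {spam: "value"}
hash_before = hash(spam)

l += [10]  # the caller mutates its own list

print(hash(spam) == hash_before, spam in d)  # True True
```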