Lookup table for unhashables in Python

I need to create a mapping from objects of my own custom class (derived from dict) to objects of another custom class. As I see it there are two ways of doing this:

  1. I can make the objects hashable. I'm not sure how I would do this. I know I can implement __hash__() but I'm unsure how to actually calculate the hash (which should be an integer).

  2. Since my objects can be compared I can make a list [(myobj, myotherobj)] and then implement a lookup which finds the tuple where the first item in the tuple is the same as the lookup key. Implementing this is trivial (the number of objects is small) but I want to avoid reinventing the wheel if something like this already exists in the standard library.
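For reference, the second option can be sketched in a few lines. This is only an illustration: AssocList is a hypothetical name, not something from the standard library, and every lookup is a linear scan, so it is suitable only for small tables.

```python
class AssocList:
    """Tiny lookup table for unhashable keys; relies only on ==."""

    def __init__(self):
        self._pairs = []  # list of (key, value) tuples

    def __setitem__(self, key, value):
        # Overwrite the value if an equal key is already stored.
        for i, (k, _) in enumerate(self._pairs):
            if k == key:
                self._pairs[i] = (key, value)
                return
        self._pairs.append((key, value))

    def __getitem__(self, key):
        # Linear scan: O(n) per lookup.
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)

    def __len__(self):
        return len(self._pairs)


table = AssocList()
table[{"a": 1}] = "first"    # plain dicts are unhashable but comparable
table[{"b": 2}] = "second"
table[{"a": 1}] = "updated"  # an equal key replaces the earlier entry
```

Keys are matched by equality, so an equal but distinct dict finds the same entry.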

It seems to me that wanting to look up unhashables would be a common problem, so I assume someone has already solved it. Any suggestions on how to implement __hash__() for a dict-like object, or is there some other standard way of making lookup tables of unhashables?


Mappings with mutable objects as keys are generally difficult: if a key changes after it has been inserted, its hash changes too and the entry can no longer be found. Is that really what you want? If you consider your objects to be immutable (there is no way to really enforce immutability in Python), or you know they will not be changed while they are used as keys in a mapping, you can implement your own hash function for them in several ways. For instance, if your object has only hashable data members, you can return the hash of a tuple of all data members as the object's hash.
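As a minimal sketch of the tuple-of-members idea, assuming a made-up class Point with two hashable attributes (the class and attribute names are purely illustrative):

```python
class Point:
    """Hypothetical class whose data members are all hashable."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal objects must hash equally, so hash the same
        # members that __eq__ compares.
        return hash((self.x, self.y))


lookup = {Point(1, 2): "some value"}
found = lookup[Point(1, 2)]  # an equal but distinct object finds the entry
```

The important part is that __eq__ and __hash__ use the same members; otherwise equal objects could land in different buckets.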

If your object is dict-like, you can use the hash of a frozenset of all key-value pairs.

def __hash__(self):
    # Python 3 spelling; on Python 2, self.iteritems() avoids building a list.
    return hash(frozenset(self.items()))

This only works if all values are hashable. In order to save recalculations of the hashes (which would be done on every lookup), you can cache the hash-value and just recalculate it if some dirty-flag is set.
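The caching idea could look roughly like the sketch below. HashableDict is a hypothetical name, and None stands in for the dirty flag. Only __setitem__ and __delitem__ are intercepted here; a real class would also have to cover update(), pop(), clear() and the other mutators.

```python
class HashableDict(dict):
    """dict subclass with a cached hash (illustrative sketch only)."""

    def __init__(self, *args, **kwargs):
        self._hash = None  # None means "dirty": recompute on next hash()
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._hash = None  # contents changed, invalidate the cache

    def __delitem__(self, key):
        super().__delitem__(key)
        self._hash = None

    def __hash__(self):
        if self._hash is None:
            # Requires every value (not just every key) to be hashable.
            self._hash = hash(frozenset(self.items()))
        return self._hash


d = HashableDict(a=1)
lookup = {d: "value"}
found = lookup[HashableDict(a=1)]  # equal contents give an equal hash
```

Even with the cache, changing an object while it is in use as a key leaves it stored under its old hash, so the warning about mutable keys still applies.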


A simple solution seems to be to do lookup[id(myobj)] = myotherobj instead of lookup[myobj] = myotherobj. Any comments on this approach?
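One thing to keep in mind with id() is that it keys on identity rather than equality. A small sketch, with MyDict as a stand-in for the custom class:

```python
class MyDict(dict):
    """Stand-in for the custom dict-derived class."""


a = MyDict(x=1)
b = MyDict(x=1)  # equal to a, but a different object

lookup = {id(a): "payload"}

equal = (a == b)               # True: same contents
hit_a = id(a) in lookup        # True: the very same object
hit_b = id(b) in lookup        # False: equal contents are not enough
```

The key objects also have to be kept alive for as long as the table is used, because an id can be reused once the original object has been garbage collected.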


The following should work if you're not storing any additional unhashable objects in your custom class:

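def __hash__(self):
    # items() itself is unhashable (a list in Python 2, a view in Python 3),
    # so convert it to an immutable, order-independent frozenset first.
    return hash(frozenset(self.items()))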


Here is an implementation of a frozendict, taken from http://code.activestate.com/recipes/414283/:

class frozendict(dict):
    def _blocked_attribute(obj):
        raise AttributeError("A frozendict cannot be modified.")
    _blocked_attribute = property(_blocked_attribute)

    __delitem__ = __setitem__ = clear = _blocked_attribute
    pop = popitem = setdefault = update = _blocked_attribute

    def __new__(cls, *args):
        new = dict.__new__(cls)
        dict.__init__(new, *args)
        return new

    def __init__(self, *args):
        pass

    def __hash__(self):
        try:
            return self._cached_hash
        except AttributeError:
            h = self._cached_hash = hash(tuple(sorted(self.items())))
            return h

    def __repr__(self):
        return "frozendict(%s)" % dict.__repr__(self)

I would replace tuple(sorted(self.items())) with frozenset(self.items()) (frozenset(self.iteritems()) on Python 2), as in Spacecowboy's answer. Also consider adding __slots__ = ("_cached_hash",) to the class.
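A short sketch of why the frozenset variant is more robust, assuming Python 3: sorting requires the items to be mutually orderable, while a frozenset only needs them to be hashable. The sample data is made up for illustration.

```python
items = {1: "a", "b": 2}.items()  # keys of mixed, non-orderable types

try:
    hash(tuple(sorted(items)))
    sorted_ok = True
except TypeError:
    # On Python 3, comparing 1 with "b" during the sort raises TypeError.
    sorted_ok = False

# frozenset needs no ordering and ignores insertion order.
h1 = hash(frozenset(items))
h2 = hash(frozenset({"b": 2, 1: "a"}.items()))
```

Both approaches still require all values to be hashable.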
