Is it a good idea to generate and store a random number as a hash value for my class?

2023-02-12 12:12 问答作者：

I have a clas开发者_开发知识库s that essentially wraps a list, and lists apparently can't have hash values. My idea was to generate a random number and store that as the hash value.

If you have two instances that are equivalent then they must return the same __hash__ value. If you generate one randomly you can't guarantee this will be the case, so you will get strange behaviour.

Can you use a tuple instead of a list? Can you just ignore the list in the computation of the hash? It's OK if non-equal objects have a collision.

You should see DictionaryKeys about why lists aren't allowed to have hash values.

Not a good idea. The general contract of hash code is that if Object A equals Object B, A.hashCode() equals B.hashCode(). With what you're proposing this wont hold.

You could try using

list length as the hash code
hash code of the first item in the list as the hashcode
sum of all hashcodes of all items as the hashcode

or something else along these lines.

As mentioned by others, a if 2 objects are considered "equal" then they must have the same hash value.

How you define equals for your class is up to you. If you only care about referential equality then you could generate a random number on __init__, but it better be the case that MyWrapper([1,2,3]) == MyWrapper([1,2,3]) is False.

The reason you shouldn't use the contents of the list as a hash like @iluxa suggests is because if you use your class as a key in a dictionary, then change the contents such that the hash value changes, it would not be able to find that key in the dictionary because it stored the old hash value and is trying to look up the new one.

To sum up:

If a == b then hash(a) == hash(b) must be true.
If a != b then hash(a) != hash(b) should be true most of the time for performance of lookups, but it is not required to be the case.
The value of a hash should not change between the time of adding it to a dict (or any other hash-based lookup structure that doesn't recalculate hash values on look-up) and trying to find it.

The last part is often just simplified to the value of the hash should not change ever.

It is reasonable to support hashing for mutable containers - but the general rule is that you're hashing the current value, not the container.

EDIT That's reasonable in principle, though Python generally guards against it.

There's even a specific hashing technique for this kind of task. Zobrist hashing calculates a hash incrementally, based on each change to the container, in such a way that no matter how you reach a particular value you get the same hash. This is used in chess programs to detect cases where the board reaches the same state through different sequences of moves, as an optimisation for the game tree searching.

The question here is why do you want to find a list using a hash-based lookup?

If the value varies, but you want to find the same container anyway, then this may work (with collision issues) - but it still seems pointless. The random hash value is just a kind of identifier, identifying which object you're referring to. The reference/pointer to the object is a better identifier. And how will you know which hash value to search for if you don't already have a reference to the object anyway?

If you want a collection of containers, a list of lists is a better solution than a dictionary of randomly-hashed lists.

EDIT A possible exception is if you want a collection of lists where the same list may be added more than once, but will only be present once in the container - a set of list instances. In that case, the hash should IMO be based on the reference to the object, but I can't remember how that's done.

继续阅读：hash python

Is it a good idea to generate and store a random number as a hash value for my class?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？