
Is it incorrect to define the hash code of an object as the sum, multiplication, or some other combination of all its class variables' hash codes?

Let's say I have the following class:

class ABC {
    private int myInt = 1;
    private double myDouble = 2;
    private String myString = "123";
    private SomeRandomClass1 myRandomClass1 = new ...
    private SomeRandomClass2 myRandomClass2 = new ...

    //pseudo code
    public int myHashCode() {
        return 37 *
               myInt.hashCode() *
               myDouble.hashCode() *
               ... *
               myRandomClass2.hashCode();
    }
}

Would this be a correct implementation of hashCode? This is not how I usually do it (I tend to follow Effective Java's guidelines), but I am always tempted to just do something like the above.

Thanks


It depends what you mean by "correct". Assuming that you're using the hashCode() of all the relevant equals()-defining fields, then yes, it's "correct". However, such formulas probably will not have a good distribution, and would therefore likely cause more collisions than necessary, which has a detrimental effect on hash table performance.
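The distribution difference is easy to see empirically. A small sketch (productHash and blochHash are hypothetical combiners written for this comparison, not real APIs) that counts distinct hash values over all pairs of small ints:

```java
import java.util.HashSet;
import java.util.Set;

public class HashDistributionDemo {
    // Multiply-only combiner, as in the question:
    static int productHash(int a, int b) { return 37 * a * b; }

    // Bloch-style combiner: result = 31 * result + c for each field
    static int blochHash(int a, int b) {
        int result = 17;
        result = 31 * result + a;
        result = 31 * result + b;
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> product = new HashSet<>();
        Set<Integer> bloch = new HashSet<>();
        for (int a = 0; a < 100; a++) {
            for (int b = 0; b < 100; b++) {
                product.add(productHash(a, b));
                bloch.add(blochHash(a, b));
            }
        }
        // The product collapses many pairs (zeros annihilate, order is lost),
        // so it produces noticeably fewer distinct values:
        System.out.println("product-only distinct hashes: " + product.size());
        System.out.println("Bloch-style distinct hashes:  " + bloch.size());
    }
}
```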

Here's a quote from Effective Java 2nd Edition, Item 9: Always override hashCode when you override equals

While the recipe in this item yields reasonably good hash functions, it does not yield state-of-the-art hash functions, nor do Java platform libraries provide such hash functions as of release 1.6. Writing such hash functions is a research topic, best left to mathematicians and computer scientists. [...Nonetheless,] the techniques described in this item should be adequate for most applications.

It may not require a lot of mathematical power to evaluate how good your proposed hash function is, but why even bother? Why not just follow something that has been anecdotally proven to be adequate in practice?

Josh Bloch's recipe

  • Store some constant nonzero value, say 17, in an int variable called result.
  • Compute an int hashcode c for each field:
    • If the field is a boolean, compute (f ? 1 : 0)
    • If the field is a byte, char, short, int, compute (int) f
    • If the field is a long, compute (int) (f ^ (f >>> 32))
    • If the field is a float, compute Float.floatToIntBits(f)
    • If the field is a double, compute Double.doubleToLongBits(f), then hash the resulting long as described above for a long field.
    • If the field is an object reference and this class's equals method compares the field by recursively invoking equals, recursively invoke hashCode on the field. If the value of the field is null, return 0.
    • If the field is an array, treat it as if each element is a separate field. If every element in an array field is significant, you can use one of the Arrays.hashCode methods added in release 1.5.
  • Combine the hashcode c into result as follows: result = 31 * result + c;
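Applied to the question's ABC class, the recipe might look like the following sketch (SomeRandomClass fields omitted for brevity; the class name AbcHash is just a stand-in):

```java
class AbcHash {
    private int myInt = 1;
    private double myDouble = 2;
    private String myString = "123";

    @Override public int hashCode() {
        int result = 17;                                      // nonzero starting constant
        result = 31 * result + myInt;                         // int field: use the value itself
        long bits = Double.doubleToLongBits(myDouble);        // double field: convert to long bits
        result = 31 * result + (int) (bits ^ (bits >>> 32));  // fold the long into an int
        result = 31 * result + (myString == null ? 0 : myString.hashCode()); // reference: recurse, null -> 0
        return result;
    }
}
```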

Now, of course that recipe is rather complicated, but luckily, you don't have to reimplement it every time, thanks to java.util.Arrays.hashCode(Object[]) (and com.google.common.base.Objects provides a convenient vararg variant).

@Override public int hashCode() {
    return Arrays.hashCode(new Object[] {
           myInt,    //auto-boxed
           myDouble, //auto-boxed
           myString,
           myRandomClass1,
           myRandomClass2,
    });
}
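As of Java 7, java.util.Objects.hash provides the same varargs convenience in the JDK itself (it boxes primitives, handles nulls, and delegates to Arrays.hashCode). A minimal sketch using a subset of the question's fields:

```java
import java.util.Objects;

class AbcObjects {
    private int myInt = 1;
    private double myDouble = 2;
    private String myString = "123";

    @Override public int hashCode() {
        // Objects.hash(Object...) boxes the primitives and combines them
        // exactly as Arrays.hashCode(Object[]) would.
        return Objects.hash(myInt, myDouble, myString);
    }
}
```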

See also

  • Object.hashCode()

    It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.


Doing this sort of thing is allowed by the contract. But so is always returning 1. (There is even a compile-time flag in HotSpot that makes the identity hash code always return 1.) Such choices will, however, produce poor performance.

There is a particular problem with multiplication. Not only will a 0 hash value from any component annihilate the whole product, but components that are powers of two will progressively zero out the lower bits.

Commutative operators have the further problem that rearrangements of the component values collide.

And if there is a particular relationship between the hash values of components, addition is especially bad: (4, 6) and (2, 8) clash, for instance.
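All three pitfalls can be demonstrated directly; a small sketch (productHash and sumHash are hypothetical combiners for illustration):

```java
public class HashPitfalls {
    static int productHash(int a, int b) { return 37 * a * b; }
    static int sumHash(int a, int b) { return a + b; }

    public static void main(String[] args) {
        // A zero component annihilates the entire product:
        System.out.println(productHash(0, 12345) == productHash(0, 67890)); // true

        // A power-of-two component zeroes the low bits of the product
        // (37 * 16 * 3 = 1776, whose low four bits are all zero):
        System.out.println(Integer.toBinaryString(productHash(16, 3)));

        // Related component values collide under plain addition:
        System.out.println(sumHash(4, 6) == sumHash(2, 8)); // true
    }
}
```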


No, but in practice it is almost certainly not a good idea. Most importantly, you must not modify any field that the hash code depends on while the object is stored in a hash-based collection; in effect, those fields have to stay constant.

If you modify one, this can happen: you insert the object into a HashSet, you change a field, and then test whether the object is in the HashSet. Although it is in there, the HashSet will not find it, because the hash code changed!
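That scenario can be reproduced with a short sketch (Point is a hypothetical class whose hashCode depends on a mutable field):

```java
import java.util.HashSet;
import java.util.Set;

public class MutableKeyDemo {
    static class Point {
        int x;
        Point(int x) { this.x = x; }
        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x;
        }
        @Override public int hashCode() { return x; } // depends on a mutable field
    }

    public static void main(String[] args) {
        Set<Point> set = new HashSet<>();
        Point p = new Point(1);
        set.add(p);          // stored in the bucket for hash code 1
        p.x = 2;             // mutate the field the hash code depends on
        System.out.println(set.contains(p)); // false: the wrong bucket is searched
    }
}
```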


I'm probably much too late to this party, but.....

The problem with your hash function can be described by the commutative law of multiplication (see the Khan Academy article on commutativity).

That is: a * b * c == c * b * a.

So by only multiplying, you discard the order of the fields, which makes collisions much more likely.
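A toy illustration of the point: a hypothetical hash that only multiplies character values together maps all anagrams of a string to the same value, whereas an order-sensitive hash like String.hashCode distinguishes them.

```java
public class OrderMattersDemo {
    // Hypothetical commutative string hash: product of the character values.
    static int productHash(String s) {
        int h = 1;
        for (char c : s.toCharArray()) h *= c;
        return h;
    }

    public static void main(String[] args) {
        // Anagrams always collide under the commutative hash:
        System.out.println(productHash("abc") == productHash("cba")); // true
        // String.hashCode is order-sensitive, so the anagrams differ:
        System.out.println("abc".hashCode() == "cba".hashCode()); // false
    }
}
```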


Seems to me that unless you can guarantee that the product is a prime number, you might get collisions (though probably rare) between the resulting hash codes of different objects.
