Rationale in selecting Hash Key type

2022-12-26 23:07 Q&A author:

Guys, I have a data structure that has 25 distinct integer keys and a value. I have a list of these objects (say 50000) and I intend to use a hash table to store and retrieve them. I am planning to take one of these approaches.

Create an integer hash from these 25 integer keys and store it in a hash table. (Yeah! I have some means to handle collisions)
Concatenate the individual keys into a string and use it as the hash key for the hash table. For example, if the key values are 1,2,4,6,7 then the hash key would be "12467".
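One thing worth checking about the second approach is whether the concatenated string is unique per key set. The sketch below is only an illustration: the function names and the ',' separator are my own assumptions, not part of the original design. It shows that concatenating without a separator lets two different key sequences produce the same string.

```cpp
#include <string>
#include <vector>

// Concatenates the keys with no separator, as in the "12467" example.
inline std::string concat_plain(const std::vector<int>& keys) {
    std::string out;
    for (int k : keys) {
        out += std::to_string(k);
    }
    return out;
}

// Same idea with a separator after every key (hypothetical choice: ','),
// so that different key sequences cannot collapse into the same string.
inline std::string concat_delimited(const std::vector<int>& keys) {
    std::string out;
    for (int k : keys) {
        out += std::to_string(k);
        out += ',';
    }
    return out;
}
```

With these helpers, the keys 1,2,4,6,7 and the keys 12,4,6,7 both give "12467" under concat_plain, while concat_delimited keeps them apart.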

Assuming that I have a total of 50000 records, each with 25 distinct keys and a value, will my second approach be overkill given the cost of the string comparisons needed to retrieve and insert a record?

Some more information!

Each bucket in the hash table is a balanced binary tree.
I am using the boost library's hash_combine method to create the hash from the 25 keys.
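For reference, folding 25 integer keys into one hash could look roughly like the sketch below. It is written without Boost so that it stands alone. The mixing step is modelled on the classic boost::hash_combine formula, but Boost's actual implementation differs between versions, so treat the constants and the helper names (KeySet, hash_combine_sketch, hash_keys) as assumptions.

```cpp
#include <array>
#include <cstddef>
#include <functional>

// Hypothetical key type: the 25 integer keys of one record.
using KeySet = std::array<int, 25>;

// Mixes one value into a running seed. Modelled on the classic
// boost::hash_combine formula; the constants are an assumption here,
// not a statement about what any given Boost release does.
inline void hash_combine_sketch(std::size_t& seed, int value) {
    seed ^= std::hash<int>{}(value) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
}

// Folds all 25 keys, in order, into a single std::size_t hash.
inline std::size_t hash_keys(const KeySet& keys) {
    std::size_t seed = 0;
    for (int k : keys) {
        hash_combine_sketch(seed, k);
    }
    return seed;
}
```

Two records with identical keys always get the same hash. Two records with different keys can still collide, which is why the full 25 keys have to be compared inside a bucket.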

Absolutely use the first method, because if you use the second, you will require a hash table which has 1x10^(25m) slots available, where m is the maximum length (in digits) of a key.

For example, if the largest value a key can take is 9999, m would be 4 and you'd need 1x10^100 slots in your table.

Explanation:

The idea behind a hash table is that you can randomly access any element with an efficiency of O(1) (collisions aside) because any element's hash is in fact its position in the hash table. So for example, if I hash Object X and a hash of 24 is returned (or some string hash which is converted to a number, which turns out to be 24), I simply go to slot 24 of my table (often implemented as an array), and can retrieve Object X.

But if you were using your second method (concatenating 25 numbers - we'll say digits to simplify things here - together to make the hash), the largest hash would be 9999999999999999999999999. Therefore to retrieve that object from the hash table, you'd have to retrieve it from position 9999999999999999999999999 - which means your table must have at least that many spots.

And remember, with the first one - since you're using a binary tree, collisions won't really be that big a deal. Worst case scenario will be a retrieval/insertion efficiency of O(log(n)) which isn't really that bad anyways.
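To make the tree-bucket point concrete, here is a minimal sketch of a table whose buckets are balanced trees. It rests on several assumptions that are not in the question: std::map stands in for the balanced binary tree, the hash is reduced modulo the bucket count to pick a slot, the value type is int, and the names BucketedTable, KeySet and hash_keys are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

// Hypothetical key type: the 25 integer keys of one record.
using KeySet = std::array<int, 25>;

class BucketedTable {
public:
    // bucket_count must be greater than zero.
    explicit BucketedTable(std::size_t bucket_count) : buckets_(bucket_count) {}

    // Inserts or overwrites the value stored for this key set.
    void insert(const KeySet& keys, int value) {
        buckets_[slot(keys)][keys] = value;
    }

    // Returns a pointer to the stored value, or nullptr if absent.
    // The lookup inside a bucket is a tree search, so colliding
    // entries cost O(log n) comparisons rather than O(n).
    const int* find(const KeySet& keys) const {
        const auto& bucket = buckets_[slot(keys)];
        auto it = bucket.find(keys);
        return it == bucket.end() ? nullptr : &it->second;
    }

private:
    // Assumed mixing function in the style of hash_combine.
    static std::size_t hash_keys(const KeySet& keys) {
        std::size_t seed = 0;
        for (int k : keys) {
            seed ^= std::hash<int>{}(k) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
        }
        return seed;
    }

    // Assumption: the hash is reduced modulo the number of buckets.
    std::size_t slot(const KeySet& keys) const {
        return hash_keys(keys) % buckets_.size();
    }

    // std::map orders KeySet lexicographically via std::array's operator<.
    std::vector<std::map<KeySet, int>> buckets_;
};
```

Because each bucket compares the full 25-key array, two records that land in the same slot are still told apart correctly.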
