
hash table question

An interviewer asked me what the disadvantages of a hash table were. He hinted that a hash table takes a lot of space from initialization: that is, we need to pre-allocate memory for the hash table's buckets even if we don't actually need that much memory, because we don't have that many entries.

Is this reasonable?

I ask because I checked Wikipedia, and this disadvantage isn't discussed in the article.

Thanks!


It depends on the implementation. One way to implement a hash table is to make the initial table fairly small and, if the load factor (the ratio of used elements to available slots) rises above a threshold, increase the table size (there are a few ways of doing this, all detailed in the Wikipedia article you mentioned).
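As a rough illustration of that scheme, here is a minimal sketch in C of a separate-chaining table that starts small and doubles when the load factor would cross 0.75. All the names (ht_insert, ht_resize, etc.) are made up for this example, and strdup is assumed available (POSIX, standard in C23):

    #include <stdlib.h>
    #include <string.h>

    /* A node in one bucket's chain (separate chaining). */
    typedef struct node {
        char *key;
        int value;
        struct node *next;
    } node;

    typedef struct {
        node **buckets;
        size_t nbuckets;  /* slots in the bucket array */
        size_t nentries;  /* key/value pairs actually stored */
    } hashtable;

    static size_t hash_str(const char *s) {
        size_t h = 5381;                  /* djb2 string hash */
        while (*s) h = h * 33 + (unsigned char)*s++;
        return h;
    }

    hashtable *ht_new(void) {
        hashtable *ht = malloc(sizeof *ht);
        ht->nbuckets = 8;                 /* deliberately small; grows on demand */
        ht->nentries = 0;
        ht->buckets = calloc(ht->nbuckets, sizeof(node *));
        return ht;
    }

    /* Double the bucket array and rehash every entry into it. */
    static void ht_resize(hashtable *ht) {
        size_t newsize = ht->nbuckets * 2;
        node **newbuckets = calloc(newsize, sizeof(node *));
        for (size_t i = 0; i < ht->nbuckets; i++) {
            for (node *n = ht->buckets[i]; n != NULL; ) {
                node *next = n->next;
                size_t j = hash_str(n->key) % newsize;
                n->next = newbuckets[j];
                newbuckets[j] = n;
                n = next;
            }
        }
        free(ht->buckets);
        ht->buckets = newbuckets;
        ht->nbuckets = newsize;
    }

    void ht_insert(hashtable *ht, const char *key, int value) {
        /* Grow once the load factor (entries / slots) would exceed 0.75. */
        if (ht->nentries + 1 > ht->nbuckets * 3 / 4)
            ht_resize(ht);
        size_t i = hash_str(key) % ht->nbuckets;
        node *n = malloc(sizeof *n);
        n->key = strdup(key);
        n->value = value;
        n->next = ht->buckets[i];
        ht->buckets[i] = n;
        ht->nentries++;
    }

With this approach the table only ever pre-allocates a small bucket array, so the "lots of wasted memory at initialization" problem largely goes away.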

The situation you mention could certainly arise under certain conditions (a big initial table size with very few elements inserted), but it would most likely be the result of a poor data structure choice.


Depending on how you implement the hash table and how many buckets there are initially, it could be a reasonable disadvantage. Hash tables need about half (or more) of the buckets to be empty, or else collisions become much more likely. All of the buckets are empty initially, and after you add items, most implementations will expand the number of buckets so that at least half remain free. This means you have O(n) empty buckets.

Whether or not this matters depends on how many items you have and how big the buckets are. If the buckets are structs, they could potentially be quite large, since they would need to store a hash value along with pointers to the key and value (if not the actual key and value). More commonly, the buckets are pointers to containers that store the hash and pointers to the key and value. The size of each bucket then depends on the size of a pointer, which is almost always 32 or 64 bits (unless you're using an embedded processor).
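To make the per-bucket cost concrete, here is a sketch of the two layouts described above; the struct names are made up for illustration:

    #include <stdio.h>

    /* "Fat" bucket: the table stores hash + key/value pointers inline. */
    struct fat_bucket {
        size_t hash;
        void  *key;
        void  *value;
    };

    /* "Thin" bucket: the table stores only a pointer; the entry itself
       (hash + key/value pointers) lives in a separately allocated node. */
    struct entry {
        size_t hash;
        void  *key;
        void  *value;
    };
    typedef struct entry *thin_bucket;

    int main(void) {
        /* Typically 12 vs 4 bytes on a 32-bit system; 24 vs 8 on 64-bit. */
        printf("inline bucket:  %zu bytes\n", sizeof(struct fat_bucket));
        printf("pointer bucket: %zu bytes\n", sizeof(thin_bucket));
        return 0;
    }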

So assuming the best case of 4 bytes per bucket, you'd end up using 4 megabytes of memory for a hash table with 500,000 objects (remember: around half of the buckets are empty, so that's a million buckets). Also figure that each of those half million used buckets has a node with pointers to the actual data. That would use another 12 bytes per value (although with memory alignment constraints, it's more like 16 bytes). That would be another 8 MB, not including any of the actual data!
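A back-of-the-envelope version of that arithmetic, under the same assumptions (32-bit pointers, roughly half the buckets empty, 16-byte nodes after alignment); the figures are illustrative:

    #include <stdio.h>

    int main(void) {
        long entries   = 500000;        /* stored key/value pairs */
        long buckets   = entries * 2;   /* keep roughly half the slots empty */
        long bucket_sz = 4;             /* one 32-bit pointer per slot */
        long node_sz   = 16;            /* hash + key/value pointers, aligned */

        printf("bucket array: %ld MB\n", buckets * bucket_sz / 1000000); /* 4 MB */
        printf("entry nodes:  %ld MB\n", entries * node_sz / 1000000);   /* 8 MB */
        return 0;
    }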

On the other hand, most data structures have a large memory overhead. A binary search tree has four pointers per node (one for the key, one for the value, and two for the child nodes). At 16 bytes per node on a 32-bit system, this is comparable to a hash table (at least within an order of magnitude).
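For comparison, here is a sketch of the node layout being counted there (field names are illustrative): four pointers, hence 16 bytes with 4-byte pointers.

    /* One BST node: four pointers = 16 bytes with 4-byte (32-bit) pointers. */
    struct tree_node {
        void *key;
        void *value;
        struct tree_node *left;
        struct tree_node *right;
    };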

If all you are storing is chars, the overhead of any of these data structures could be large compared to the actual data, but in practice it shouldn't be too much of a problem unless you're working with gigantic data sets or a horribly inefficient hash table implementation.
