Hash Table Bucket Array
If I have 50,000 entries and, say, 100,000 slots available in a hash table, what would be the best way to choose a suitable bucket array size for each index (not using LinkedLists) so that the array would never 'overflow'? Would 30% excess be suitable?
If you are using a fixed-size array for your buckets, then no bucket size less than 50,000 can guarantee never overflowing, unless you have additional information about the distribution of the 50,000 keys (e.g., if you knew they were exactly the integers 1..50,000, it would be trivial).
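For instance, here is a minimal sketch of that trivial case, assuming the keys really are the integers 1..50,000 (the class name and API are hypothetical, just for illustration):

```java
// Toy sketch of the trivial case: if the keys are known to be exactly
// the integers 1..50,000, the key itself is a perfect index, so every
// "bucket" needs room for only one entry.
public class DirectAddressTable {
    private static final int CAPACITY = 50_000;
    private final String[] slots = new String[CAPACITY];

    // Hypothetical API: key must be in [1, 50_000].
    public void put(int key, String value) {
        slots[key - 1] = value;   // maps 1..50,000 onto 0..49,999
    }

    public String get(int key) {
        return slots[key - 1];
    }
}
```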
But generally you don't want to rely on large buckets, because searching a bucket is O(n). Instead, it's better to use a variable-sized table with variable-sized buckets: each bucket can simply be an array that doubles in size whenever it fills, and similarly the table can double in size whenever it reaches 90% load. This is a standard approach; a sketch follows.
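Something along these lines, assuming String keys for brevity (the class and method names are made up for this sketch):

```java
import java.util.Arrays;

// Sketch: each bucket is a plain array that doubles when it fills,
// and the whole table doubles (and rehashes) once it is 90% full.
public class ResizingHashTable {
    private static final int INITIAL_BUCKET_CAPACITY = 4;
    private static final double MAX_LOAD_FACTOR = 0.9;

    private String[][] buckets;   // buckets[i] holds the keys hashed to slot i
    private int[] bucketSizes;    // number of keys actually stored in each bucket
    private int size;             // total keys in the table

    public ResizingHashTable(int tableSize) {
        buckets = new String[tableSize][];
        bucketSizes = new int[tableSize];
    }

    public void add(String key) {
        if (size + 1 > MAX_LOAD_FACTOR * buckets.length) {
            resizeTable(buckets.length * 2);   // double the table at 90% full
        }
        insert(key);
        size++;
    }

    public boolean contains(String key) {
        int i = indexFor(key, buckets.length);
        for (int j = 0; j < bucketSizes[i]; j++) {
            if (buckets[i][j].equals(key)) return true;
        }
        return false;
    }

    private void insert(String key) {
        int i = indexFor(key, buckets.length);
        if (buckets[i] == null) {
            buckets[i] = new String[INITIAL_BUCKET_CAPACITY];
        } else if (bucketSizes[i] == buckets[i].length) {
            // Bucket is full: double it rather than overflow.
            buckets[i] = Arrays.copyOf(buckets[i], buckets[i].length * 2);
        }
        buckets[i][bucketSizes[i]++] = key;
    }

    private void resizeTable(int newSize) {
        String[][] old = buckets;
        int[] oldSizes = bucketSizes;
        buckets = new String[newSize][];
        bucketSizes = new int[newSize];
        for (int i = 0; i < old.length; i++) {
            for (int j = 0; j < oldSizes[i]; j++) {
                insert(old[i][j]);   // rehash every key into the larger table
            }
        }
    }

    private static int indexFor(String key, int tableSize) {
        return (key.hashCode() & 0x7fffffff) % tableSize;
    }
}
```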
As mentioned by the previous posters, most list implementations, whether backed by arrays or linked lists, automatically reallocate storage for you when the list gets full.
Some languages support dynamically sized arrays, so there is no need to declare the array size up front; the data determines the size at runtime. And even languages that require a declared array size usually provide a dynamic array type.
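In Java, for example, `java.util.ArrayList` is such a dynamic array; a minimal demonstration:

```java
import java.util.ArrayList;
import java.util.List;

public class DynamicArrayDemo {
    public static void main(String[] args) {
        // The list grows its backing array automatically; no size is declared.
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 50_000; i++) {
            values.add(i);   // reallocation happens behind the scenes when full
        }
        System.out.println(values.size());   // 50000
    }
}
```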
If you know the keys a priori, you can compute a minimal perfect hash. Therefore, a bucket size of one is sufficient if you know the keys and can tailor the hash function.
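A toy illustration of the idea (not a real minimal-perfect-hash construction like CHD, BDZ, or gperf, which scale far better): for a small known key set, simply search for a seed that makes a salted hash collision-free with exactly one slot per key. The class and helper names here are hypothetical:

```java
import java.util.Arrays;

// Toy sketch: brute-force a seed so that a seeded hash maps each known key
// to a distinct slot in a table of exactly keys.length slots, i.e. a
// minimal perfect hash for this fixed key set.
public class ToyPerfectHash {
    static int slot(String key, int seed, int tableSize) {
        int h = seed;
        for (int i = 0; i < key.length(); i++) {
            h = (h * 0x01000193) ^ key.charAt(i);   // FNV-1a-style mixing, seeded
        }
        return (h & 0x7fffffff) % tableSize;
    }

    static int findSeed(String[] keys) {
        int m = keys.length;                 // one slot per key: bucket size 1
        boolean[] used = new boolean[m];
        for (int seed = 0; seed < 1_000_000; seed++) {
            Arrays.fill(used, false);
            boolean ok = true;
            for (String k : keys) {
                int s = slot(k, seed, m);
                if (used[s]) { ok = false; break; }   // collision: try next seed
                used[s] = true;
            }
            if (ok) return seed;
        }
        throw new IllegalStateException("no seed found; widen the search");
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "cherry", "date"};
        int seed = findSeed(keys);
        for (String k : keys) {
            System.out.println(k + " -> slot " + slot(k, seed, keys.length));
        }
    }
}
```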
If you do not know the keys in advance, or know the keys but cannot alter the hash function, then it is possible for an adversary to select a worst-case set of keys (i.e., keys that all hash to the same bucket). To guarantee no bucket overflows in that case, each bucket would need to be able to hold all 50,000 keys. If you are willing to tolerate some chance of overflow, more sophisticated analysis can select a bucket size that covers a preponderance of situations; a rough estimate of that chance is sketched below.
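One back-of-the-envelope sketch, assuming keys hash uniformly and independently: upper-bound the probability that any of the m buckets receives more than c of the n keys by a union bound over the binomial tail P[Bin(n, 1/m) > c]. The class name and the cutoff at k = 60 are choices made for this sketch:

```java
// Rough estimate, assuming uniform independent hashing: upper-bounds the
// probability that ANY of m buckets receives more than c of the n keys,
// via a union bound over the binomial tail P[Bin(n, 1/m) > c].
public class BucketOverflowEstimate {
    // log of n-choose-k
    static double logChoose(int n, int k) {
        return logFactorial(n) - logFactorial(k) - logFactorial(n - k);
    }

    static double logFactorial(int n) {
        double s = 0;
        for (int i = 2; i <= n; i++) s += Math.log(i);
        return s;
    }

    // P[Bin(n, p) = k], computed in log space to avoid underflow
    static double binomPmf(int n, int k, double p) {
        return Math.exp(logChoose(n, k) + k * Math.log(p) + (n - k) * Math.log1p(-p));
    }

    public static void main(String[] args) {
        int n = 50_000;      // entries
        int m = 100_000;     // buckets
        for (int c = 4; c <= 12; c += 2) {
            double tail = 0;   // P[a single bucket gets more than c keys]
            for (int k = c + 1; k <= 60; k++) {
                // terms beyond k = 60 are numerically negligible here
                tail += binomPmf(n, k, 1.0 / m);
            }
            System.out.printf("capacity %2d: P[any overflow] <= %.3g%n",
                    c, Math.min(1.0, m * tail));   // union bound, capped at 1
        }
    }
}
```

For these parameters (mean load 0.5 keys per bucket), the bound suggests a fixed per-bucket capacity of around 10 already pushes the overflow probability below roughly one in a million, versus the 50,000 needed for an absolute guarantee.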