Why is HashSet internally implemented as a HashMap? [duplicate]

2023-02-05 04:45

This question already has answers here: Closed 12 years ago.

Possible Duplicate:
Why does HashSet implementation in Sun Java use HashMap as its backing?

I know what a HashSet and a HashMap are, and I am pretty well versed with them. There is one thing that really puzzles me.

Example:

Set<String> testing = new HashSet<String>();

Now if you debug it in Eclipse right after the above statement, then under the debugger's Variables tab you will notice that the set 'testing' is internally implemented as a HashMap.

Why does it need a HashMap, since there is no key-value pair involved in a Set collection?

It's an implementation detail. The HashMap is actually used as the backing store for the HashSet. From the docs:

This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.

(emphasis mine)

The answer is right in the API docs:

"This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.

This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets. Iterating over this set requires time proportional to the sum of the HashSet instance's size (the number of elements) plus the "capacity" of the backing HashMap instance (the number of buckets). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important."

So you don't even need the debugger to know this.

In answer to your question: it is an implementation detail. It doesn't need to use a HashMap, but it is probably just good code reuse. If you think about it, the only difference in this case is that a Set has different semantics from a Map. Namely, Maps have a get(key) method and Sets do not. Sets do not allow duplicates; Maps allow duplicate values, but they must be under different keys.
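To make the difference in semantics concrete, here is a minimal sketch using only the standard java.util classes. The class name SetVsMapDemo and its helper methods are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SetVsMapDemo {
    // Hypothetical helper: adds the same element twice and reports
    // whether the second add() changed the set.
    static boolean secondAddAccepted(String element) {
        Set<String> set = new HashSet<>();
        set.add(element);        // first insertion succeeds
        return set.add(element); // a duplicate is rejected, so this is false
    }

    // Hypothetical helper: a map may hold the same value under different keys,
    // but putting an existing key replaces its value.
    static Map<String, Integer> sampleMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 1); // duplicate value, different key: allowed
        map.put("a", 2); // same key: the old value 1 is replaced by 2
        return map;
    }

    public static void main(String[] args) {
        System.out.println(secondAddAccepted("x")); // false
        System.out.println(sampleMap().size());     // 2
        System.out.println(sampleMap().get("a"));   // 2
    }
}
```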

It is probably really easy to use a HashMap as the backing of a HashSet, because a HashMap already rejects duplicate keys by using hashCode() and equals() (defined on all objects). All the Set would have to do is store each element as a key alongside a dummy value. Keying on the hash code itself would not work, since two distinct objects can share a hash code. So it is probably just doing something like

backingHashMap.put(toInsert, DUMMY_VALUE);

to insert items into the Set.
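One detail is worth checking when sketching such an insert: what the map is keyed on. Distinct objects can share a hash code (the strings "Aa" and "BB" both hash to 2112), so a map keyed on hashCode() alone would silently drop one of them, while a set keyed on the element itself keeps both. The class and helper names below are hypothetical and only illustrate the point.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class HashCodeKeyDemo {
    // Hypothetical helper: stores each item under its hash code.
    static Map<Integer, String> keyedByHashCode(String... items) {
        Map<Integer, String> map = new HashMap<>();
        for (String item : items) {
            // An item whose hash code collides with an earlier one overwrites it.
            map.put(item.hashCode(), item);
        }
        return map;
    }

    // Stores the items themselves in a HashSet, which compares elements
    // with hashCode() and then equals().
    static Set<String> keyedByElement(String... items) {
        Set<String> set = new HashSet<>();
        for (String item : items) {
            set.add(item);
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(keyedByHashCode("Aa", "BB").size()); // 1
        System.out.println(keyedByElement("Aa", "BB").size());  // 2
    }
}
```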

In most cases a Set is implemented as a wrapper around the keySet() of a Map. This avoids duplicate implementations. If you look at the source, you will see how it does this.

You might find the method Collections.newSetFromMap() useful; it can be used to wrap a ConcurrentHashMap, for example.
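As a small sketch of that approach, the following wraps a ConcurrentHashMap to obtain a thread-safe Set. Collections.newSetFromMap expects an empty map whose value type is Boolean; the class name ConcurrentSetDemo and the helper newConcurrentSet are invented for this example.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentSetDemo {
    // Hypothetical helper: builds a Set view backed by a ConcurrentHashMap.
    // The map passed in must be empty, and it should not be accessed
    // directly afterwards.
    static Set<String> newConcurrentSet() {
        return Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
    }

    public static void main(String[] args) {
        Set<String> set = newConcurrentSet();
        System.out.println(set.add("a"));      // true
        System.out.println(set.add("a"));      // false, already present
        System.out.println(set.contains("a")); // true
        System.out.println(set.size());        // 1
    }
}
```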

The very first sentence of the class's Javadoc states that it is backed by a HashMap:

This class implements the Set interface, backed by a hash table (actually a HashMap instance).

If you look at the source code of HashSet, you'll see that it stores each element as a key in the map, and the value is a mere marker Object (named PRESENT).
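The idea can be sketched in a few lines. This is not the JDK source, only a simplified illustration of a set that delegates to a HashMap; the class name MiniHashSet is made up, and only a handful of operations are shown.

```java
import java.util.HashMap;
import java.util.Iterator;

class MiniHashSet<E> implements Iterable<E> {
    // Shared dummy value stored under every key; only the keys matter.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // put() returns the previous value, so null means the key was new.
    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    // remove() returns the old value, which is PRESENT if the key existed.
    public boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    public boolean contains(Object o) {
        return map.containsKey(o);
    }

    public int size() {
        return map.size();
    }

    // Iterating the set is just iterating the map's keys.
    @Override
    public Iterator<E> iterator() {
        return map.keySet().iterator();
    }

    public static void main(String[] args) {
        MiniHashSet<String> set = new MiniHashSet<>();
        System.out.println(set.add("a"));      // true
        System.out.println(set.add("a"));      // false
        System.out.println(set.contains("a")); // true
        System.out.println(set.remove("a"));   // true
        System.out.println(set.size());        // 0
    }
}
```

Every set operation here is a one-line call into the map, which is the code reuse the answers describe.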

Why is it backed by a HashMap? Because this is the simplest way to store a set of items in a (conceptual) hashtable and there is no need for HashSet to re-invent an implementation of a hashtable data structure.

It's just a matter of convenience that the standard Java class library implements HashSet using a HashMap: they only need to implement one data structure, and HashSet then stores its data in a HashMap with the actual set objects as the keys and a dummy value as the value (a shared marker Object in the Sun/OpenJDK source; Collections.newSetFromMap uses Boolean.TRUE).

HashMap already has all the functionality that HashSet requires. It would make no sense to duplicate the same algorithms.

Using a HashMap allows you to easily and quickly determine whether an object is already in the set or not.
