Why HashSet internally implemented as HashMap [duplicate]
Possible Duplicate:
Why does HashSet implementation in 开发者_如何学CSun Java use HashMap as its backing?
I know what a hashset and hashmap is - pretty well versed with them. There is 1 thing which really puzzled me.
Example:
Set <String> testing= new HashSet <String>();
Now if you debug it using eclipse right after the above statements, under debugger variables tab, you will noticed that the set 'testing' internally is implemented as a hashmap.
Why does it need a hashmap since there is no key,value pair involved in sets collection
It's an implementation detail. The HashMap
is actually used as the backing store for the HashSet
. From the docs:
This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.
(emphasis mine)
The answer is right in the API docs
"This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.
This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets. Iterating over this set requires time proportional to the sum of the HashSet instance's size (the number of elements) plus the "capacity" of the backing HashMap instance (the number of buckets). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important."
So you don't even need the debugger to know this.
In answer to your question: it is an implementation detail. It doesn't need to use a HashMap, but it is probably just good code re-use. If you think about it, in this case the only difference is that a Set has different semantics from a Map. Namely, maps have a get(key) method, and Sets do not. Sets do not allow duplicates, Maps allow duplicate values, but they must be under different keys.
It is probably really easy to use a HashMap as the backing of a HashSet, because all you would have to do would be to use hashCode (defined on all objects) on the value you are putting in the Set to determine if a dupe, i.e., it is probably just doing something like
backingHashMap.put(toInsert.hashCode(), toInsert);
to insert items into the Set.
In most cases the Set is implemented as wrapper for the keySet() of a Map. This avoids duplicate implementations. If you look at the source you will see how it does this.
You might find the method Collections.newSetFromMap() which can be used to wrap ConcurrentHashMap for example.
The very first sentence of the class's Javadoc states that it is backed by a HashMap:
This class implements the Set interface, backed by a hash table (actually a HashMap instance).
If you'll look at the source code of HashSet you'll see that what it stores in the map is as the key is the entry you are using, and the value is a mere marker Object
(named PRESENT
).
Why is it backed by a HashMap? Because this is the simplest way to store a set of items in a (conceptual) hashtable and there is no need for HashSet to re-invent an implementation of a hashtable data structure.
It's just a matter of convenience that the standard Java class library implements HashSet
using a HashMap
-- they only need to implement one data structure and then HashSet
stores its data in a HashMap
with the actual set objects as the key and a dummy value (typically Boolean.TRUE
) as the value.
HashMap has already all the functionality that HashSet requires. There would be no sense to duplicate the same algorithms.
it allows you to easily and quickly determine whether an object is already in the set or not.
精彩评论