Using char[] array in HashMap in java
I am using a HashMap with a ch开发者_高级运维ar array as the key. But when I put the key, value into the HashMap and print it the value printed as the key is some strange value. If I use a string instead, it is working fine. Please let me know how to make this work for char[] values.
while((ch = br.read()) != -1) {
s = new char[10];
s[i++] = (char)ch;
while( (i < 10) && ((ch = br.read()) != -1)) {
s[i++] = (char)ch;
}
//System.out.println(s);
hmap.put(s, 1);
//System.out.println(hmap);
i = 0;
}
Key: BZOUA1578L Value: 1
The contents in the hashmap are {[C@31ff23=1} instead of {BZOUA1578L, 1}
I have a huge GBs of data to be read and if I use String, I am running out of the heap size.
You will have to think of a different way then.
You could use consider using a
TreeMap<char[], V>
with a customComparator
... but this is a stop-gap measure.You could implement a custom hash table-based
Map
class that usedchar[]
as the key, but did the array hashing and equality tests without using the key object'sequals(Object)
andhashcode()
methods. This also is a stop-gap measure.You could get a bigger machine ... another stop-gap measure.
You could redesign the algorithm so that it doesn't have to put all of the data into a big in-memory hash table in the address space of one Java program.
What you are currently doing doesn't scale. Even if you manage to implement a map using char[]
keys instead of String
keys, you probably only reduce the space used to hold the keys by half. The best this is going to give you is the ability to handle roughly twice the number of keys you can currently handle. Then you hit the wall again.
In the long term, the last option is the best if you want to keep scaling up.
Incidentally @Sean Patrick Floyd's suggestion of interning the key strings will probably only make matters worse. First, you won't ever get Strings that are equal but not == as keys in the same map. So interning doesn't save anything. Second, interning is performed using a private hash table, and the JVM needs to allocate space to represent this table.
The only scenario where interning is potentially worthwhile is if the strings that you are using to to do map lookups are likely to survive multiple GC cycles. Only then could interning save space.
Finally, there's one scenario which may cause your key strings to use far more memory than you realize. Consider this:
BufferedReader br = ...
Map<String, Value> map = new HashMap<String, Value>();
String line;
while ((line = br.readLine()) != null) {
...
String key = line.substring(...);
map.put(key, ...);
}
The problem is that the substring
method shares the same backing char[]
as the original String. If key
is long-lived (which it probably will be), this means that the original large backing array will also be long lived, even though we're only ever going to refer to a slice of that array it via the key
object.
The solution is to write this:
String key = new String(line.substring(...));
which forces the copying of the characters into a new (smaller) character array.
UPDATE - Changes to the implementation of java.lang.String
in Java 7 addressed this problem. The substring
methods now make a copy of the relevant slice of the backing array.
Printing is the least of your poblems, really. The real problem is that char[] does not have meaningful implementation of the equals()
and hashCode()
methods that HashMap
relies on. Therefore the map will not recognize different arrays with the same content as the same keys, which in all probablity defeats your purpose of using a hash map in the first place.
So you need to wrap your arrays in something that does have equals()
and hashCode()
. May I suggest wrapping them in java.lang.String
?
That's because arrays don't have a toString() method. What you are seeing is the Output of Object.toString():
return getClass().getName() + "@" + Integer.toHexString(hashCode());
In your case:
[C = internal name of char[].class
@ = "@"
31ff23 = your array's identity hash code as hey String
=1 = the value
BTW, they don't have a hashCode() implementation either, so they make bad keys for HashMaps (see this similar question about using byte arrays as HashMap keys).
char[]
's .toString()
method (which is used when printing map contents) is not implemented to display the contents of the array - it uses the default toString()
implementation which includes the class name ([C
in this case) followed by the hashcode, which is typically the (hex) memory address in the JVM
You shouldn't use arrays as keys in HashMap
, because they don't implement hashCode()
and equals()
based on the elements. Prefer String
.
You trying to use Char Array as Key, which just is a Object reference and its just is a Hashcode
value which you may not be giving while fetching data from map. so use Char[] array elements rather,
which you can keep track of, to fetch data later..
It's not possible to make this work for char arrays, because HashMap
is based on the hashCode()
and equals()
methods of the key objects. And arrays inherit the implementation of Object
, which is based on identity rather than content - and you can't override them either. What you see as the "contents" of the map is the output of toString()
, which for arrays is also based on object identity rather than contet.
Use Strings, that's what they're for. Pretty much all Java APIs are designed to work with Strings, while char arrays are rarely used directly.
精彩评论