Java HashSet using a specified method
I have a basic class 'HistoryItem' like so:
public class HistoryItem {
    private Date startDate;
    private Date endDate;
    private Info info;
    private String details;

    @Override
    public int hashCode() {
        int hash = (startDate == null ? 0 : startDate.hashCode());
        hash = hash * 31 + (endDate == null ? 0 : endDate.hashCode());
        return hash;
    }
}
I am currently using a HashSet to remove duplicates from an ArrayList based on the startDate and endDate fields, which is working correctly.
However, I also need to remove duplicates based on different fields (info and details).
My question is this.
Is there a way to specify a different method which HashSet will use in place of hashCode()? Something like this:
public int hashCode_2() {
    int hash = (info == null ? 0 : info.hashCode());
    hash = hash * 31 + (details == null ? 0 : details.hashCode());
    return hash;
}
Set<HistoryItem> removeDups = new HashSet<HistoryItem>();
removeDups.setHashMethod(hashCode_2);
Or is there another way that I should be doing this?
You can make a wrapper class around HistoryItem with different hashCode() and equals() implementations, then make a HashSet of wrappers around each item in the original set.
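A minimal sketch of that wrapper idea follows. It assumes a simplified HistoryItem whose info field is a plain String (the question's Info type is not shown), and the names InfoDetailsKey and dedupByInfoAndDetails are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class WrapperDedupDemo {

    // Simplified stand-in for the question's class (assumption: info is a String here).
    static final class HistoryItem {
        private final String info;
        private final String details;

        HistoryItem(String info, String details) {
            this.info = info;
            this.details = details;
        }

        String getInfo() { return info; }
        String getDetails() { return details; }
    }

    // Hypothetical wrapper: equality and hashing look only at info and details.
    static final class InfoDetailsKey {
        private final HistoryItem item;

        InfoDetailsKey(HistoryItem item) { this.item = item; }

        HistoryItem getItem() { return item; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof InfoDetailsKey)) return false;
            InfoDetailsKey other = (InfoDetailsKey) o;
            return Objects.equals(item.getInfo(), other.item.getInfo())
                && Objects.equals(item.getDetails(), other.item.getDetails());
        }

        @Override
        public int hashCode() {
            // Null-safe hash over the same two fields that equals() compares.
            return Objects.hash(item.getInfo(), item.getDetails());
        }
    }

    // Keeps the first item seen for each (info, details) pair, in list order.
    static List<HistoryItem> dedupByInfoAndDetails(List<HistoryItem> items) {
        Set<InfoDetailsKey> seen = new LinkedHashSet<>();
        for (HistoryItem item : items) {
            seen.add(new InfoDetailsKey(item)); // add() ignores keys already present
        }
        List<HistoryItem> result = new ArrayList<>();
        for (InfoDetailsKey key : seen) {
            result.add(key.getItem());
        }
        return result;
    }

    public static void main(String[] args) {
        List<HistoryItem> items = Arrays.asList(
            new HistoryItem("login", "ok"),
            new HistoryItem("login", "ok"),      // duplicate on (info, details)
            new HistoryItem("login", "failed"));
        System.out.println(dedupByInfoAndDetails(items).size()); // prints 2
    }
}
```

The original hashCode() on HistoryItem stays untouched, so the existing startDate/endDate de-duplication keeps working alongside this one.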
A couple of things. First and foremost, you MUST override equals() if you are going to override hashCode(). This is important. Second, if you are dealing with different fields, then you should probably have a different HashSet for each field. So you can iterate over the list like so:
// assumes Info implements equals() and hashCode()
Set<Info> info = new HashSet<Info>();
Set<String> details = new HashSet<String>();
for (HistoryItem h : items) {
    if (info.contains(h.getInfo())) {
        // this is a dup
    }
    if (details.contains(h.getDetails())) {
        // this is a dup
    }
    info.add(h.getInfo());
    details.add(h.getDetails());
}
I ended up using GNU Trove for this.
Minimal code change was required: a new class implementing TObjectHashingStrategy (containing computeHashCode and equals methods).
public class HistoryItemDuplicateInfo
        implements TObjectHashingStrategy<HistoryItem> {
    @Override
    public int computeHashCode(HistoryItem obj) {
        ...
    }

    @Override
    public boolean equals(HistoryItem arg0, HistoryItem arg1) {
        ...
    }
}
Then use the THashSet object with a specified strategy for removing the duplicates.
THashSet<HistoryItem> hs = new THashSet<HistoryItem>(new HistoryItemDuplicateInfo());
Hope this is able to help someone out in future.
You could remove the duplicates using a java.util.TreeSet with a custom Comparator that takes your info and details fields into account.
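Roughly, that could look like the sketch below. It assumes Java 8 or later and a simplified HistoryItem whose info is a String (the real Info type would need to be Comparable or have its own Comparator); the class and constant names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TreeSetDedupDemo {

    // Simplified stand-in for the question's class (assumption: info is a String here).
    static final class HistoryItem {
        private final String info;
        private final String details;

        HistoryItem(String info, String details) {
            this.info = info;
            this.details = details;
        }

        String getInfo() { return info; }
        String getDetails() { return details; }
    }

    // Null-tolerant String ordering: null sorts before any non-null value.
    static final Comparator<String> NULL_SAFE =
        Comparator.nullsFirst(Comparator.naturalOrder());

    // Two items are "the same" for the TreeSet when this comparator returns 0,
    // i.e. when both info and details match.
    static final Comparator<HistoryItem> BY_INFO_AND_DETAILS =
        Comparator.comparing(HistoryItem::getInfo, NULL_SAFE)
                  .thenComparing(HistoryItem::getDetails, NULL_SAFE);

    // Returns one item per (info, details) pair, sorted by the comparator.
    static List<HistoryItem> dedup(List<HistoryItem> items) {
        Set<HistoryItem> unique = new TreeSet<>(BY_INFO_AND_DETAILS);
        unique.addAll(items); // later items that compare equal are dropped
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<HistoryItem> items = Arrays.asList(
            new HistoryItem("login", "ok"),
            new HistoryItem("login", "ok"),      // duplicate on (info, details)
            new HistoryItem("login", "failed"));
        System.out.println(dedup(items).size()); // prints 2
    }
}
```

Note that the result comes back in comparator order rather than in the original list order, which may or may not matter for your use case.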
I would suggest that you:
- use a long for a date instead of a Date object.
- use just a Set if you want to avoid duplicates. Why are you using a List at all? If you need to retain an order, use a SortedSet like TreeSet, or a Set which retains insertion order like LinkedHashSet.
- consider whether your HistoryItem can be valid with null fields. Can you structure your fields so they are never null?
- make the fields which make up hashCode/equals/compareTo immutable. Can those fields be final? If not, why not?
HashSet is hardcoded to use hashCode() and equals(). You could implement your own HashSet-like class, possibly by ruthlessly duplicating Java's own source code, but that's plain ugly, contradicts any decent set of software development rules, and is possibly illegal with regard to Java's source code license (this depends on the actual JDK, e.g. Sun/Oracle's JDK vs OpenJDK).
You can do things with TreeSet, though. TreeSet normally uses the compareTo() method of the elements, not hashCode() or equals(). Moreover, a TreeSet instance can be built with a custom Comparator instance, which is then invoked to do comparisons, leaving you free to have your own rules. A compareTo() method (or a Comparator.compare() method) must implement an order, which may be a bit trickier than a simple hashCode()-and-equals(), but this is usually not hard either. TreeSet is sometimes said to be slower than HashSet, but the actual difference is slight and it takes a very specific situation to actually be able to notice that difference in any way.
Conceptually, there could be a hash equivalent of Comparator for HashSet: an interface HasherAndEqualizer with int hashCode(Object obj) and boolean equals(Object obj1, Object obj2) methods. Sun did not see fit to include such an interface; I do not know why. Possibly they did not think it would be useful. The "GNU Trove" library that you cite in another answer provides such an interface.
Alternatively, you can always use wrappers. Instead of storing HistoryItem instances in your secondary set, you can store HistoryItemWrapper instances, each linking to an actual HistoryItem and providing the hashCode()/equals() methods you need for that set.
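The two ideas can also be combined: a generic wrapper that delegates to a HasherAndEqualizer-style strategy. The sketch below is only an illustration. The interface is the hypothetical one described above (it is not part of the JDK, and it is not Trove's API), the Key and distinct names are invented, and HistoryItem is reduced to two String fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class HashingStrategyDemo {

    // Hypothetical interface, the hash counterpart of Comparator. Not in the JDK.
    interface HasherAndEqualizer<T> {
        int hashCode(T obj);
        boolean equals(T a, T b);
    }

    // Generic wrapper whose hashCode()/equals() delegate to the strategy.
    // Assumes all keys stored in one set share the same strategy.
    static final class Key<T> {
        private final T value;
        private final HasherAndEqualizer<T> strategy;

        Key(T value, HasherAndEqualizer<T> strategy) {
            this.value = value;
            this.strategy = strategy;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            @SuppressWarnings("unchecked")
            Key<T> other = (Key<T>) o;
            return strategy.equals(value, other.value);
        }

        @Override
        public int hashCode() { return strategy.hashCode(value); }
    }

    // Keeps the first element of each equivalence class, in list order.
    static <T> List<T> distinct(List<T> items, HasherAndEqualizer<T> strategy) {
        Set<Key<T>> seen = new LinkedHashSet<>();
        for (T item : items) {
            seen.add(new Key<>(item, strategy));
        }
        List<T> result = new ArrayList<>();
        for (Key<T> key : seen) {
            result.add(key.value);
        }
        return result;
    }

    // Simplified stand-in for the question's class (assumption: info is a String here).
    static final class HistoryItem {
        final String info;
        final String details;

        HistoryItem(String info, String details) {
            this.info = info;
            this.details = details;
        }
    }

    // Strategy that treats items with the same info and details as duplicates.
    static final HasherAndEqualizer<HistoryItem> BY_INFO_AND_DETAILS =
        new HasherAndEqualizer<HistoryItem>() {
            @Override
            public int hashCode(HistoryItem h) {
                return Objects.hash(h.info, h.details);
            }

            @Override
            public boolean equals(HistoryItem a, HistoryItem b) {
                return Objects.equals(a.info, b.info)
                    && Objects.equals(a.details, b.details);
            }
        };

    public static void main(String[] args) {
        List<HistoryItem> items = Arrays.asList(
            new HistoryItem("login", "ok"),
            new HistoryItem("login", "ok"),      // duplicate on (info, details)
            new HistoryItem("login", "failed"));
        System.out.println(distinct(items, BY_INFO_AND_DETAILS).size()); // prints 2
    }
}
```

With this shape, de-duplicating on another set of fields only needs another strategy object, not another wrapper class.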