How to de-dupe a List of Objects?
A Rec object has a member variable called tag, which is a String.
If I have a List of Recs, how could I de-dupe the list based on the tag member variable?
I just need to make sure that the List contains only one Rec with each tag value.
Something like the following, but I'm not sure what the best algorithm is for keeping track of counts, etc.:
private List<Rec> deDupe(List<Rec> recs) {
    for(Rec rec : recs) {
        // How to check whether rec.tag exists in another Rec in this List
        // and delete any duplicates from the List before returning it to
        // the calling method?
    }
    return recs;
}
Store it temporarily in a HashMap<String,Rec>.
Create a HashMap<String,Rec>. Loop through all of your Rec objects. For each one, if the tag already exists as a key in the HashMap, then compare the two and decide which one to keep. If not, then put it in.
When you're done, the HashMap.values() method will give you all of your unique Rec objects.
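A minimal sketch of the map-based approach described above. The Rec class here is a hypothetical stand-in (only the tag field comes from the question; the id field is made up for illustration), and "keep the first Rec seen for each tag" is an assumed policy, since the answer leaves that decision open. It also swaps in LinkedHashMap for HashMap so the result keeps first-seen order, which a plain HashMap does not promise.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MapDeDupe {
    // Hypothetical stand-in for the Rec class from the question.
    static class Rec {
        final String tag;
        final int id; // illustrative extra field, not in the original

        Rec(String tag, int id) {
            this.tag = tag;
            this.id = id;
        }
    }

    static List<Rec> deDupe(List<Rec> recs) {
        // Keyed by tag; LinkedHashMap iterates in insertion order.
        Map<String, Rec> byTag = new LinkedHashMap<>();
        for (Rec rec : recs) {
            // Assumed policy: the first Rec with a given tag wins.
            // Replace this line with your own comparison to keep a different one.
            byTag.putIfAbsent(rec.tag, rec);
        }
        return new ArrayList<>(byTag.values());
    }
}
```

If you would rather keep the last Rec for each tag, a plain byTag.put(rec.tag, rec) does that instead.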
Try this:
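import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

private List<Rec> deDupe(List<Rec> recs) {
    Set<String> tags = new HashSet<String>();
    List<Rec> result = new ArrayList<Rec>();
    for(Rec rec : recs) {
        // Only keep a Rec whose tag has not been seen yet
        if(!tags.contains(rec.tag)) {
            result.add(rec);
            tags.add(rec.tag);
        }
    }
    return result;
}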
This checks each Rec against a Set of tags. If the set contains the tag already, it is a duplicate and we skip it. Otherwise we add the Rec to our result and add the tag to the set.
This becomes easier if Rec implements .equals based on its tag value. Then you could write something like:
private List<Rec> deDupe( List<Rec> recs )
{
    List<Rec> retList = new ArrayList<Rec>( recs.size() );
    for ( Rec rec : recs )
    {
        if (!retList.contains(rec))
        {
            retList.add(rec);
        }
    }
    return retList;
}
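For reference, here is roughly what a tag-based equals could look like. This is a sketch under assumptions: the Rec class below is hypothetical (the payload field is invented to show that other fields are ignored), and hashCode is overridden alongside equals so the class also behaves correctly in hash-based collections.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class EqualsDeDupe {
    // Hypothetical Rec; only the tag field comes from the question.
    static class Rec {
        final String tag;
        final String payload; // illustrative extra field, ignored by equals

        Rec(String tag, String payload) {
            this.tag = tag;
            this.payload = payload;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Rec)) return false;
            // Two Recs are equal when their tags are equal (null-safe).
            return Objects.equals(tag, ((Rec) other).tag);
        }

        @Override
        public int hashCode() {
            // Must agree with equals: equal tags give equal hash codes.
            return Objects.hashCode(tag);
        }
    }

    static List<Rec> deDupe(List<Rec> recs) {
        List<Rec> retList = new ArrayList<>(recs.size());
        for (Rec rec : recs) {
            // contains() does a linear scan and compares with Rec.equals
            if (!retList.contains(rec)) {
                retList.add(rec);
            }
        }
        return retList;
    }
}
```

Note that contains() walks the whole result list on every call, so this is quadratic in the worst case. That is fine for short lists, but a Set-based check scales better for long ones.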
I would do that with Google Collections. You can use the filter function with a predicate that remembers previous tags and filters out any Rec whose tag has been seen before. Something like this:
private Iterable<Rec> deDupe(List<Rec> recs)
{
    Predicate<Rec> filterDuplicatesByTagPredicate = new FilterDuplicatesByTagPredicate();
    return Iterables.filter(recs, filterDuplicatesByTagPredicate);
}
private static class FilterDuplicatesByTagPredicate implements Predicate<Rec>
{
    private Set<String> existingTags = Sets.newHashSet();
    @Override
    public boolean apply(Rec input)
    {
        String tag = input.getTag();
        // Set.add returns false when the tag is already present,
        // so a Rec with a repeated tag is filtered out
        return existingTags.add(tag);
    }
}
I slightly changed the method to return Iterable instead of List, but of course you can change that if it's important.
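If pulling in a library is not an option, the same "predicate that remembers tags" idea can be sketched with the standard java.util.stream API (Java 8 and later). This is not the Google Collections code above but a swapped-in equivalent, and the Rec class with its getTag() accessor is a hypothetical stand-in. The predicate is stateful, so this is only meant for sequential streams.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class StreamDeDupe {
    // Hypothetical Rec with a getTag() accessor, as the predicate above assumes.
    static class Rec {
        private final String tag;

        Rec(String tag) {
            this.tag = tag;
        }

        String getTag() {
            return tag;
        }
    }

    static List<Rec> deDupe(List<Rec> recs) {
        Set<String> seenTags = new HashSet<>();
        return recs.stream()
                // Set.add returns true only the first time a tag is seen
                .filter(rec -> seenTags.add(rec.getTag()))
                .collect(Collectors.toList());
    }
}
```

Unlike the lazy Iterable version, this builds the result list right away, so the set of seen tags is not shared across repeated iterations.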
If you don't care about shuffling the data around (i.e., you have a small list of small objects), you can do this:
// <T> is declared on the method so it compiles outside a generic class
private <T> List<T> deDupe(List<T> thisListHasDupes){
    Set<T> tempSet = new HashSet<T>();
    for(T t:thisListHasDupes){
        tempSet.add(t);
    }
    List<T> deDupedList = new ArrayList<T>();
    deDupedList.addAll(tempSet);
    return deDupedList;
}
Remember that implementations of Set need a consistent and valid equals method (and, for HashSet, a matching hashCode). So if you have a custom object, make sure that's taken care of.
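If the original order does matter, one possible variation (not part of the answer above) is to swap HashSet for LinkedHashSet, which iterates in insertion order. A minimal sketch, assuming the element type already has a sensible equals and hashCode:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class OrderedSetDeDupe {
    static <T> List<T> deDupe(List<T> thisListHasDupes) {
        // LinkedHashSet drops duplicates but remembers first-insertion order.
        Set<T> unique = new LinkedHashSet<>(thisListHasDupes);
        return new ArrayList<>(unique);
    }
}
```

With String elements, for example, deDupe(Arrays.asList("b", "a", "b", "c", "a")) gives back b, a, c in that order. For a custom class such as Rec, this only removes duplicates by tag if equals and hashCode are defined on the tag.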