开发者

Java: Test for duplicate Objects in a Collection

Given a List of MyClass objects (and a custom Comparitor myComparitor if needed), what good options are there for checking if the List contains two "equal" objects?

Edit: if there are duplicates, return a reference to 开发者_运维问答one or more of the duplicates.

Overriding MyClass.equals(MyClass) in this case is not an option.

My initial thought is to create a hash table of sorts, but I suspect that there's a non-hack way to accomplish the same thing:

SortedSet mySet = new TreeSet(myComparitor);

mySet.addAll(myList);

// Find duplicates in a sorted set in O(N) time

P.S. Is there a good reference on Markdown?


If the element's equals(Object) method does not give you the semantic that you require, then HashMap or HashSet are not options. Your choices are:

  • Use a TreeMap for de-duping. This is O(NlogN).
  • Sort the ArrayList or a copy, then iterate over looking for element i equals element i + 1. This is O(NlogN).
  • Find an alternative implementation of hash sets that allows you to provide a separate object to implement equality and hashing. (Neither Apache or Google collections support this, so you'll need to look further afield.)
  • Create a wrapper class for your element type that overrides equals(Object) and hashCode(), and de-dup using a HashSet of wrapped objects. This is O(N), but the constant of proportionality will be larger than a simple HashSet due to creation of wrapper objects.

When de-duping with a Set it is probably better to use a loop rather than addAll. This is necessary if you need to know what all of the duplicates are. If you don't need to know that, then using a loop allows you to stop when you find the first duplicate. The only case where addAll is likely to perform better is when there are likely to be no duplicates.


if you already have a sorted list, you could just look at any element and the next element, and if they're the same you have dups.

in your question you are using a TreeSet, which would have culled out duplicates already, so if you just need to know if you have duplicates, just check the size of mySet vs the size of myList. if they aren't the same you have dups.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜