How to count unique items in a list?

How would one go about counting the number of unique items in a list?

For example, say I have {1, 3, 3, 4, 1, 3} and I want to get the number 3, which represents the number of unique items in the list (namely |A| = 3 if A = {1, 3, 4}). What algorithm would someone use for this?

I have tried a double loop:

for currentItem from firstItem to lastItem
  a = currentItem
  for nextItem from the item after currentItem to lastItem
    b = nextItem
    if a == b then numberOfDuplicates++
uniqueItems = numberOfItems - numberOfDuplicates

That doesn't work, as it counts the duplicates more times than needed. With the example from the beginning it would be:

  1. In the first pass it counts +1 duplicate for the number 1.
  2. In the second pass it counts +2 duplicates for the number 3.
  3. In the third pass it counts +1 duplicate for the number 3 again (overcounting the last '3'), and that is where the problem comes in.
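
To make the overcount concrete, here is the same double loop as a small Python sketch. It assumes the inner loop starts just after the current item, as the counts listed above imply; the name naive_unique_count is only illustrative.

```python
def naive_unique_count(items):
    """Double-loop attempt: subtracts one for every equal pair (i, j) with i < j."""
    duplicates = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                duplicates += 1  # counts every equal pair, not every extra copy
    return len(items) - duplicates


# The three 3s form three equal pairs and the two 1s form one, so 4 is subtracted.
print(naive_unique_count([1, 3, 3, 4, 1, 3]))  # 2, but the expected answer is 3
```

A value that appears k times contributes k(k-1)/2 pairs rather than k-1 extra copies, so the result is wrong as soon as any value appears three or more times.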

Any idea on how to solve this?


Add the items to a HashSet, then check the HashSet's size after you finish.
Assuming that you have a good hash function, this takes O(n) expected time and O(n) extra space.
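
A minimal sketch of this idea in Python, where the built-in set plays the role of the HashSet (the function name count_unique is just an illustrative choice):

```python
def count_unique(items):
    """Return the number of distinct values in items."""
    seen = set()        # hash-based set: insertion is O(1) on average
    for item in items:
        seen.add(item)  # adding a value that is already present is a no-op
    return len(seen)


print(count_unique([1, 3, 3, 4, 1, 3]))  # 3
```

In Python the loop could be shortened to len(set(items)); it is written out here only to mirror the "add, then check the size" description.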


You can check whether any duplicate follows the current number. If not, increment uniqueCount:

int uniqueCount = 0;
for (int i = 0; i < size; i++) {
  bool isUnique = true;
  // Look for a later occurrence of arr[i].
  for (int j = i + 1; j < size; j++) {
    if (arr[i] == arr[j]) {
      isUnique = false;
      break;
    }
  }
  // Only the last occurrence of each value is counted.
  if (isUnique) {
    uniqueCount++;
  }
}

The above approach is O(N^2) in time and O(1) in space.

Another approach is to sort the input array, which puts duplicate elements next to each other, and then compare adjacent elements. This approach is O(N log N) in time and, with an in-place sort such as heapsort, O(1) in extra space.

If you are allowed to use additional space, you can get this done in O(N) time and O(N) space by using a hash. The keys of the hash are the array elements and the values are their frequencies.

At the end of hashing, the number of keys in the hash is the number of unique items. (Counting only the keys whose value is 1 would instead give the items that occur exactly once, which is 1 for the example in the question, not 3.)
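
As a rough Python sketch of the frequency-hash approach, collections.Counter can serve as the hash. The helper name frequency_summary is made up for illustration; it returns both the number of distinct items and the number of items that occur exactly once, since the two are easy to confuse.

```python
from collections import Counter


def frequency_summary(items):
    """Return (number of distinct values, number of values occurring exactly once)."""
    freq = Counter(items)  # maps each value to how many times it occurs
    distinct = len(freq)   # one key per distinct value
    singletons = sum(1 for count in freq.values() if count == 1)
    return distinct, singletons


print(frequency_summary([1, 3, 3, 4, 1, 3]))  # (3, 1)
```

For the question as asked, the first number is the one that matters; the frequencies are only needed if you also care how often each item occurs.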


Sort it using a decent sorting algorithm like mergesort or heapsort (both have O(n log n) worst-case running time) and loop over the sorted list:

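def count_unique(items):
    sorted_list = sorted(items)  # O(n log n)
    if not sorted_list:
        return 0
    unique_count = 1  # the first item is always a new value
    last = sorted_list[0]
    for item in sorted_list[1:]:
        if item != last:  # value changed, so this is a new unique item
            unique_count += 1
        last = item
    return unique_count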


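Collections.sort(list);  // equal items end up next to each other
int duplicates = 0;
for (int i = 0; i < list.size() - 1; i++)
  if (list.get(i).equals(list.get(i + 1)))  // equals(), not ==, for boxed values
    duplicates++;
int uniqueItems = list.size() - duplicates;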


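Keep a Dictionary and increment each item's count in a loop.

This is how it looks in C#:

int[] items = {1, 3, 3, 4, 1, 3};
Dictionary<int, int> dic = new Dictionary<int, int>();
foreach (int item in items)
{
    int count;
    dic.TryGetValue(item, out count);  // count stays 0 if the key is new
    dic[item] = count + 1;             // a plain dic[item]++ would throw on a new key
}

After the loop, dic.Count is the number of unique items.

Of course there is a LINQ way in C# (items.Distinct().Count()), but as I understand it the question is general ;)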
