Number of different elements in an array

2022-12-24 02:56 问答作者：

Is it possible to compute the number of different elements in an array in linear time and constant space? Let us say it's an array o开发者_Python百科f long integers, and you can not allocate an array of length sizeof(long).

P.S. Not homework, just curious. I've got a book that sort of implies that it is possible.

This is the Element uniqueness problem, for which the lower bound is Ω( n log n ), for comparison-based models. The obvious hashing or bucket sorting solution all requires linear space too, so I'm not sure this is possible.

You can't use constant space. You can use O(number of different elements) space; that's what a HashSet does.

You can use any sorting algorithm and count the number of different adjacent elements in the array.

I do not think this can be done in linear time. One algorithm to solve in O(n log n) requires first sorting the array (then the comparisons become trivial).

If you are guaranteed that the numbers in the array are bounded above and below, by say a and b, then you could allocate an array of size b - a, and use it to keep track of which numbers have been seen.

i.e., you would move through your input array take each number, and mark a true in your target array at that spot. You would increment a counter of distinct numbers only when you encounter a number whose position in your storage array is false.

Assuming we can partially destroy the input, here's an algorithm for n words of O(log n) bits.

Find the element of order sqrt(n) via linear-time selection. Partition the array using this element as a pivot (O(n)). Using brute force, count the number of different elements in the partition of length sqrt(n). (This is O(sqrt(n)^2) = O(n).) Now use an in-place radix sort on the rest, where each "digit" is log(sqrt(n)) = log(n)/2 bits and we use the first partition to store the digit counts.

If you consider streaming algorithms only ( http://en.wikipedia.org/wiki/Streaming_algorithm ), then it's impossible to get an exact answer with o(n) bits of storage via a communication complexity lower bound ( http://en.wikipedia.org/wiki/Communication_complexity ), but possible to approximate the answer using randomness and little space (Alon, Matias, and Szegedy).

This can be done with a bucket approach when assuming that there are only a constant number of different values. Make a flag for each value (still constant space). Traverse the list and flag the occured values. If you happen to flag an already flagged value, you've found a duplicate. You have to traverse the buckets for each element in the list. But that's still linear time.

继续阅读：algorithm arrays language-agnostic

Number of different elements in an array

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？