Data structures question
This question is from an exam I had, and I couldn't solve it and wanted to see what the answer is (this is not homework, as it will not help me in anything but knowledge).
We need to create a data structure for containing elements whose keys are real numbers.
The data structure should have 开发者_JAVA技巧these functions: Build(S, array): Builds the data structure S with n elements in O(n) Insert(S, k) and Delete(S, x) in O(lgn) (k is an element, x is a pointer to it in the data structure) Delete-Minimal-Positive(S): Remove the element with the minimal positive key Mode(S): returns the key that is most frequent in S in O(1)Now, building in O(n) usually means a heap should be used, but that does not allow to find frequencies. I couldn't find any way to do this so. Best I could come up with is building a Red-Black-Tree (O(nlgn)) that will be used to build a frequency heap.
I'm dying to know the answer...
Thanks!
Using just the comparison model, there is no solution to this problem.
The Element Distinctness Problem has provable Omega(nlogn) lower bounds. This (element distinctness) problem is basically the problem of determining if all the elements of an array are distinct.
If there was a solution to your problem, then we could answer the element distinctness problem in O(n) time (find the most frequent element in O(n) time, and see if there are more than one instances of that element, again in O(n) time).
So, I suggest you ask your professor for the computational model.
Well, you can use a hash table to calculate the number of occurrences of distinct real numbers in O(1) amortized time, and then use a standard heap where the items are pairs (real number, number of occurrences) and the heap is sorted according to the number of occurrences field.
When you insert a key or delete a key, you increment or decrement the number of occurrences field by one, or in the extreme cases add or remove a heap element. In both cases you need to percolate up / down because the ordering field has changed.
Assuming the hash table is O(1) operation, you have a standard heap + O(1) hash table and you get all the operations above within the time limits. In particular, you get the "mode" by reading the root element of the heap.
I think the following solution will be acceptable. It based on two data stuctures:
- Red-black tree
- Binary heap
Binary heap holds tuple, that contain (element value, frequence of element), heap is builded on frequencies, so it's give us ability to find mode in O(1).
Red black tree contains a tuple that hold (element value, pointer to same element value in heap)
When you need to insert new element, you will try to find element(it takes O(log n)), if search was succeful, than go to the pointer from element founded in RB-tree, increase frequence, and rebuild heap(also O(log n)). If search didn't find such element than insert it into RB-tree(O(log n)) and to heap with value = (element, 1) (also O(1)), set a pointer in RB-tree to new element in heap.
Initial building will take O(n), because building both structures from set of n element takes O(n).
Sorry, if I am miss something.
For frequencies:
Each entry is bi-directionaly linked to own frequencies/counters (use hash table)
Frequencies are in linked list.
There is need to move frequency up/down over linked list,(deleting inserting element) but for max difference of 1.
Frequencies are thus linked to pointer of +1 and -1 frequency element.
(there are exceptions but can be handled)
精彩评论