开发者

Java PriorityQueue with fixed size

I am calculating a large number of possible resulting combinations of an algortihm. To sort this combinations I rate them with a double value und store them in PriorityQueue. Currently, there are about 200k items in that queue which is pretty much memory intesive. Acutally, I only need lets say the best 1000 or 100 of all items in the list. So I just started to ask myself if there is a way to have a priority queue with a fixed si开发者_StackOverflow社区ze in Java. I should behave like this: Is the item better than one of the allready stored? If yes, insert it to the according position and throw the element with the least rating away.

Does anyone have an idea? Thanks very much again!

Marco


que.add(d);
if (que.size() > YOUR_LIMIT)
     que.poll();

or did I missunderstand your question?

edit: forgot to mention that for this to work you probably have to invert your comparTo function since it will throw away the one with highest priority each cycle. (if a is "better" b compare (a, b) should return a positvie number.

example to keep the biggest numbers use something like this:

public int compare(Double first, Double second) {
            // keep the biggest values
            return first > second ? 1 : -1;
        }


MinMaxPriorityQueue, Google Guava

There is indeed a class for maintaining a queue that, when adding an item that would exceed the maximum size of the collection, compares the items to find an item to delete and thereby create room: MinMaxPriorityQueue found in Google Guava as of version 8.

EvictingQueue

By the way, if you merely want deleting the oldest element without doing any comparison of the objects’ values, Google Guava 15 gained the EvictingQueue class.


There is a fixed size priority queue in Apache Lucene: http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/util/PriorityQueue.html

It has excellent performance based on my tests.


Use SortedSet:

SortedSet<Item> items = new TreeSet<Item>(new Comparator<Item>(...));
...
void addItem(Item newItem) {
    if (items.size() > 100) {
         Item lowest = items.first();
         if (newItem.greaterThan(lowest)) {
             items.remove(lowest);
         }
    }

    items.add(newItem);   
}


Just poll() the queue if its least element is less than (in your case, has worse rating than) the current element.

static <V extends Comparable<? super V>> 
PriorityQueue<V> nbest(int n, Iterable<V> valueGenerator) {
    PriorityQueue<V> values = new PriorityQueue<V>();
    for (V value : valueGenerator) {
        if (values.size() == n && value.compareTo(values.peek()) > 0)
            values.poll(); // remove least element, current is better
        if (values.size() < n) // we removed one or haven't filled up, so add
            values.add(value);
    }
    return values;
}

This assumes that you have some sort of combination class that implements Comparable that compares combinations on their rating.

Edit: Just to clarify, the Iterable in my example doesn't need to be pre-populated. For example, here's an Iterable<Integer> that will give you all natural numbers an int can represent:

Iterable<Integer> naturals = new Iterable<Integer>() {
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            int current = 0;
            @Override
            public boolean hasNext() {
                return current >= 0;
            }
            @Override
            public Integer next() {
                return current++;
            }
            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
};

Memory consumption is very modest, as you can see - for over 2 billion values, you need two objects (the Iterable and the Iterator) plus one int.

You can of course rather easily adapt my code so it doesn't use an Iterable - I just used it because it's an elegant way to represent a sequence (also, I've been doing too much Python and C# ☺).


A better approach would be to more tightly moderate what goes on the queue, removing and appending to it as the program runs. It sounds like there would be some room to exclude some the items before you add them on the queue. It would be simpler than reinventing the wheel so to speak.


It seems natural to just keep the top 1000 each time you add an item, but the PriorityQueue doesn't offer anything to achieve that gracefully. Maybe you can, instead of using a PriorityQueue, do something like this in a method:

List<Double> list = new ArrayList<Double>();
...
list.add(newOutput);
Collections.sort(list);
list = list.subList(0, 1000);
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜