开发者

Java Code Review: Merge sorted lists into a single sorted list [closed]

Closed. This question is opinion-based. It is not currently accepting answers.

Want to imp开发者_运维知识库rove this question? Update the question so it can be answered with facts and citations by editing this post.

Closed 9 years ago.

Improve this question

I want to merge sorted lists into a single list. How is this solution? I believe it runs in O(n) time. Any glaring flaws, inefficiencies, or stylistic issues?

I don't really like the idiom of setting a flag for "this is the first iteration" and using it to make sure "lowest" has a default value. Is there a better way around that?

public static <T extends Comparable<? super T>> List<T> merge(Set<List<T>> lists) {
    List<T> result = new ArrayList<T>();

    int totalSize = 0; // every element in the set
    for (List<T> l : lists) {
        totalSize += l.size();
    }

    boolean first; //awkward
    List<T> lowest = lists.iterator().next(); // the list with the lowest item to add

    while (result.size() < totalSize) { // while we still have something to add
        first = true;

        for (List<T> l : lists) {
            if (! l.isEmpty()) {
                if (first) {
                    lowest = l;
                    first = false;
                }
                else if (l.get(0).compareTo(lowest.get(0)) <= 0) {
                    lowest = l;
                }
            }
        }
        result.add(lowest.get(0));
        lowest.remove(0);
    }
    return result;
}

Note: this isn't homework, but it isn't for production code, either.


Efficiency will suck if lists contains an ArrayList, since lowest.remove(0) will take linear time in the length of the list, making your algorithm O(n^2).

I'd do:

List<T> result = new ArrayList<T>();
for (List<T> list : lists) {
    result.addAll(list);
}
Collections.sort(result);

which is in O(n log n), and leaves far less tedious code to test, debug and maintain.


To expand on Anton's comment:

By placing the latest result from each List, along with an indicator of whch list it is, into a heap, then continually take the top off the heap, and put a new item on the heap from the list belonging to the item you just took off.

Java's PriorityQueue can provide the heap implementation.


Your solution is probably the fastest one. SortedLists have an insert cost of log(n), so you'll end up with M log (M) (where M is the total size of the lists).

Adding them to one list and sorting, while easier to read, is still M log(M).

Your solution is just M.

You can clean up your code a bit by sizing the result list, and by using a reference to the lowest list instead of a boolean.

public static <T extends Comparable<? super T>> List<T> merge(Set<List<T>> lists) {
    int totalSize = 0; // every element in the set
    for (List<T> l : lists) {
        totalSize += l.size();
    }

    List<T> result = new ArrayList<T>(totalSize);

    List<T> lowest;

    while (result.size() < totalSize) { // while we still have something to add
        lowest = null;

        for (List<T> l : lists) {
            if (! l.isEmpty()) {
                if (lowest == null) {
                    lowest = l;
                } else if (l.get(0).compareTo(lowest.get(0)) <= 0) {
                    lowest = l;
                }
            }
        }

        result.add(lowest.get(0));
        lowest.remove(0);
    }

    return result;
}

If you're really particular, use a List object as input, and lowest can be initialized to be lists.get(0) and you can skip the null check.


This is a really old question, but I don't like any of the submitted answers, so this is what I ended up doing.

The solution of just adding them all into one list and sorting is bad because of the log linear complexity (O(m n log(m n))). If that's not important to you, then it's definitely the simplest and most straightforward answer. Your initial solution isn't bad, but it's a little messy, and @Dathan pointed out that the complexity is O(m n) for m lists and n total elements. You can reduce this to O(n log(m)) by using a heap to reduce the number of comparisons for each element. I use a helper class that allows me to compare iterables. This way I don't destroy the initial lists, and it should operate with reasonable complexity no matter what type of lists are input. The only flaw I see with the implementation below is that it doesn't support lists with null elements, however this could be fixed with sentinels if desired.

public static <E extends Comparable<? super E>> List<E> merge(Collection<? extends List<? extends E>> lists) {
    PriorityQueue<CompIterator<E>> queue = new PriorityQueue<CompIterator<E>>();
    for (List<? extends E> list : lists)
        if (!list.isEmpty())
            queue.add(new CompIterator<E>(list.iterator()));

    List<E> merged = new ArrayList<E>();
    while (!queue.isEmpty()) {
        CompIterator<E> next = queue.remove();
        merged.add(next.next());
        if (next.hasNext())
            queue.add(next);
    }
    return merged;
}

private static class CompIterator<E extends Comparable<? super E>> implements Iterator<E>, Comparable<CompIterator<E>> {
    E peekElem;
    Iterator<? extends E> it;

    public CompIterator(Iterator<? extends E> it) {
        this.it = it;
        if (it.hasNext()) peekElem = it.next();
        else peekElem = null;
    }

    @Override
    public boolean hasNext() {
        return peekElem != null;
    }

    @Override
    public E next() {
        E ret = peekElem;
        if (it.hasNext()) peekElem = it.next();
        else peekElem = null;
        return ret;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    @Override
    public int compareTo(CompIterator<E> o) {
        if (peekElem == null) return 1;
        else return peekElem.compareTo(o.peekElem);
    }

}

Every element of the returned list involves two O(log(m)) heap operations, there is also an initial iteration over all of the lists. Therefore the overall complexity is O(n log(m) + m) for n total elements and m lists. making this always faster than concatenating and sorting.


Since Balus and meriton have together given an excellent response to your question about the algorithm, I'll speak to your aside about the "first" idiom.

There are definitely other approaches (like setting lowest to a 'magic' value), but I happen to feel that "first" (to which I'd probably give a longer name, but that's being pedantic) is the best, because it's very clear. Presence of a boolean like "first" is a clear signal that your loop will do something special the first time through. It helps the reader.

Of course you don't need it if you take the Balus/meriton approach, but it's a situation which crops up.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜