开发者

Mergesort to sort three input arrays

A Merge algorithm merges two sorted input arrays into a sorted output array, by repeatedly comparing the smallest elements of the two input arrays, and moving the smaller one of the two to the output.

Now we need to merge three sorted input arrays (A1, A2, and A3) of the same length into a (sorted) output array, and there are two methods:

  1. Using the above Merge algorithm to merge A1 and A2 into A4, then using the same algorithm to merge A4 and A3 into the output array.

  2. Revising the above Merge algorithm, by repeatedly comparing the smallest elements of the three input arrays, and moving the smallest one of the three to the output array.

Which of the above two algorithms is more efficient, if only considering the worst case of array element movements (i.e., assignments)?

Which of the above two algorithms is more efficient, if only considering the worst case 开发者_如何转开发of array element comparisons?

Between these two algorithms, which one has a higher overall efficiency in worst case?


If all that you care about are the number of array writes, the second version (three-way merge) is faster than the first algorithm (two instances of two-way merge). The three-way merge algorithm will do exactly 3n writes (where n is the length of any of the sequences), since it merges all three ranges in one pass. The first approach will merge two of the ranges together, doing 2n writes, and will then merge that sequence with the third sequence, doing 3n writes for a net total of 5n writes.

More generally, suppose that you have k ranges of elements, all of length n. If you merge those ranges pairwise, then merge those merges pairwise again, etc., then you will do roughly k/2 merge steps merging ranges of length n, then k/4 merges of ranges of length 2n, then k/8 merges of length 4n, etc. This gives the sum

kn/2 + kn/2 + ... + kn/2 (log n times)

For a net number of array writes that are O(kn lg n). If, on the other hand, you use a k-way comparison at each step, you do exactly kn writes, which is much smaller.

Now, let's think about how many comparisons you do in each setup. In the three-way merge, each element written into the output sequence requires finding the minimum of three values. This requires two comparisons - one to compare the first values of the first two sequences, and one to compare the minimum of those two values to the first value of the third array. Thus for each value written out to the resulting sequence, we use two comparisons, and since there are 3n values written we need to do a total of at most 6n comparisons.

A much better way to do this would be to store the sequences in a min-heap, where sequences are compared by their first element. On each step, we dequeue the sequence from the heap with the smallest first value, write that value to the result, then enqueue tue rest of the sequence back into the heap. With k sequences, this means that each element written out requires at most O(lg k) comparisons, since heap insertion runs in O(lg k). This gives a net runtime of O(kn lg k), since each of the kn elements written out requires O(lg k) processing time.

In the other version, we begin by doing a standard two-way merge, which requires one comparison per element written out, for a net total of 2n comparisons. In the second pass of the merge, in the worst case we do a total of 3n comparisons, since there are 3G elements being merged. This gives a net total of 5n comparisons. If we use the generalized construction for pairwise merging that's described above, we will need to use O(kn lg n) comparisons, since each element written requires one comparison and we do O(kn lg n) writes.

In short, for the specific case of k=3, we have that the three-way merge does 3n writes and 6n comparisons for a net of 9n memory reads and writes. The iterated two-way merge does 5n writes and 5n comparisons for a net total of 10n memory reads and writes, and so the three-way-merge version is better.

If we consider the generalized constructions, the k-way merge does O(nk) writes and O(nk lg k) comparisons for a total of O(nk lg k) memory operations. The iterated two-way merge algorithm does O(nk lg n) writes and O(nk lg n) comparisons for a total of O(nk lg n) memory operations. Thus the k-way merge is asymptotically better for a few long sequences, while the iterated merge sort is faster for many short sequences.

Hope this helps!

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜