Merge sorted arrays - Efficient solution

2023-01-08 10:05 问答作者：

Goal here is to merge multiple arrays which are already sorted into a resultant array.

I've written the following solution and wondering if there is a way to improve the solution

/*
    Goal is to merge all sorted arrays
*/
void mergeAll(const vector< vector<int> >& listOfIntegers,  vector<int>& result)
{

    int totalNumbers = listOfIntegers.size();
    vector<int> curpos;
    int currow = 0 , minElement , foundMinAt = 0;
    curpos.reserve(totalNumbers);

    // Set the current position that was travered to 0 in all the array elements
    for ( int i = 0; i < totalNumbers; ++i)
    {
        curpos.push_back(0);
    }

    for ( ; ; )
    {
        /*  Find the first minimum 
            Which is basically the first element in the array that hasn't been fully traversed
        */

        for ( currow = 0 ; currow < totalNumbers ; ++currow)
        {
            if ( curpos[currow] < listOfIntegers[currow].size() )
            {
                minElement = listOfIntegers[currow][curpos[currow] ];
                foundMinAt = currow;
                break;
            }
        }
        /* If all the elements were traversed in all the arrays, then no further work needs to be done */
        if ( !(currow < totalNumbers ) )
            break;
        /* 
            Traverse each of the array and find out the first available minimum value
        */
        for ( ;currow < totalNumbers; ++currow)
        {
            if ( listOfIntegers[currow][curpos[currow] ] < minElement )
            {
                minElement = listOfIntegers[currow][curpos[currow] ];
                foundMinAt = currow;
            }
        }
        /* 
            Store the minimum into the resultant array 
            and increment the element traversed
        */
        result.push_back(minElement);
        ++curpos[foundMinAt];
    }
}

The corresponding main goes like this.

int main()
{
    vector< vector<int> > myInt;
    vector<int> result;

    myInt.push_back(vector<int>() );
    myInt.push_back(vector<int>() );
    myInt.push_back(vect开发者_如何学Cor<int>() );

    myInt[0].push_back(10);
    myInt[0].push_back(12);
    myInt[0].push_back(15);


    myInt[1].push_back(20);
    myInt[1].push_back(21);
    myInt[1].push_back(22);

    myInt[2].push_back(14);
    myInt[2].push_back(17);
    myInt[2].push_back(30);

    mergeAll(myInt,result);

    for ( int i = 0; i < result.size() ; ++i)
    {
        cout << result[i] << endl;
    }
}

You can generalize Merge Sort algorithm and work with multiple pointers. Initially, all of them are pointing to the beginning of each array. You maintain these pointers sorted (by the values they point to) in a priority queue. In each step, you remove the smallest element in the heap in O(log n) (n is the number of arrays). You then output the element pointed by the extracted pointer. Now you increment this pointer in one position and if you didn't reach the end of the array, reinsert in the priority queue in O(log n). Proceed this way until the heap is not empty. If there are a total of m elements, the complexity is O(m log n). The elements are output in sorted order this way.

Perhaps I'm misunderstanding the question...and I feel like I'm misunderstanding your solution.

That said, maybe this answer is totally off-base and not helpful.

But, especially with the number of vectors and push_back's you're already using, why do you not just use std::sort?

#include <algorithm>
void mergeAll(const vector<vector<int>> &origList, vector<int> &resultList)
{
    for(int i = 0; i < origList.size(); ++i)
    {
        resultList.insert(resultList.end(), origList[i].begin(), origList[i].end());
    }
    std::sort(resultList.begin(), resultList.end());
}

I apologize if this is totally off from what you're looking for. But it's how I understood the problem and the solution.

std::sort runs in O(N log (N)) http://www.cppreference.com/wiki/stl/algorithm/sort

I've seen some solution on the internet to merge two sorted arrays, but most of them were quite cumbersome. I changed some of the logic to provide the shortest version I can come up with:

void merge(const int list1[], int size1, const int list2[], int size2, int list3[]) {

    // Declaration & Initialization
    int index1 = 0, index2 = 0, index3 = 0;

    // Loop untill both arrays have reached their upper bound.
    while (index1 < size1 || index2 < size2) {

        // Make sure the first array hasn't reached 
        // its upper bound already and make sure we 
        // don't compare outside bounds of the second 
        // array.
        if ((list1[index1] <= list2[index2] && index1 < size1) || index2 >= size2) {
            list3[index3] = list1[index1];
            index1++;
        }
        else {
            list3[index3] = list2[index2];
            index2++;
        }
        index3++;
    }
}

If you want to take advantage of multi-threading then a fairly good solution would be to just merge 2 lists at a time.

ie suppose you have 9 lists.

merge list 0 with 1. merge list 2 with 3. merge list 4 with 5. merge list 6 with 7.

These can be performed concurrently.

Then:

merge list 0&1 with 2&3 merge list 4&5 with 6&7

Again these can be performed concurrently.

then merge list 0,1,2&3 with list 4,5,6&7

finally merge list 0,1,2,3,4,5,6&7 with list 8.

Job done.

I'm not sure on the complexity of that but it seems the obvious solution and DOES have the bonus of being multi-threadable to some extent.

Consider the priority-queue implementation in this answer linked in a comment above: Merging 8 sorted lists in c++, which algorithm should I use

It's O(n lg m) time (where n = total number of items and m = number of lists).

All you need is two pointers (or just int index counters), checking for minimum between array A and B, copying the value over to the resultant list, and incrementing the pointer of the array the minimum came from. If you run out of elements on one source array, copy the remainder of the second to the resultant and you're done.

Edit: You can trivially expand this to N arrays.

Edit: Don't trivially expand this to N arrays :-). Do two at a time. Silly me.

If you are merging very many vector together, then you could speed up performance by using a sort of tree to determine which vector contains the smallest element. This is probably not necessary for your application, but comment if it is and I'll try to work it out.

You could just stick them all into a multiset. That will handle the sorting for you.

继续阅读：algorithm stl

Merge sorted arrays - Efficient solution

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？