开发者

What is the best way to concatenate two vectors?

I'm using multitreading and want to merge the results. For example:

std::vector<int> A;
std::vec开发者_如何学编程tor<int> B;
std::vector<int> AB;

I want AB to have to contents of A and the contents of B in that order. What's the most efficient way of doing something like this?


AB.reserve( A.size() + B.size() ); // preallocate memory
AB.insert( AB.end(), A.begin(), A.end() );
AB.insert( AB.end(), B.begin(), B.end() );


This is precisely what the member function std::vector::insert is for

std::vector<int> AB = A;
AB.insert(AB.end(), B.begin(), B.end());


Depends on whether you really need to physically concatenate the two vectors or you want to give the appearance of concatenation of the sake of iteration. The boost::join function

http://www.boost.org/doc/libs/1_43_0/libs/range/doc/html/range/reference/utilities/join.html

will give you this.

std::vector<int> v0;
v0.push_back(1);
v0.push_back(2);
v0.push_back(3);

std::vector<int> v1;
v1.push_back(4);
v1.push_back(5);
v1.push_back(6);
...

BOOST_FOREACH(const int & i, boost::join(v0, v1)){
    cout << i << endl;
}

should give you

1
2
3
4
5
6

Note boost::join does not copy the two vectors into a new container but generates a pair of iterators (range) that cover the span of both containers. There will be some performance overhead but maybe less that copying all the data to a new container first.


Based on Kiril V. Lyadvinsky answer, I made a new version. This snippet use template and overloading. With it, you can write vector3 = vector1 + vector2 and vector4 += vector3. Hope it can help.

template <typename T>
std::vector<T> operator+(const std::vector<T> &A, const std::vector<T> &B)
{
    std::vector<T> AB;
    AB.reserve(A.size() + B.size());                // preallocate memory
    AB.insert(AB.end(), A.begin(), A.end());        // add A;
    AB.insert(AB.end(), B.begin(), B.end());        // add B;
    return AB;
}

template <typename T>
std::vector<T> &operator+=(std::vector<T> &A, const std::vector<T> &B)
{
    A.reserve(A.size() + B.size());                // preallocate memory without erase original data
    A.insert(A.end(), B.begin(), B.end());         // add B;
    return A;                                        // here A could be named AB
}


In the direction of Bradgonesurfing's answer, many times one doesn't really need to concatenate two vectors (O(n)), but instead just work with them as if they were concatenated (O(1)). If this is your case, it can be done without the need of Boost libraries.

The trick is to create a vector proxy: a wrapper class which manipulates references to both vectors, externally seen as a single, contiguous one.

USAGE

std::vector<int> A{ 1, 2, 3, 4, 5};
std::vector<int> B{ 10, 20, 30 };

VecProxy<int> AB(A, B);  // ----> O(1). No copies performed.

for (size_t i = 0; i < AB.size(); ++i)
    std::cout << AB[i] << " ";  // 1 2 3 4 5 10 20 30

IMPLEMENTATION

template <class T>
class VecProxy {
private:
    std::vector<T>& v1, v2;
public:
    VecProxy(std::vector<T>& ref1, std::vector<T>& ref2) : v1(ref1), v2(ref2) {}
    const T& operator[](const size_t& i) const;
    const size_t size() const;
};

template <class T>
const T& VecProxy<T>::operator[](const size_t& i) const{
    return (i < v1.size()) ? v1[i] : v2[i - v1.size()];
};

template <class T>
const size_t VecProxy<T>::size() const { return v1.size() + v2.size(); };

MAIN BENEFIT

It's O(1) (constant time) to create it, and with minimal extra memory allocation.

SOME STUFF TO CONSIDER

  • You should only go for it if you really know what you're doing when dealing with references. This solution is intended for the specific purpose of the question made, for which it works pretty well. To employ it in any other context may lead to unexpected behavior if you are not sure on how references work.
  • In this example, AB does not provide a non-const access operator ([ ]). Feel free to include it, but keep in mind: since AB contains references, to assign it values will also affect the original elements within A and/or B. Whether or not this is a desirable feature, it's an application-specific question one should carefully consider.
  • Any changes directly made to either A or B (like assigning values, sorting, etc.) will also "modify" AB. This is not necessarily bad (actually, it can be very handy: AB does never need to be explicitly updated to keep itself synchronized to both A and B), but it's certainly a behavior one must be aware of. Important exception: to resize A and/or B to sth bigger may lead these to be reallocated in memory (for the need of contiguous space), and this would in turn invalidate AB.
  • Because every access to an element is preceded by a test (namely, "i < v1.size()"), VecProxy access time, although constant, is also a bit slower than that of vectors.
  • This approach can be generalized to n vectors. I haven't tried, but it shouldn't be a big deal.


One more simple variant which was not yet mentioned:

copy(A.begin(),A.end(),std::back_inserter(AB));
copy(B.begin(),B.end(),std::back_inserter(AB));

And using merge algorithm:

#include <algorithm> #include <vector> #include <iterator> #include <iostream> #include <sstream> #include <string> template<template<typename, typename...> class Container, class T> std::string toString(const Container<T>& v) { std::stringstream ss; std::copy(v.begin(), v.end(), std::ostream_iterator<T>(ss, "")); return ss.str(); }; int main() { std::vector<int> A(10); std::vector<int> B(5); //zero filled std::vector<int> AB(15); std::for_each(A.begin(), A.end(), [](int& f)->void { f = rand() % 100; }); std::cout << "before merge: " << toString(A) << "\n"; std::cout << "before merge: " << toString(B) << "\n"; merge(B.begin(),B.end(), begin(A), end(A), AB.begin(), [](int&,int&)->bool {}); std::cout << "after merge: " << toString(AB) << "\n"; return 1; }


All the solutions are correct, but I found it easier just write a function to implement this. like this:

template <class T1, class T2>
void ContainerInsert(T1 t1, T2 t2)
{
    t1->insert(t1->end(), t2->begin(), t2->end());
}

That way you can avoid the temporary placement like this:

ContainerInsert(vec, GetSomeVector());


For this use case, if you know beforehand the number of results each thread produces, you could preallocate AB and pass a std::span to each thread. This way the concatenation need not be done. Example:

std::vector<int> AB(total_number_of_results, 0);
std::size_t chunk_length = …;
std::size_t chunk2_start = chunk_length;
std::size_t chunk3_start = 2 * chunk_length; // If needed
…
// Pass these to the worker threads.
std::span<int> A(AB.data(), chunk_length);
std::span<int> B(AB.data() + chunk2_start, chunk_length);
…


My answer is based on Mr.Ronald Souza's original solution. In addition to his original solution, I've written a vector proxy that supports iterators too!

short description for people who are not aware of the context of the original solution: the joined_vector template class (i.e the vector proxy)takes two references of two vectors as constructor arguments, it then treats them as one contiguous vector. My implementation also supports a forward-iterator.

USAGE:

int main()
{
    std::vector<int> a1;
    std::vector<int> a2;

    joined_vector<std::vector<int>> jv(a1,a2);

    for (int i = 0; i < 5; i++)
        a1.push_back(i);
    for (int i = 5; i <=10; i++)
        a2.push_back(i);

    for (auto e : jv)
        std::cout << e<<"\n";
    for (int i = 0; i < jv.size(); i++)
        std::cout << jv[i] << "\n";
    return 0;
}

IMPLEMENTATION:

template<typename _vec>
class joined_vector
{
    _vec& m_vec1;
    _vec& m_vec2;

public:

    struct Iterator
    {
        typedef typename _vec::iterator::value_type type_value;
        typedef typename _vec::iterator::value_type* pointer;
        typedef typename _vec::iterator::value_type& reference;
        typedef std::forward_iterator_tag iterator_category;
        typedef std::ptrdiff_t difference_type;
        _vec* m_vec1;
        _vec* m_vec2;
        Iterator(pointer ptr) :m_ptr(ptr)
        {

        }
        Iterator operator++()
        {
            if (m_vec1->size() > 0 && m_ptr == &(*m_vec1)[m_vec1->size() - 1] && m_vec2->size() != 0)
                m_ptr = &(*m_vec2)[0];
            else
                ++m_ptr;
            return m_ptr;
        }
        Iterator operator++(int)
        {
            pointer curr = m_ptr;
            if (m_vec1->size() > 0 && m_ptr == &(*m_vec1)[m_vec1->size() - 1] && m_vec2->size() != 0)
                m_ptr = &(*m_vec2)[0];
            else
                ++m_ptr;
            return curr;
        }
        reference operator *()
        {
            return *m_ptr;
        }
        pointer operator ->()
        {
            return m_ptr;
        }

        friend bool operator == (Iterator& itr1, Iterator& itr2)
        {
            return itr1.m_ptr == itr2.m_ptr;
        }
        friend bool operator != (Iterator& itr1, Iterator& itr2)
        {
            return itr1.m_ptr != itr2.m_ptr;
        }
    private:
        pointer m_ptr;
    };

    joined_vector(_vec& vec1, _vec& vec2) :m_vec1(vec1), m_vec2(vec2)
    {

    }
    Iterator begin()
    {
        //checkes if m_vec1 is empty and gets the first elemet's address,
        //if it's empty then it get's the first address of the second vector m_vec2
        //if both of them are empty then nullptr is returned as the first pointer
        Iterator itr_beg((m_vec1.size() != 0) ? &m_vec1[0] : ((m_vec2.size() != 0) ? &m_vec2[0] : nullptr));
        itr_beg.m_vec1 = &m_vec1;
        itr_beg.m_vec2 = &m_vec2;
        return itr_beg;
    }
    Iterator end()
    {
        //check if m_vec2 is empty and get the last address of that vector
        //if the second vector is empty then the m_vec1's vector/the first vector's last element's address is taken
        //if both of them are empty then a null pointer is returned as the end pointer
        typename _vec::value_type* p = ((m_vec2.size() != 0) ? &m_vec2[m_vec2.size() - 1] : ((m_vec1.size()) != 0 ? &m_vec1[m_vec1.size() - 1] : nullptr));
        Iterator itr_beg(p != nullptr ? p + 1 : nullptr);
        itr_beg.m_vec1 = &m_vec1;
        itr_beg.m_vec2 = &m_vec2;
        return itr_beg;
    }
    typename _vec::value_type& operator [](int i)
    {
        if (i < m_vec1.size())
            return m_vec1[i];
        else
            return m_vec2[i - m_vec1.size()];
    }
    size_t size()
    {
        return m_vec1.size() + m_vec2.size();
    }

};


If your vectors are sorted*, check out set_union from <algorithm>.

set_union(A.begin(), A.end(), B.begin(), B.end(), AB.begin());

There's a more thorough example in the link.



template<typename T>
vector<T> CombineVectors(vector<T> &&a, vector<T> &&b) {
  vector<T> ab = move(a);
  ab.insert(
      ab.end(), make_move_iterator(b.begin()), make_move_iterator(b.end()));
  return ab;
}


template<typename T>
vector<T> CombineVectors(const vector<T> &a, const vector<T> &b) {
  vector<T> ab = a;
  ab.insert(ab.end(), b.begin(), b.end());
  return ab;
}

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜