开发者

Which STL container has facilities for slicing into two or more containers?

Say I have a vector / list whatever of ints populated with 2300 values

I want to be able to easily slice this into 4 vectors /lists (not necessarily of equal size).

e.g.

开发者_StackOverflow中文版
vec1 ( elements 0 - 500 )
vec2 ( elements 501 - 999)
vec3 ( elements 1001 - 1499)

etc.


A common way to do it would be to use the one container, and just define separate iterator ranges over it.

std::vector<int> vec(2300);

it0 = vec.begin();
it1 = vec.begin() + 500;
it2 = vec.begin() + 1000;
it3 = vec.begin() + 1500;
it4 = vec.begin() + 2000;
it5 = vec.end();

Now, the first range is simply defined by the iterators it0 and it1. The second by it1 and it2, and so on.

So, if you want to apply a function to every element in the third range, you'd simply do this:

std::for_each(it2, it3, somefunc);

Actually copying the elements into separate containers may be unnecessary, and would carry a performance cost.


std::list would be the best choice, as you just build lists by joining pointers. Finding the exact place to slice would be the problem, though, because you have to reach that point in the list iterator to make the cut.

EDIT:

As per comments (thanks for the insights), maybe using std::vector<int> and iterators is a good idea. However, with plain iterators, you loose the length of the vector, so I propose using, for instance, a boost::range_iterator:

std::vector<int> vec(2300);

it0 = vec.begin();
it1 = vec.begin() + 500;
it2 = vec.begin() + 1000;
it3 = vec.begin() + 1500;
it4 = vec.begin() + 2000;
it5 = vec.end;

typedef boost::iterator_range< std::vector<int>::iterator > my_slice_t;

my_slice_t slice1 = boost::make_iterator_range(it0, it1);
...

Then, you can use slice1 as a normal, underlying std::vector<int> as per iteration:

std::for_each(slice1.begin(), slice1.end(), /* stuff */);


See the fourth std::vector<> constructor documented here.

// given std::vector<T> vec with 2300 elements
std::vector<T> vec1(vec.begin(), vec.begin() + 500);
std::vector<T> vec2(vec.begin() + 500, vec.begin() + 1000);
std::vector<T> vec3(vec.begin() + 1000, vec.begin() + 1500);
std::vector<T> vec4(vec.begin() + 1500, vec.begin() + 2000);
std::vector<T> vec5(vec.begin() + 2000, vec.end());


Actually it is doable with the vector container

#include <vector>
#include <iostream>

using namespace std ;

int main()
{
  vector<int> ints ;
  vector<int> ints_sliced;

  int i ;

  // populate
  for( i = 0 ; i < 100 ; i++ ) 
    ints.push_back(i) ;

  // slice from 10-19
  ints_sliced.insert(ints_sliced.begin(), ints.begin()+10, ints.begin()+20) ;

  // inspect
  vector<int>::iterator it ;
  for( it = ints_sliced.begin() ; it != ints_sliced.end() ; it++ )
    cout << *it << endl ;



}


Oh, if you happen to be using g++ and GNU stdlibc++, you can use roughly

g++ -march=native -O3 -ftree-vectorize ...

If you also throw in GNU OpenMP support (libgomp) you can benefit (evaluate, profile!) from automatic parallelization of standard algorithms,

g++ -D_GLIBCXX_PARALLEL -fopenmp -march=native -O3 .... -lgomp

YMMV - I wanted to just throw this out there, because e.g. the parallel for_each seems to be close to what you want (but, automagic and self-adapting to container size, iterator type and number of processors)


In addition to @jalf's correct observation that actually copying the subvectors into fresh vectors might be a waste of time and space, let me point at valarray.

Intro: valarray

Valarray may be more complicated, but especially in the face of parallel processing, might lead to better ways to subvector work for the different threads. Things to look for:

  1. algorithmic pre-science (if locations in a certain pattern have a certain property (e.g. are known to be zero), you can hand it to an optimized worker for those values)
  2. subvector alignment (the aligment can make or break the availability of SIMD, SSE4 optimized versions; have a look at gcc -ftree-vectorizer for more background)

Now valarrays have quite a number of 'obscure' operations and tricks to them (gslices; basically revectored array dimensions to address the original array) that I won't go into here, but suffice it to say, if you want to do number crunching across subsets of contiguous arrays of (mainly) floating points[1], it will pay to read up on those.

Mandatory (braindead) teaser

// mask_array example
#include <iostream>
#include <valarray>
using namespace std;

int main ()
{
  valarray<int> myarray (10);
  for (int i=0; i<10; ++i) myarray[i]=i;   //  0  1  2  3  4  5  6  7  8  9

  valarray<bool> mymask (10);
  for (int i=0; i<10; ++i)
    mymask[i]= ((i%2)==1);                 //  f  t  f  t  f  t  f  t  f  t

  myarray[mymask] *= valarray<int>(10,5);  //  0 10  2 30  4 50  6 70  8 90
  myarray[!mymask] = 0;                    //  0 10  0 30  0 50  0 70  0 90

  cout << "myarray:\n";
  for (size_t i=0; i<myarray.size(); ++i)
      cout << myarray[i] << ' ';
  cout << endl;

  return 0;
}

This was copied verbatim from the above link, you will want to adapt to your specific need. There was probably a good reason why you kept the endgoal a bit vague, so I'll happily leave the rest of the work to you!

Wrapup

If you really want to go all the way, however, you should start looking at the big guns (Blitz++, et al.).

[1] these have historically been the focus for vectorized CPU instruction sets. However, als @jalf notes, SSE2 and higher includes SIMD integer instructions as well

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜