
Initializing a C++ vector to random values... fast

Hey, I'd like to make this as fast as possible, because it gets called A LOT in a program I'm writing. Is there any faster way to initialize a C++ vector to random values than:

double range; // set to the range of a particular function I want to evaluate
std::vector<double> x(30, 0.0);
for (std::size_t i = 0; i < x.size(); i++) {
    x.at(i) = (rand() / (double)RAND_MAX) * range;
}

EDIT: Fixed x's initializer.


Right now, this should be really fast, since the loop won't execute at all (before the edit above, x was default-constructed and empty, so x.size() was 0).

Personally, I'd probably use something like this:

struct gen_rand {
    double range;

    gen_rand(double r = 1.0) : range(r) {}
    double operator()() {
        return (rand() / (double)RAND_MAX) * range;
    }
};

std::vector<double> x(num_items);
std::generate_n(x.begin(), num_items, gen_rand());

Edit: It's purely a micro-optimization that might make no difference at all, but you might consider rearranging the computation to get something like:

struct gen_rand {
    double factor;

    gen_rand(double r = 1.0) : factor(r / RAND_MAX) {} // precompute the scale once
    double operator()() {
        return rand() * factor;
    }
};

Of course, there's a really good chance the compiler will already do this (or something equivalent) but it won't hurt to try it anyway (though it's really only likely to help with optimization turned off).

Edit2: "sbi" (as is usually the case) is right: you might gain a bit by initially reserving space, then using an insert iterator to put the data into place:

std::vector<double> x;
x.reserve(num_items);
std::generate_n(std::back_inserter(x), num_items, gen_rand());

As before, we're into such microscopic optimization that I'm not at all sure I'd really expect to see a difference. In particular, since this is all done with templates, there's a pretty good chance most (if not all) of the code will be generated inline. In that case, the optimizer is likely to notice that the initial data all gets overwritten, and skip initializing it.

In the end, however, nearly the only part that's really likely to make a significant difference is getting rid of the .at(i). The others might, but with optimizations turned on, I wouldn't really expect them to.


I have been using Jerry Coffin's functor method for some time, but with the arrival of C++11, we have loads of cool new random number functionality. To fill an array with random float values, we can now do something like the following:

#include <algorithm>
#include <functional>
#include <random>
#include <vector>

const size_t elements = 300;
std::vector<float> y(elements);
std::uniform_real_distribution<float> distribution(0.0f, 2.0f); // values between 0 and 2
std::mt19937 engine; // Mersenne Twister MT19937
auto generator = std::bind(distribution, engine);
std::generate_n(y.begin(), elements, generator);
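Note that a default-constructed engine is seeded with a fixed default, so it produces the same sequence on every run. If you want different values each run, you can seed it, for example from std::random_device (a minimal sketch):

std::random_device seed_source;      // nondeterministic seed source, where the platform provides one
std::mt19937 engine(seed_source());  // seeded Mersenne Twister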

See the relevant section of Wikipedia for more engines and distributions.


Yes: x.at(i) does bounds checking on every access, while x[i] does not. Also, the original code was incorrect because it failed to specify the size of x in advance. You need to use std::vector<double> x(n), where n is the number of elements you want; otherwise, the loop never executes.
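For example, here is the question's loop with a presized vector and unchecked indexing (a minimal sketch; range is assumed to be set elsewhere):

std::vector<double> x(30);  // size specified up front
for (std::size_t i = 0; i < x.size(); ++i) {
    x[i] = (rand() / (double)RAND_MAX) * range;  // operator[]: no bounds check
}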

Alternatively, you may want to make a custom iterator that generates random values and fill the vector from it. Because the std::vector range constructor initializes its elements from the iterators anyway, a custom iterator that produces random values can eliminate a separate pass over the items.

In terms of implementing an iterator of your own, here is my untested code:

#include <cstdlib>   // rand, RAND_MAX
#include <iterator>  // std::input_iterator_tag

class random_iterator
{
    public:
        typedef std::input_iterator_tag iterator_category;
        typedef double value_type;
        typedef int difference_type;
        typedef double* pointer;
        typedef double& reference;

        random_iterator() : _range(1.0), _count(0) {}
        random_iterator(double range, int count) :
                                        _range(range), _count(count) {}
        random_iterator(const random_iterator& o) :
                                        _range(o._range), _count(o._count) {}
        ~random_iterator() {}

        double operator*() const { return (rand() / (double)RAND_MAX) * _range; }
        int operator-(const random_iterator& o) const { return o._count - _count; }
        random_iterator& operator++() { --_count; return *this; }
        random_iterator operator++(int) { random_iterator cpy(*this); --_count; return cpy; }
        bool operator==(const random_iterator& o) const { return _count == o._count; }
        bool operator!=(const random_iterator& o) const { return _count != o._count; }

    private:
        double _range;
        int _count;
};

With the code above, it should be possible to use:

std::vector<double> x((random_iterator(range, number)), random_iterator()); // extra parentheses avoid the "most vexing parse"

That said, the generate_n code in the other solutions is simpler, and frankly, I would just explicitly fill the vector without resorting to anything fancy like this... but it is kind of cool to think about.


#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>  // std::ostream_iterator
#include <cstdlib>   // rand, RAND_MAX

struct functor {
   functor(double v):val(v) {}
   double operator()() const {
      return (rand()/(double)RAND_MAX)*val;
   }
private:
   double val;
};

int main(int argc, const char** argv) {
   const int size = 10;
   const double range = 3.0;

   std::vector<double> dvec;
   std::generate_n(std::back_inserter(dvec), size, functor(range));

   // print all
   std::copy(dvec.begin(), dvec.end(), (std::ostream_iterator<double>(std::cout, "\n")));

   return 0;
}

Too late :(


You may consider using a pseudo-random number generator that produces its output as a sequence. Since most PRNGs just step through a sequence anyway, that will be a lot more efficient than calling rand() over and over again.
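For instance, here is a minimal sketch of the idea using a simple inline xorshift-style generator, so the whole fill loop can be inlined instead of calling out to the C library for every element (the function name, seed, and scaling are illustrative, not from the original post):

#include <cstdint>
#include <vector>

// Marsaglia's xorshift64: a few shifts and XORs per number, trivially inlinable.
inline std::uint64_t xorshift64(std::uint64_t& state) {
    state ^= state << 13;
    state ^= state >> 7;
    state ^= state << 17;
    return state;
}

void fill_random(std::vector<double>& x, double range) {
    std::uint64_t state = 88172645463325252ull;      // any nonzero seed works
    const double scale = range / double(UINT64_MAX); // map to [0, range]
    for (std::size_t i = 0; i < x.size(); ++i)
        x[i] = double(xorshift64(state)) * scale;    // one multiply per element
}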

But then, I think I really need to know more about your situation.

  • Why does this piece of code execute so much? Can you restructure your code to avoid re-generating random data so frequently?
  • How big are your vectors?
  • How "good" does your random number generator need to be? High-quality distributions tend to be more expensive to calculate.
  • If your vectors are large, are you reusing their buffer space, or are you throwing it away and reallocating it elsewhere? Creating new vectors willy-nilly is a great way to destroy your cache (see the sketch after this list).
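To illustrate that last point, a minimal sketch of reusing one buffer across iterations (num_iterations, num_items, and fill_values are hypothetical stand-ins for your own loop and fill code):

std::vector<double> x;           // created once, outside the hot loop
for (int iter = 0; iter < num_iterations; ++iter) {
    x.resize(num_items);         // after the first pass this reuses the same allocation
    fill_values(x);              // overwrite elements in place; the cache stays warm
}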


@Jerry Coffin's answer looks very good. Two other thoughts, though:

  1. Inlining - All of your vector accesses will be very fast, but if the call to rand() is out-of-line, the function call overhead might dominate. If that's the case, you may need to roll your own pseudorandom number generator.

  2. SIMD - If you're going to roll your own PRNG, you might as well make it compute 2 doubles (or 4 floats) at once. This will reduce the number of int-to-float conversions as well as the multiplications. I've never tried it, but apparently there's a SIMD version of the Mersenne Twister that's quite good. A simple linear congruential generator might be good enough too (and that's probably what rand() is using already); a sketch of the two-at-a-time idea follows this list.
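Here is a minimal sketch of that two-at-a-time idea, using two interleaved 64-bit LCG streams (Knuth's MMIX constants; the seeds and function name are illustrative). Processing two elements per iteration halves the loop overhead and gives the compiler an easy auto-vectorization target:

#include <cstdint>
#include <vector>

void fill_random_x2(std::vector<double>& x, double range) {
    const std::uint64_t a = 6364136223846793005ull, c = 1442695040888963407ull;
    std::uint64_t s0 = 0x123456789abcdef0ull, s1 = 0x0fedcba987654321ull;
    const double scale = range / 9007199254740992.0;   // 2^53
    std::size_t i = 0;
    for (; i + 1 < x.size(); i += 2) {
        s0 = a * s0 + c;                               // two independent LCG steps
        s1 = a * s1 + c;
        x[i]     = double(s0 >> 11) * scale;           // top 53 bits -> [0, range)
        x[i + 1] = double(s1 >> 11) * scale;
    }
    if (i < x.size()) {                                // odd-length tail
        s0 = a * s0 + c;
        x[i] = double(s0 >> 11) * scale;
    }
}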


#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <vector>

int main() {
  int size = 10;
  srand(time(NULL));

  std::vector<int> vec(size);
  std::generate(vec.begin(), vec.end(), rand);

  std::vector<int> vec_2(size);
  std::generate(vec_2.begin(), vec_2.end(), []() { return rand() % 50; });
}

The headers you need are <vector>, <algorithm>, <ctime>, and <cstdlib>, as included above.


The way I think about these is a rubber-meets-the-road approach.
In other words, there are certain minimal things that have to happen, no getting around it, such as:

  • the rand() function has to be called N times.

  • the result of rand() has to be converted to double and then multiplied by something.

  • the resulting numbers have to get stored in consecutive elements of an array.

The object is, at a minimum, to get those things done.

Other concerns, like whether or not to use an std::vector and iterators are fine as long as they don't add any extra cycles. The easiest way to see if they add significant extra cycles is to single-step the code at the assembly language level.
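The other easy way to check is to time the alternatives directly. A minimal sketch of a timing harness using std::chrono (fill is a stand-in for whichever variant you're measuring):

#include <chrono>
#include <vector>

template <typename Fill>
double seconds_per_fill(Fill fill, std::vector<double>& x, int reps = 1000) {
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r)
        fill(x);                                  // the code under test
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / reps;                // average seconds per call
}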

