Parallel for_each using openmp
Why does this code not parallelize 开发者_StackOverflow中文版std::for_each() when it works perfectly fine with std::sort()?
How do I fix it?
g++ -fopenmp -D_GLIBCXX_PARALLEL=1 -o p p.cc && time ./p sort
GCC 4.3 on Linux.
#include <cstdio>
#include <algorithm>
#include <vector>
#include <cstring>
void delay()
{
for(int c = 0; c < 1000000; c++) {
}
}
int lt(int a, int b)
{
delay();
return a < b;
}
void noop(int a)
{
delay();
}
int main(int argc, char **argv)
{
if (argc < 2) {
printf("%s <sort | for_each>\n", argv[0]);
return 1;
}
std::vector<int> foo(10000);
if (!strcmp(argv[1], "sort")) {
std::sort(foo.begin(), foo.end(), lt);
} else if (!strcmp(argv[1], "for_each")) {
std::for_each(foo.begin(), foo.end(), noop);
}
}
Just compiling with -D_GLIBCXX_PARALLEL
does not necessarily parallelize all algorithms (see here):
Please note that this doesn't necessarily mean that everything will end up being executed in a parallel manner, but rather that the heuristics and settings coded into the parallel versions will be used to determine if all, some, or no algorithms will be executed using parallel variants.
However the Configuration and tuning Chapter might help you to force parallelization.
Just a note to your "Benchmark":
std::sort
and std::for_each
won't necessarily call delay()
the same number of times. std::for_each
calls the delay method for N
times, std::sort
calls it for something between N log(N)
and N^2
times (see reference). Thus measuring the execution time gives you only a weak indication about parallelization.
精彩评论