C++ OpenMP variable visibility in parallel tasks
Don't understand, where do I get this wrong. If compiled without openmp support, the code works correctly. But with openmp variables seem to get wrong visibility.
I had the following intention. Each thread has its own max_private in which it finds the local maximum. Then a global maximum is found in a critical section.
#include <iostream>
#include <vector>
typedef std::vector<long> Vector;
long max(const Vector& a, const Vector& b)
{
long max = 0;
#pragma omp parallel
{
long max_private = 0;
#pragma omp single
{
for ( Vector::const_iterator a_it = a.begin();
a_it != a.end();
++a_it)
{
#pragma omp task
{
for ( Vector::const_iterator b_it = b.begin();
b_it != b.end();
++b_it)
{
if (*a_it + *b_it > max_private) {
max_private = *a_it + *b_it;
}
}
}
}
}
#pragma omp critical
{
std::cout << max_private << std::endl;
if (max_private > max) {
max = max_private;
}
}
}
return max;
}
int main(int argc, char* argv[])
{
Vector a(100000);
Vector b(10000);
f开发者_如何学编程or (long i = 0; i < a.size(); ++i) {
a[i] = i;
}
for (long i = 0; i < b.size(); ++i) {
b[i] = i * i;
}
std::cout << max(a, b) << std::endl;
return 0;
}
I don't want to use parallel for, because latter I'm going to use data structures that don't support random access iterators.
I used g++-4.4 compiler.
Got a detailed answer at the OpenMP forum. http://openmp.org/forum/viewtopic.php?f=3&t=912&start=0
Had to make max_private threadprivate.
I think what you need is some attributes on the variables. By default all variables are shared, and as you want max_private to be per-thread, you can happily add 'private' clause to its declaration (it might go a bit faster then).
For max, you need to put it inside the omp block, and mark it with 'shared' if it still doesn't work properly (though it should be ok as you have it)
So, I think you want #pragma omp parallel private(max_private)
and put both variables outside the main omp block.
What you need is a reduction of openmp to find the maximum. It's availaible in OpenMP 3.0
Do the reduction on the for
long max(const Vector& a, const Vector& b)
{
long max_val = 0; // <- change the name for clarity
#pragma omp parallel reduction(max : max_val)
{
[...]
#pragma omp critical
{
std::cout << max_private << std::endl;
if (max_private > max_val) {
max_val = max_private;
}
}
}
return max_val;
}
max_val is private in each thread and at the end the loop the reduction happens and return the maximum of each open_mp threads.
精彩评论