
Is it possible to do a reduction on an array with OpenMP?

Does OpenMP natively support reduction of a variable that represents an array?

This would work something like the following...

float* a = (float*) calloc(4, sizeof(float));  // calloc takes a count and an element size
omp_set_num_threads(13);
#pragma omp parallel reduction(+:a)
for(int i=0;i<4;i++){
   a[i] += 1;  // Thread-local copy of a incremented by something interesting
}
// a now contains [13 13 13 13]

Ideally, there would be something similar for an omp parallel for, and if you have a large enough number of threads for it to make sense, the accumulation would happen via a binary tree.


Array reduction is now possible with OpenMP 4.5 for C and C++. Here's an example:

#include <iostream>

int main()
{

  int myArray[6] = {};

  #pragma omp parallel for reduction(+:myArray[:6])
  for (int i=0; i<50; ++i)
  {
    double a = 2.0; // Or something non-trivial justifying the parallelism...
    for (int n = 0; n<6; ++n)
    {
      myArray[n] += a;
    }
  }
  // Print the array elements to see them summed   
  for (int n = 0; n<6; ++n)
  {
    std::cout << myArray[n] << " " << std::endl;
  } 
}

Outputs:

100
100
100
100
100
100

I compiled this with GCC 6.2. You can see which common compiler versions support the OpenMP 4.5 features here: https://www.openmp.org/resources/openmp-compilers-tools/

Note, as pointed out in the comments, that while this syntax is convenient, it may incur significant overhead, because a private copy of the array section is created for each thread.
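For a heap-allocated array like the one in the question, the same array-section syntax can also be applied to a pointer. The following is a minimal sketch, assuming a compiler with OpenMP 4.5 support; the function name accumulate_ones and its parameters are made up for illustration and are not part of any answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: adds 1.0f to every element once per loop iteration.
std::vector<float> accumulate_ones(std::size_t size, int iterations)
{
    assert(iterations >= 0);
    std::vector<float> result(size, 0.0f);
    float* a = result.data();  // raw pointer, as in the question

    // Each thread gets a zero-initialized private copy of a[0..size-1];
    // the copies are summed into the original when the loop finishes.
    #pragma omp parallel for reduction(+:a[:size])
    for (int i = 0; i < iterations; ++i) {
        for (std::size_t n = 0; n < size; ++n) {
            a[n] += 1.0f;
        }
    }
    return result;
}
```

Calling accumulate_ones(4, 13) should give four elements equal to 13 regardless of the number of threads, because the sum is determined by the loop iterations rather than by how many threads run them.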


Only in Fortran in OpenMP 3.0, and probably only with certain compilers.

See the last example (Example 3) on:

http://wikis.sun.com/display/openmp/Fortran+Allocatable+Arrays


The latest OpenMP 4.5 spec now supports reduction of C/C++ arrays. http://openmp.org/wp/2015/11/openmp-45-specs-released/

The latest GCC 6.1 also supports this feature. http://openmp.org/wp/2016/05/gcc-61-released-supports-openmp-45/

I haven't tried it yet, though. I hope others can test this feature.


In C and C++, OpenMP versions before 4.5 cannot perform reductions on array or structure type variables with the built-in reduction clause (see restrictions).

You might also want to read up on the private and shared clauses. private declares a variable to be private to each thread, whereas shared declares a variable to be shared among all threads. I also found the answer to this question very useful with regard to OpenMP and arrays.
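Where the reduction clause is not available for arrays, the private/shared distinction can be used to write the reduction by hand. What follows is only a sketch of that common workaround, not code from any answer here; the names manual_array_sum and kBins are hypothetical, and the increment of 2 simply mirrors the earlier example.

```cpp
#include <array>
#include <cassert>

constexpr int kBins = 6;  // hypothetical array length, chosen for illustration

// Hypothetical helper: adds 2 to every element once per loop iteration.
std::array<int, kBins> manual_array_sum(int iterations)
{
    assert(iterations >= 0);
    std::array<int, kBins> total{};  // declared outside the region: shared

    #pragma omp parallel shared(total)
    {
        std::array<int, kBins> local{};  // declared inside the region: private per thread

        // Split the iterations among threads; each thread updates only its own copy.
        #pragma omp for nowait
        for (int i = 0; i < iterations; ++i)
            for (int n = 0; n < kBins; ++n)
                local[n] += 2;

        // Merge the private copies into the shared array, one thread at a time.
        #pragma omp critical
        for (int n = 0; n < kBins; ++n)
            total[n] += local[n];
    }
    return total;
}
```

With iterations set to 50, every element of the result should be 100. The critical section serializes the merge, so this approach suits small arrays; for large arrays the merge itself may become the bottleneck.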


OpenMP can perform this operation as of OpenMP 4.5, and GCC 6.3 (and possibly earlier versions) supports it. An example program looks as follows:

#include <vector>
#include <iostream>

int main(){
  std::vector<int> vec;

  #pragma omp declare reduction (merge : std::vector<int> : omp_out.insert(omp_out.end(), omp_in.begin(), omp_in.end()))

  #pragma omp parallel for default(none) schedule(static) reduction(merge: vec)
  for(int i=0;i<100;i++)
    vec.push_back(i);

  for(const auto x: vec)
    std::cout<<x<<"\n";

  return 0;
}

Note that omp_out and omp_in are special variables and that the type of the declare reduction must match the vector you are planning to reduce on.
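The example above concatenates the per-thread vectors. For the element-wise sum the question asks about, a user-defined reduction can combine the vectors with std::transform instead. This is a sketch under the assumption that the compiler supports declare reduction with an initializer clause; the names vec_plus and elementwise_sum are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Combiner: add omp_in into omp_out element by element.
// Initializer: each private copy starts as a zero-filled vector of the original size.
#pragma omp declare reduction(vec_plus : std::vector<int> : \
    std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), \
                   omp_out.begin(), std::plus<int>())) \
    initializer(omp_priv = std::vector<int>(omp_orig.size()))

// Hypothetical helper: adds 2 to every element once per loop iteration.
std::vector<int> elementwise_sum(std::size_t size, int iterations)
{
    assert(iterations >= 0);
    std::vector<int> sums(size, 0);

    #pragma omp parallel for reduction(vec_plus : sums)
    for (int i = 0; i < iterations; ++i)
        for (std::size_t n = 0; n < size; ++n)
            sums[n] += 2;

    return sums;
}
```

The initializer clause matters here: without it each private vector would be default-constructed and therefore empty, so indexing into it inside the loop would be invalid.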
