
Why does the compiler ignore OpenMP pragmas?

In the following C code I am using OpenMP in a nested loop. Since a race condition occurs, I want to perform the accumulation at the end atomically:

double mysumallatomic() {

  double S2 = 0.;
  #pragma omp parallel for shared(S2)
  for(int a=0; a<128; a++){
    for(int b=0; b<128;b++){
      double myterm = (double)a*b;
      #pragma omp atomic
      S2 += myterm;
    }
  }
  return S2;
}

The thing is that #pragma omp atomic has no effect on the program's behaviour: even if I remove it, nothing changes. Even if I change it to #pragma oh_my_god, I get no error!

I wonder what is going wrong here, whether I can tell the compiler to be stricter when checking omp pragmas, and why I do not get an error when I make the last change.

PS: For compilation I use:

gcc-4.2 -fopenmp main.c functions.c -o main_elec_gcc.exe

PS2: New code that gives me the same problem, based on gillespie's idea:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <omp.h>
#include <math.h>

#define NRACK 64
#define NSTARS 1024

double mysumallatomic_serial(float rocks[NRACK][3], float moon[NSTARS][3],
                             float qr[NRACK],float ql[NSTARS]) {
  int j,i;
  float temp_div=0.,temp_sqrt=0.;
  float difx,dify,difz;
  float mod2x, mod2y, mod2z;
  double S2 = 0.;

  for(j=0; j<NRACK; j++){
    for(i=0; i<NSTARS;i++){     
      difx=rocks[j][0]-moon[i][0];
      dify=rocks[j][1]-moon[i][1];
      difz=rocks[j][2]-moon[i][2];
      mod2x=difx*difx;
      mod2y=dify*dify;
      mod2z=difz*difz;
      temp_sqrt=sqrt(mod2x+mod2y+mod2z);
      temp_div=1/temp_sqrt;
      S2 += ql[i]*temp_div*qr[j];       
    }
  }
  return S2;
}

double mysumallatomic(float rocks[NRACK][3], float moon[NSTARS][3], 
                      float qr[NRACK],float ql[NSTARS]) {
  float temp_div=0.,temp_sqrt=0.;
  float difx,dify,difz;
  float mod2x, mod2y, mod2z;
  double S2 = 0.;

  #pragma omp parallel for shared(S2)
  for(int j=0; j<NRACK; j++){
    for(int i=0; i<NSTARS;i++){
      difx=rocks[j][0]-moon[i][0];
      dify=rocks[j][1]-moon[i][1];
      difz=rocks[j][2]-moon[i][2];
      mod2x=difx*difx;
      mod2y=dify*dify;
      mod2z=difz*difz;
      temp_sqrt=sqrt(mod2x+mod2y+mod2z);
      temp_div=1/temp_sqrt;
      float myterm=ql[i]*temp_div*qr[j];    
      #pragma omp atomic
      S2 += myterm;
    }
  }
  return S2;
}
int main(int argc, char *argv[]) {
  float rocks[NRACK][3], moon[NSTARS][3];
  float qr[NRACK], ql[NSTARS];
  int i,j;

  for(j=0;j<NRACK;j++){
    rocks[j][0]=j;
    rocks[j][1]=j+1;
    rocks[j][2]=j+2;
    qr[j] = j*1e-4+1e-3;
    //qr[j] = 1;
  }

  for(i=0;i<NSTARS;i++){
    moon[i][0]=12000+i;
    moon[i][1]=12000+i+1;
    moon[i][2]=12000+i+2;
    ql[i] = i*1e-3 +1e-2 ;
    //ql[i] = 1 ;
  }
  printf(" serial: %f\n", mysumallatomic_serial(rocks,moon,qr,ql));
  printf(" openmp: %f\n", mysumallatomic(rocks,moon,qr,ql));
  return(0);
}


  1. Using the flag -Wall highlights pragma errors. For example, when I misspell atomic I get the following warning.

    main.c:15: warning: ignoring #pragma omp atomic1

  2. I'm sure you know, but just in case, your example should be handled with a reduction (see the sketch after this list).

  3. When you use omp parallel, the default is for all variables to be shared. This is not what you want in your case. For example, each thread computes a different value of difx, so each thread needs its own private copy. Instead, your loop should be:

    #pragma omp parallel for default(none),\
    private(difx, dify, difz, mod2x, mod2y, mod2z, temp_sqrt, temp_div, i, j),\
    shared(rocks, moon, ql, qr), reduction(+:S2)
    for(j=0; j<NRACK; j++){
      for(i=0; i<NSTARS;i++){
        difx=rocks[j][0]-moon[i][0];
        dify=rocks[j][1]-moon[i][1];
        difz=rocks[j][2]-moon[i][2];
        mod2x=difx*difx;
        mod2y=dify*dify;
        mod2z=difz*difz;
        temp_sqrt=sqrt(mod2x+mod2y+mod2z);
        temp_div=1/temp_sqrt;
        S2 += ql[i]*temp_div*qr[j];  
      }
    }
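
Applied to the original a*b example from the question, a reduction would look roughly like this (a sketch only, with a made-up function name, not compiled against the poster's project):

    double mysum_reduction(void) {
      double S2 = 0.;
      /* each thread accumulates its own private copy of S2;
         the copies are summed once at the end of the loop */
      #pragma omp parallel for reduction(+:S2)
      for(int a=0; a<128; a++){
        for(int b=0; b<128; b++){
          S2 += (double)a*b;
        }
      }
      return S2;
    }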
    


I know this is an old post, but I think the problem is the order of the gcc arguments: -fopenmp should go at the end of the compilation line.
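
For example, taking the compile line from the question and just moving the flag to the end (untested, same files and output name as in the question):

    gcc-4.2 main.c functions.c -o main_elec_gcc.exe -fopenmp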


First, depending on the implementation, reduction might be better than using atomic. I would try both and time them to see for sure.
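
A minimal way to time the two, assuming the reduction loop from the answer above is wrapped in a function called mysumallreduction with the same signature as mysumallatomic (that name is made up here), is to call omp_get_wtime() in main() after the arrays are filled:

    double t0 = omp_get_wtime();
    double s_atomic    = mysumallatomic(rocks, moon, qr, ql);
    double t1 = omp_get_wtime();
    double s_reduction = mysumallreduction(rocks, moon, qr, ql);
    double t2 = omp_get_wtime();
    printf(" atomic:    %f  (%.6f s)\n", s_atomic,    t1 - t0);
    printf(" reduction: %f  (%.6f s)\n", s_reduction, t2 - t1);

With arrays this small the difference may be lost in noise, so you may want to repeat each call in a loop before drawing conclusions.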

Second, if you leave off the atomic, you may or may not see the problem (wrong result) associated with the race. It is all about timing, which from one run to the next can be quite different. I have seen cases where the result was wrong only once in 150,000 runs and others where it has been wrong all the time.

Third, the idea behind pragmas is that the user does not need to be told about them when they have no effect. Besides that, the philosophy in Unix (and its derivatives) is to stay quiet unless there is a problem. That said, many implementations have some sort of flag so the user can get more information about what is being ignored. You can try -Wall with gcc; at the very least it should flag the oh_my_god pragma as being ignored.
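
With the command line from the question, that would be, for example:

    gcc-4.2 -Wall -fopenmp main.c functions.c -o main_elec_gcc.exe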


You have

#pragma omp parallel for shared(S2)
  for(int a=0; a<128; a++){
   ....

So the only thing parallelized is the outer for loop.

If you want to use the atomic or a reduction, you have to write:

#pragma omp parallel shared(S2)
{
 #pragma omp for
   for(int a=0; a<128; a++){
     for(int b=0; b<128;b++){
       double myterm = (double)a*b;
       #pragma omp atomic
        S2 += myterm;
     } // end of second for
   } // end of 1st for
} // end of parallel code
return S2;
} // end of function

Otherwise, everything after the # is treated as a comment and ignored.
