OpenMP C parallelisation algorithm

2023-02-13 01:06

In the book "Using OpenMP" there is an example of bad memory access in C, and I think this is the main problem in my attempt to parallelize the Gaussian algorithm.

The example looks something like this:

k= 0 ;    
for( int j=0; j<n ; j++)
  for(int i = 0; i<n; i++)
       a[i][j] = a[i][j] - a[i][k]*a[k][j] ;

I understand why this causes bad memory access: in C a 2D array is stored row by row, so each step of the inner i loop touches a different row, and a new cache line has to be loaded from memory every time.
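
To make the layout concrete, here is a minimal sketch that measures how far apart two elements are in memory. The size N and the helper elem_distance are hypothetical names used only for this illustration; they are not from the book.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 4 };  /* hypothetical matrix size, for illustration only */

/* Distance, in elements, from a[i0][j0] to a[i1][j1] inside one N x N array.
   Byte addresses are used so the subtraction stays within the whole object. */
ptrdiff_t elem_distance(double a[N][N], int i0, int j0, int i1, int j1)
{
    assert(i0 >= 0 && i0 < N && i1 >= 0 && i1 < N);
    assert(j0 >= 0 && j0 < N && j1 >= 0 && j1 < N);
    return ((const char *)&a[i1][j1] - (const char *)&a[i0][j0])
           / (ptrdiff_t)sizeof(double);
}
```

Stepping j by one moves one element forward, while stepping i by one jumps a whole row of N elements, which is the stride the inner loop above pays on every iteration.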

I am trying to find a solution for this, but I'm not getting a good speedup. The effects of my attempts are minor.

Can someone give me a hint what I can do?

The easiest way would be to swap the for loops, but I want to do it columnwise.

The second attempt:

for( int j=0; j<n-1 ; j+=2)
  for(int i = 0; i<n; i++)
  {
     a[i][j] = a[i][j] - a[i][k]*a[k][j] ;
     a[i][j+1] = a[i][j+1] - a[i][k]*a[k][j+1] ;
  }

didn't make a difference at all.

The third attempt:

for( int j=0; j<n ; j++)
{  
  d= a[k][j] ;
  for(int i = 0; i<n; i++)
  {
    e = a[i][k] ;
    a[i][j] = a[i][j] - e*d ;
  }
}

Thx alot

Greets Stepp

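Use a flat array instead, indexed in column-major order so that consecutive i are adjacent in memory (ldA is the leading dimension, i.e. the number of rows allocated), e.g.:

#define A(i,j) A[(i) + (j)*ldA]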

for( int j=0; j<n ; j++)
{  
  d= A(k,j) ;
  ...
}
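
A minimal sketch of how the column-wise loop might look on such a flat array. It makes several assumptions that are not in the answer above: the elements are double, the macro is called IDX and takes the leading dimension as an argument, the function name is hypothetical, and the update is restricted to i > k and j > k so that row k and column k are only read and the iterations over j are independent.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major indexing: element (i,j) lives at offset i + j*lda. */
#define IDX(i, j, lda) ((size_t)(i) + (size_t)(j) * (size_t)(lda))

/* One elimination step on a flat, column-major n x n matrix (hypothetical helper).
   Only entries with i > k and j > k are written. */
void eliminate_step_colmajor(int n, double *A, int lda, int k)
{
    assert(lda >= n && k >= 0 && k < n);

    #pragma omp parallel for
    for (int j = k + 1; j < n; j++) {
        const double d = A[IDX(k, j, lda)];      /* invariant in i */
        for (int i = k + 1; i < n; i++)          /* walks down one column: stride 1 */
            A[IDX(i, j, lda)] -= A[IDX(i, k, lda)] * d;
    }
}
```

With this layout the inner i loop reads and writes contiguous memory, so the column-wise order requested in the question becomes the cache-friendly one. Compiled without OpenMP support, the pragma is ignored and the loop runs serially.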

Your loop order will cause a cache miss on every iteration, as you point out. So just swap the order of the loop statements:

for (int i = 0; i < n; i++)       // now "i" is first
  for (int j = 0; j < n; j++)
       a[i][j] = a[i][j] - a[i][k]*a[k][j];

This will fix the row in a and vary just the columns, which means your memory accesses will be contiguous.
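
A sketch of how the swapped loops could be parallelised with OpenMP. Note the assumptions: the function name is hypothetical, the elements are taken to be double, and the update is limited to i > k and j > k (as in a usual elimination step) so that row k and column k are read-only. Without that restriction, threads would read row k while another thread overwrites it.

```c
#include <assert.h>

/* One elimination step on a row-major n x n matrix (hypothetical helper).
   Each thread gets whole rows, so its writes are contiguous in memory. */
void eliminate_step(int n, double a[n][n], int k)
{
    assert(k >= 0 && k < n);

    #pragma omp parallel for
    for (int i = k + 1; i < n; i++) {
        const double e = a[i][k];        /* invariant in j */
        for (int j = k + 1; j < n; j++)  /* walks along one row: stride 1 */
            a[i][j] -= e * a[k][j];
    }
}
```

Giving each thread entire rows also keeps different threads away from each other's cache lines, except possibly where one row ends and the next begins.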

This memory access problem is related to cache usage, not to OpenMP. To make good use of the cache, you should generally access contiguous memory locations. Remember also that if two or more threads write to different locations that share a cache line, you can get a "false sharing" problem, which forces the cache line to be reloaded unnecessarily. See for example:
http://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads/
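
One common way to sidestep false sharing is to let each thread accumulate into a private variable and touch shared memory only once at the end. The sketch below shows this on a simple array sum; the function name sum_array is hypothetical and the example is not taken from the linked article.

```c
#include <assert.h>

/* Sum of x[0..n-1]. Each thread keeps its own running total in a private
   variable, so threads do not repeatedly write to neighbouring shared slots. */
double sum_array(const double *x, int n)
{
    assert(n >= 0);
    double total = 0.0;

    #pragma omp parallel
    {
        double local = 0.0;          /* private: declared inside the parallel region */

        #pragma omp for
        for (int i = 0; i < n; i++)
            local += x[i];

        #pragma omp atomic
        total += local;              /* one shared write per thread */
    }
    return total;
}
```

The pattern to avoid is the opposite one: a shared array with one slot per thread, where every iteration writes to a slot that sits in the same cache line as another thread's slot.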
