开发者

Efficient algorithm to copy tiles in a big matrix

I have N square matrices all of the same size MxM that have to be copied in a matrix that contains NxN matrices, arranged in a symmetrical way. The two halves, upper and lower, contain transposed version of the same matrices like in this scheme.

N = 4

m1 m2 m3 m4
m2'm1 m2 m3
m3'm2'm1 m2
m4'm3'm2'm1

The algorithm that produces data initially fills just the upper row and the first column, leaving the rest empty.

m1 m2 m3 m4
m2'0  0  0 
m3'0  0  0 
m4'0  0  0 

I would like to find an efficient indexing scheme to fill all the big matrix starting from the elements of the line that has been already filled. Remember that m1...mn are square matrices of size MxM, and the matrix is arranged in column-major order. The matrix is not so big so no need to exploit much locality and cache-related things.

The trivial algorithm is like below, where X is the matrix.

  int toX = 0, fromX = 0, toY = 0, fromY = 0; 
  for (int i开发者_如何学C = 1; i < N; ++i) {
    for (int j = 1; j < N; ++j) {
      for (int ii = 0; ii < M; ++ii) {
        for (int jj = 0; jj < M; ++jj) {
          fromX = (i - 1) * dim + ii;
          fromY = (j - 1) * dim + jj;
          toX = i * dim + ii;
          toY = j * dim + jj;
          X(toX, toY) = X(fromX, fromY);
        } 
      }
    }
  }

Can you find a better way?


Depending on your application, it might be unnecessary to store all those transposed matrices. If m1 is symmetric, you could even cull the lower half of the m1 matrices.

In fact, it might even be feasible to leave all those matrices alone and do your matrix-operations block-wise (addition and multiplication with a scalar are simple, multiplication with a vector would be a bit more complicated)

If you really need the whole matrix, you might get a slightly lower operation count by filling the matrix diagonally, i.e. by doing something like this:

int toX = 0, fromX = 0, toY = 0, fromY = 0;

// m1 (note that this part can be sped up further if m1 is symmetric)

for (int ii = 0; ii<M; ii++){        
    for (int jj = 0; jj<M; jj++){
        fromX = ii;
        fromY = jj;
        toX = fromX;
        toY = fromY;
        for (int k=1; k<N; k++){            
            toX += dim;
            toY += dim;
            X(toX, toY) = X(fromX, fromY);
        }            
    }        
}    


// m2 to m(N-1)

for (int i = 2; i < N; i++){    
    for (int ii = 0; ii<M; ii++){        
        for (int jj = 0; jj<M; jj++){
            fromX = i*dim+ii;
            fromY = jj;
            toX = fromX;
            toY = fromY;
            for (int k=i; k<N; k++){            
                toX += dim;
                toY += dim;
                X(toX, toY) = X(fromX, fromY);
                X(toY, toX) = X(fromX, fromY);
            }            
        }        
    }    
}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜