Most appropriate MPI_Datatype for "block decomposition"?

With the help from Jonathan Dursi and osgx, I've now done the "row decomposition" among the processes:

row http://img535.imageshack.us/img535/9118/ghostcells.jpg


Now, I'd like to try the "block decomposition" approach (pictured below): block http://img836.imageshack.us/img836/9682/ghostcellsblock.jpg

How should one go about it? This time, the MPI_Datatype will be necessary, right? Which datatype would be most appropriate/easy to use? Or can it plausibly be done without a datatype?


You can always make do without a datatype by creating a buffer, copying the data into it, and sending the buffer as a count of the underlying type; that's conceptually the simplest. On the other hand, it's slower and it actually involves a lot more lines of code. Still, it can be handy when you're trying to get something to work, and then you can implement the datatype-y version alongside it and make sure you're getting the same answers.
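To make the buffer-copy approach concrete, here is a minimal sketch in plain C. It assumes a row-major array of doubles stored in one contiguous allocation; the function names `pack_column` and `unpack_column` are made up for illustration and are not part of MPI.

```c
#include <stddef.h>

/* Copy column j of a row-major nrows x ncols array into a contiguous
   buffer. buf must have room for nrows doubles. */
void pack_column(const double *a, size_t nrows, size_t ncols,
                 size_t j, double *buf)
{
    for (size_t i = 0; i < nrows; i++)
        buf[i] = a[i * ncols + j];
}

/* Inverse operation: scatter a contiguous buffer of nrows doubles
   into column j, e.g. the ghost column on the receiving side. */
void unpack_column(double *a, size_t nrows, size_t ncols,
                   size_t j, const double *buf)
{
    for (size_t i = 0; i < nrows; i++)
        a[i * ncols + j] = buf[i];
}
```

The packed buffer would then be sent as nrows values of MPI_DOUBLE with an ordinary send, and the neighbour would receive into a buffer of the same length and unpack it into its ghost column.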

For the ghost-cell filling in the i direction you don't need a type, as it's similar to what you had been doing: the values you send are contiguous in memory. You can still use one, MPI_Type_contiguous, which just specifies a count of some type (something you can do anyway in your send/recv).

For ghost-cell filling in the j direction, probably the easiest is to use MPI_Type_vector. If you're sending the rightmost column of (say) an array with i=0..N-1, j=0..M-1, you want to send a vector with count=N, blocklength=1, stride=M. That is, you're sending count chunks of 1 value, with the starts of successive chunks M values apart in the array.
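As a rough illustration of which elements such a vector layout picks out, the plain-C sketch below lists the element offsets for a given count, blocklength and stride. The helper `vector_offsets` is hypothetical and only mimics the layout; in real code the type would be built with MPI_Type_vector(count, blocklength, stride, oldtype, &newtype) and committed with MPI_Type_commit.

```c
#include <stddef.h>

/* List the element offsets (relative to the start address handed to
   the send) covered by a vector layout: `count` blocks, each of
   `blocklength` elements, with block starts `stride` elements apart.
   offsets[] must hold count * blocklength entries.
   Returns the number of offsets written. */
size_t vector_offsets(size_t count, size_t blocklength, size_t stride,
                      size_t *offsets)
{
    size_t n = 0;
    for (size_t b = 0; b < count; b++)
        for (size_t k = 0; k < blocklength; k++)
            offsets[n++] = b * stride + k;
    return n;
}
```

For N=3 and M=4, count=3, blocklength=1, stride=4 gives offsets 0, 4, 8. Starting the send at the address of element (0, M-1) shifts these to flat indices 3, 7, 11, which is the rightmost column of a row-major 3 x 4 array.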

You can also use MPI_Type_create_subarray to pull out just the region of the array you want; that's probably a little overkill in this case.
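For comparison, a subarray is described by three arrays: the full sizes, the sub-block sizes, and the starting indices, which are the same pieces of information MPI_Type_create_subarray takes (along with the number of dimensions, an ordering such as MPI_ORDER_C, and the element type). The plain-C sketch below, with a made-up helper `copy_subarray`, only shows which elements a 2D row-major subarray selects; it uses size_t where MPI uses int arrays.

```c
#include <stddef.h>

/* Copy the subsizes[0] x subsizes[1] block whose top-left corner is at
   (starts[0], starts[1]) out of a row-major sizes[0] x sizes[1] array
   into the contiguous buffer dst.
   dst must hold subsizes[0] * subsizes[1] doubles. */
void copy_subarray(const double *src, const size_t sizes[2],
                   const size_t subsizes[2], const size_t starts[2],
                   double *dst)
{
    size_t n = 0;
    for (size_t i = 0; i < subsizes[0]; i++)
        for (size_t j = 0; j < subsizes[1]; j++)
            dst[n++] = src[(starts[0] + i) * sizes[1] + (starts[1] + j)];
}
```

With sizes {4, 5}, subsizes {2, 3} and starts {1, 1}, this selects the interior of a 4 x 5 array that has a one-cell ghost layer on every side.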

Now, if, as in your previous question, you want to be able at some point to gather all the sub-arrays onto one processor, you'll probably be using subarrays, and part of the question is answered here: MPI_Type_create_subarray and MPI_Gather. Note that if your array chunks are of different sizes, though, then things start getting a little trickier.

(Actually, why are you doing the gather onto one processor, anyway? That'll eventually be a scalability bottleneck. If you're doing it for I/O, once you're comfortable with data types, you can use MPI-IO for this.)
