Halo exchange not working properly in MPI
I'm writing some code which does calculations on a large 3D grid, and uses a halo exchange procedure so that it can run in parallel with MPI. I'm getting the wrong results from my code, and I'm pretty sure it's because the halo exchange isn't working properly.
Basically I have a large 3D array, a chunk of which is held on each process. Each process has an array which is 2 elements bigger in each dimension than the chunk of data it is holding - so that we can halo exchange into each face of the array without affecting the data stored in the rest of the array. I have the following code to do the halo exchange communication:
MPI_Type_vector(g->ny, g->nx, g->nx, MPI_DOUBLE, &face1);
MPI_Type_commit(&face1);
MPI_Type_vector(2*g->ny, 1, g->nx, MPI_DOUBLE, &face2);
MPI_Type_commit(&face2);
MPI_Type_vector(g->nz, g->nx, g->nx * g->ny, MPI_DOUBLE, &face3);
MPI_Type_commit(&face3);
/* Send to WEST receive from EAST */
MPI_Sendrecv(&(g->data)[current][0][0][0], 1, face1, g->west, tag,
&(g->data)[current][0][0][0], 1, face1, g->east, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
/* Send to EAST receive from WEST */
MPI_Sendrecv(&(g->data)[current][g->nz-1][0][0], 1, face1, g->east, tag,
&(g->data)[current][g->nz-1][0][0], 1, face1, g->west, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
/* Send to NORTH receive from SOUTH */
MPI_Sendrecv(&(g->data)[current][0][0][0], 1, face2, g->north, tag,
&(g->data)[current][0][0][0], 1, face2, g->south, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
/* Send to SOUTH receive from NORTH */
MPI_Sendrecv(&(g->data)[current][0][g->ny-1][0], 1, face2, g->south, tag,
&(g->data)[current][0][0][0], 1, face2, g->north, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
/* Send to UP receive from DOWN */
MPI_Sendrecv(&(g->data)[current][0][0][0], 1, face3, g->up, tag,
&(g->data)[current][0][0][0], 1, face3, g->down, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
/* Send to DOWN receive from UP */
MPI_Sendrecv(&(g->data)[current][0][0][g->nx-1], 1, face3, g->down, tag,
&(g->data)[current][0][0][g->nx-1], 1, face3, g->up, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
g->nx, g->ny and g->nz are the sizes of the array chunk that this process is holding, and g->west, g->east, g->north, g->south, g->up and g->down are the ranks of the adjacent processes in each direction, found using the following code:
/* Who are my neighbours in each direction? */
MPI_Cart_shift( cart_comm, 2, 1, &g->north, &g->south);
MPI_Cart_shift( cart_comm, 1, 1, &g->west, &g->east);
MPI_Cart_shift( cart_comm, 0, 1, &g->up, &g->down);
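(For reference, cart_comm itself is set up along these lines; this is a simplified sketch rather than my exact code, with nprocs being the size of MPI_COMM_WORLD and MPI left to pick the decomposition:)
int dims[3] = {0, 0, 0};
int periods[3] = {0, 0, 0};   /* non-periodic in all three directions */
MPI_Dims_create(nprocs, 3, dims);
MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 0, &cart_comm);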
The array on each process is defined as:
array[2][g->nz][g->ny][g->nx]
(It has two copies because I need one to update into each time through my update routine, once I've done the halo exchange).
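(Schematically, each timestep fills the halos of data[current], updates from data[current] into data[1-current], and then flips current; halo_exchange and update below just stand in for my real routines:)
for (int step = 0; step < nsteps; step++) {
    halo_exchange(g, current);           /* the communication shown above */
    update(g, current, 1 - current);     /* read data[current], write data[1-current] */
    current = 1 - current;               /* the freshly written copy becomes current */
}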
Can anyone tell me if I'm doing the communication correctly? Particularly the defining of the vector types. Will the vector types I've defined in the code extract each face of a 3D array? And do the MPI_Sendrecv calls look right?
I'm completely lost as to why my code isn't working, but I'm pretty sure it's communications related.
So I'm a big fan of using MPI_Type_create_subarray for pulling out slices of arrays; it's easier to keep straight than vector types. In general you can't use a single vector type to describe multi-dimensional guardcells (because there are multiple strides, you'd need to create vectors of vectors), but because you're only using 1 guardcell in each direction here, I think you're ok.
So let's consider the x-face GC; here you're sending an entire y-z plane to your x-neighbour. In memory, this looks like this given your array layout:
+---------+
| @|
| @|
| @|
| @| z=2
| @|
+---------+
| @|
| @|
| @| z=1
| @|
| @|
+---------+
| @|
^| @|
|| @| z=0
y| @|
| @|
+---------+
x->
so you're looking to send count=(ny*nz) blocks of 1 value, each strided by nx. I'm assuming here nx, ny, and nz include guardcells, and that you're sending the corner values. If you're not sending corner values, subarray is the way to go. I'm also assuming, crucially, that g->data is a contiguous block of nx*ny*nz*2 (or 2 contiguous blocks of nx*ny*nz) doubles, otherwise all is lost.
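To spell that last assumption out: the vector types only describe the right memory locations if the doubles really are laid out back to back. One hypothetical way to get data[t][k][j][i] indexing on top of a single contiguous slab looks like this (error checking omitted):
#include <stdlib.h>
/* Allocate t[nt][nz][ny][nx] as pointer arrays over ONE contiguous slab of doubles,
   so the strided MPI types above all index into contiguous memory. */
double ****alloc4d(int nt, int nz, int ny, int nx)
{
    double *slab = malloc((size_t)nt * nz * ny * nx * sizeof(double));
    double ****t = malloc(nt * sizeof(double ***));
    for (int it = 0; it < nt; it++) {
        t[it] = malloc(nz * sizeof(double **));
        for (int k = 0; k < nz; k++) {
            t[it][k] = malloc(ny * sizeof(double *));
            for (int j = 0; j < ny; j++)
                t[it][k][j] = slab + (((size_t)it * nz + k) * ny + j) * nx;
        }
    }
    return t;
}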
So your type create should look like
MPI_Type_vector((g->ny*g->nz), 1, g->nx, MPI_DOUBLE, &face1);
MPI_Type_commit(&face1);
Note that we are sending a total of count*blocksize = ny*nz values, which is right, and we are striding over count*stride = nx*ny*nz memory in the process, which is also right.
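For completeness, the subarray version I mentioned at the top would look something like this for the x-face. The sizes here are the halo-inclusive nx, ny, nz in C (row-major) order, and just as with the vector type you still choose which plane to send by the buffer pointer you pass to MPI_Sendrecv, so starts can stay at zero:
int sizes[3]    = { g->nz, g->ny, g->nx };
int subsizes[3] = { g->nz, g->ny, 1 };      /* a 1-cell-thick y-z plane */
int starts[3]   = { 0, 0, 0 };
MPI_Datatype xface;
MPI_Type_create_subarray(3, sizes, subsizes, starts, MPI_ORDER_C,
                         MPI_DOUBLE, &xface);
MPI_Type_commit(&xface);
(The extents of the two types differ, but with count 1 in the Sendrecv that doesn't matter.)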
Ok, so the y face looks like this:
+---------+
|@@@@@@@@@|
| |
| |
| | z=2
| |
+---------+
|@@@@@@@@@|
| |
| | z=1
| |
| |
+---------+
|@@@@@@@@@|
^| |
|| | z=0
y| |
| |
+---------+
x->
So you have nz blocks of nx values, each separated by stride nx*ny. So your type create should look like
MPI_Type_vector(g->nz, g->nx, (g->nx)*(g->ny), MPI_DOUBLE, &face2);
MPI_Type_commit(&face2);
And again double-checking, you're sending count*blocksize = nz*nx values, striding count*stride = nx*ny*nz memory. Check.
Finally, sending z-face data involves sending an entire x-y plane:
+---------+
|@@@@@@@@@|
|@@@@@@@@@|
|@@@@@@@@@| z=2
|@@@@@@@@@|
|@@@@@@@@@|
+---------+
| |
| |
| | z=1
| |
| |
+---------+
| |
^| |
|| | z=0
y| |
| |
+---------+
x->
MPI_Type_vector(1, (g->nx)*(g->ny), 1, MPI_DOUBLE, &face3);
MPI_Type_commit(&face3);
And again double-checking, you're sending count*blocksize = nx*ny values; since count is 1, that's just one contiguous block of nx*ny doubles. Check.
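Since the count is 1 here and the stride never comes into play, you could equally say the same thing with MPI_Type_contiguous, which I find a little clearer:
MPI_Type_contiguous((g->nx)*(g->ny), MPI_DOUBLE, &face3);
MPI_Type_commit(&face3);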
Update:
I didn't take a look at your Sendrecvs at first, but there might be something there, too. Notice that with a vector data type, you have to pass a pointer to the first piece of data you're sending.
First off, if you have array size nx in the x direction, and you have two guardcells (one on either side), your left guardcell is 0, right is nx-1, and your 'real' data extends from 1..nx-2. So to send your westmost data to your west neighbour, and to receive into your eastmost guardcell from your east neighbour, you would want
/* Send to WEST receive from EAST */
MPI_Sendrecv(&(g->data)[current][0][0][g->nx-2], 1, face1, g->west, westtag,
&(g->data)[current][0][0][0], 1, face1, g->east, westtag,
MPI_COMM_WORLD, MPI_STATUS_IGNORE);
/* Send to EAST receive from WEST */
MPI_Sendrecv(&(g->data)[current][0][0][1], 1, face1, g->east, easttag,
&(g->data)[current][0][0][g->nx-1], 1, face1, g->west, easttag,
MPI_COMM_WORLD, MPI_STATUS_IGNORE);
(I like to use different tags for each stage of communication, helps keep things sorted.)
Likewise for the other directions.
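For instance, the north/south pair would follow exactly the same send-real-data, receive-into-guardcell pattern. This is just a sketch: it assumes 'north' is your neighbour on the high-y side (flip the indices if your cart_comm has it the other way round), face2 is the corrected y-face type from above, and northtag/southtag are whatever distinct tags you pick:
/* Send to NORTH receive from SOUTH */
MPI_Sendrecv(&(g->data)[current][0][g->ny-2][0], 1, face2, g->north, northtag,
             &(g->data)[current][0][0][0],       1, face2, g->south, northtag,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
/* Send to SOUTH receive from NORTH */
MPI_Sendrecv(&(g->data)[current][0][1][0],       1, face2, g->south, southtag,
             &(g->data)[current][0][g->ny-1][0], 1, face2, g->north, southtag,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);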