Open MPI to distribute and manipulate a 2D array in PGM files
I need to use Open MPI to distribute a 2D array from a PGM file among 10 working computers. Then I need to manipulate each value of the array to get a negative image (255-i) and print the output back. I'm thinking of using MPI_Scatter and MPI_Gather to distribute the data. The problem now is how to read the 2D array into sub-arrays and send each sub-array to a working computer to do the manipulation. I'm writing this program in C.
Can anyone help me solve this problem or give an idea? Thank you.
Below is an example of the array in the PGM file:
P2
# created by 'xv balloons_bw.tif'
640 480
255
232 227 220 216 212 209 207 206 205 205 205 207 208 209 210 211 212 211 211 213 212 211 210 209 210 210 211 212 211 210 210 210 210 211 210 210 210 210 209 210 209 208 209 208 209 210 209 208 210 209 209 208 208 208 209 208 208 208 207 207 207 206 207 207 207 207 207 207 207 207 207 207 205 204 206 205 205 204 204 204 203 202 203 202 201 201 201 200 199 199 200 199 198 198 198 197 197 198 197 196 195 195 194 193 192 192 191 191 190 190 190 190 189 189 190 188 188 188 187 187 187 186 186 186 186 187 186 186 187 188 188 187 186 186 186 185 186 186 186 187 186 186 186 185 185 187 186 185 186 185 185 186 185 184 185 186 185 186 186 186 185 186 185 185 185 184 183 184 184 183
The simplest way to read a PGM file would be to use libpgm from the netpbm package.
You read in a PGM file using:
#include <pgm.h>   /* from the netpbm package; typically call pgm_init(&argc, argv) once at program start */

gray **image;
FILE *fp;
int cols;      /* num columns */
int rows;      /* num rows */
gray maxval;   /* max grayscale value */

fp = fopen("input.pgm", "r");
image = pgm_readpgm(fp, &cols, &rows, &maxval);
You can now get a negative image by looping across rows/cols:
int i, j;
for (i = 0; i < rows; i++)
    for (j = 0; j < cols; j++)
        image[i][j] = maxval - image[i][j];
The tricky bit would be to distribute the task across your MPI nodes, as image may not be contiguous in memory (I haven't checked). One could dig into the code to determine the storage pattern and scatter/gather the arrays accordingly; however, there is no guarantee that it won't change in the future (unlikely, but possible) and break your code.
A possible but non-optimal way to do this would be to create a temporary buffer which is contiguous in memory, distribute that, and reconstruct the image later on. E.g.
gray *buffer = malloc(sizeof(gray) * rows * cols);
for (i = 0; i < rows; i++)
    for (j = 0; j < cols; j++)
        buffer[(i*cols)+j] = image[i][j];
Now, we're ready to:
- scatter buffer across the nodes
- broadcast maxval to each node (you may need this)
- have each node perform buffer[n] = maxval - buffer[n]; on its chunk
- gather buffer back onto the master
- reconstruct the output image

(A sketch putting these steps together follows after the note on datatypes below.)

You can reconstruct the image by writing the values back into your image data, or simply print out the PGM file manually if you're familiar with the format.
As for datatypes to use for the MPI operations, MPI_UNSIGNED would work since gray is a typedef of unsigned int. However, to be strictly forward compatible you can use MPI_BYTE and multiply your send_count by sizeof(gray).
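For example, here is a minimal sketch of those steps using the MPI_BYTE approach. It assumes buffer, rows, cols, and maxval have been set up on rank 0 as above (on other ranks buffer may be NULL), and that the pixel count divides evenly by the number of ranks; otherwise use MPI_Scatterv/MPI_Gatherv as in the answer further down.

int rank, size, ncells = 0, chunk;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);

if (rank == 0) ncells = rows * cols;
MPI_Bcast(&ncells, 1, MPI_INT, 0, MPI_COMM_WORLD);          /* everyone needs the size    */
MPI_Bcast(&maxval, 1, MPI_UNSIGNED, 0, MPI_COMM_WORLD);     /* ...and the max grey value  */

chunk = ncells / size;                                      /* assumes ncells % size == 0 */
gray *mybuf = malloc(chunk * sizeof(gray));

MPI_Scatter(buffer, chunk * sizeof(gray), MPI_BYTE,         /* distribute the flat buffer */
            mybuf,  chunk * sizeof(gray), MPI_BYTE, 0, MPI_COMM_WORLD);

for (int n = 0; n < chunk; n++)
    mybuf[n] = maxval - mybuf[n];                           /* negate the local chunk     */

MPI_Gather(mybuf,  chunk * sizeof(gray), MPI_BYTE,          /* collect results on rank 0  */
           buffer, chunk * sizeof(gray), MPI_BYTE, 0, MPI_COMM_WORLD);
free(mybuf);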
Not using libpgm
If you want to read the files in manually, it isn't really too hard since your PGM file is in plain format (P2 instead of P5).
Assuming the format is valid, you'll need to (see the sketch after this list):
- Open the file
- Skip the first 2 lines (the P2 magic number and the comment line)
- Read in cols and rows: fscanf(fp, "%d %d", &cols, &rows);
- Read in maxval: fscanf(fp, "%d", &maxval);
- Allocate your buffer according to cols and rows
- Read in the rest of the image by looping across rows/cols and repeating fscanf(fp, "%d", &buffer[r][c]);
I would normally agree with Shawn Chin about using existing libraries to do the file reading; in this case I might disagree because the file format is so simple and it's so important for MPI to know how the data is laid out in memory. A 2d nxm array allocated as a contiguous 1-d array of nxm is very different from rows scattered all over memory! As always, this is C's fault for not having real multi-d arrays. On the other hand, you could check out the libnetpbm libraries and see how it's allocated, or as Shawn suggests, copy the whole thing into contiguous memory after reading it in.
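As an aside, a common idiom for getting a C "2-d array" whose data really is one contiguous n x m block (so that &(greys[0][0]) can be handed to MPI as a flat buffer, which the code further down relies on) is to allocate the data and the row pointers separately. A sketch, with hypothetical helper names:

#include <stdlib.h>

/* Allocate nrows x ncols ints as one contiguous data block plus row pointers. */
int **alloc_2d_int(int nrows, int ncols) {
    int  *data  = malloc(nrows * ncols * sizeof(int));
    int **array = malloc(nrows * sizeof(int *));
    for (int i = 0; i < nrows; i++)
        array[i] = &(data[i * ncols]);
    return array;
}

/* Matching free: the data block (via the first row pointer), then the pointers. */
void free_2d_int(int **array) {
    free(&(array[0][0]));
    free(array);
}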
Note too that this would actually be easier with the (binary) P5 format, as one could use MPI-IO to read in the data in parallel right at the beginning, rather than having one processor do all the reading and using scatter/gather for the data distribution. With ASCII files, you never really know how long a record is going to be, which makes coordinated I/O very difficult.
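To illustrate, here is a hedged sketch of that MPI-IO approach for a P5 file with maxval <= 255 (one byte per pixel). The names filename, header_bytes, mystart, and myncells are placeholders and are assumed to have been worked out already, e.g. by having rank 0 parse the short ASCII header and broadcast the results:

unsigned char *mybuf = malloc(myncells);
MPI_File fh;

MPI_File_open(MPI_COMM_WORLD, filename, MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
MPI_File_read_at_all(fh, (MPI_Offset)(header_bytes + mystart),   /* skip header, seek to my chunk */
                     mybuf, myncells, MPI_UNSIGNED_CHAR, MPI_STATUS_IGNORE);
MPI_File_close(&fh);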
Also note that this really isn't a 2d problem - you are just doing an elementwise operation on every piece of the array. So you can greatly simplify things by just treating the data as a 1d array and ignoring the geometry. This wouldn't be the case if you were (say) applying a 2d filter to the image, as there the geometry matters and you'd have to partition data accordingly; but here we don't care.
Finally, even in this simple case you have to use scatterv and gatherv because the number of cells in the image might not divide evenly by the number of MPI tasks. You could simplify the logic by padding the array so that it does divide evenly; then you could avoid some of the extra steps below (a sketch of the padding idea follows).
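For instance, a small illustrative sketch of that padding alternative (not part of the code below; it reuses ncells, size, rank, IONODE, and data from it):

/* Round the cell count up to a multiple of the task count so plain
   MPI_Scatter/MPI_Gather with equal counts can be used; the padded
   tail is simply ignored when writing the output. */
int padded_ncells = ((ncells + size - 1) / size) * size;
int chunk = padded_ncells / size;
int *padded = calloc(padded_ncells, sizeof(int));
if (rank == IONODE)
    memcpy(padded, data, ncells * sizeof(int));   /* needs <string.h> for memcpy */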
So if you have a read_pgm() and write_pgm() that you know return pointers into a single contiguous block of memory, you can do something like this:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int ierr;
    int rank, size;
    int **greys;
    int rows, cols, maxval;
    int ncells;
    int mystart, myend, myncells;
    const int IONODE=0;
    int *disps, *counts, *mydata;
    int *data;

    ierr = MPI_Init(&argc, &argv);
    if (argc != 3) {
        fprintf(stderr,"Usage: %s infile outfile\n",argv[0]);
        fprintf(stderr," outputs the negative of the input file.\n");
        return -1;
    }

    ierr  = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    ierr |= MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (ierr) {
        fprintf(stderr,"Catastrophic MPI problem; exiting\n");
        MPI_Abort(MPI_COMM_WORLD,1);
    }

    if (rank == IONODE) {
        if (read_pgm(argv[1], &greys, &rows, &cols, &maxval)) {
            fprintf(stderr,"Could not read file; exiting\n");
            MPI_Abort(MPI_COMM_WORLD,2);
        }

        ncells = rows*cols;
        disps  = (int *)malloc(size * sizeof(int));
        counts = (int *)malloc(size * sizeof(int));
        data = &(greys[0][0]); /* we know all the data is contiguous */
    }

    /* everyone calculate their number of cells */
    ierr = MPI_Bcast(&ncells, 1, MPI_INT, IONODE, MPI_COMM_WORLD);
    myncells = ncells/size;
    mystart  = rank*myncells;
    myend    = mystart + myncells - 1;
    if (rank == size-1) myend = ncells-1;
    myncells = (myend-mystart)+1;
    mydata   = (int *)malloc(myncells * sizeof(int));

    /* assemble the list of counts. Might not be equal if don't divide evenly. */
    ierr = MPI_Gather(&myncells, 1, MPI_INT, counts, 1, MPI_INT, IONODE, MPI_COMM_WORLD);
    if (rank == IONODE) {
        disps[0] = 0;
        for (int i=1; i<size; i++) {
            disps[i] = disps[i-1] + counts[i-1];
        }
    }

    /* scatter the data */
    ierr = MPI_Scatterv(data, counts, disps, MPI_INT, mydata, myncells,
                        MPI_INT, IONODE, MPI_COMM_WORLD);

    /* everyone has to know maxval */
    ierr = MPI_Bcast(&maxval, 1, MPI_INT, IONODE, MPI_COMM_WORLD);

    for (int i=0; i<myncells; i++)
        mydata[i] = maxval-mydata[i];

    /* Gather the data */
    ierr = MPI_Gatherv(mydata, myncells, MPI_INT, data, counts, disps,
                       MPI_INT, IONODE, MPI_COMM_WORLD);

    if (rank == IONODE) {
        write_pgm(argv[2], greys, rows, cols, maxval);
    }

    free(mydata);
    if (rank == IONODE) {
        free(counts);
        free(disps);
        free(&(greys[0][0]));
        free(greys);
    }

    MPI_Finalize();
    return 0;
}