Efficient way to execute the sequential part(large no of operations + writing file) of a parallel code?

2023-03-23 19:57 问答作者：

I have a C++ code using mpi and is executed in a sequential-parallel-sequential pattern. The above pattern is repeated in a time loop. While validating the code with the serial code, I could get a reduction in time for the parallel part and in fact the reduction is almost linear with the no of processors.

The problem that I am facing is that the time required for the sequential part also increases considerably when using higher no of processors.

The parallel part takes less time to be executed in comparison with total sequential time of the entire program.

Therefore although there is a reduction in time for the parallel part when using higher no of processors, the saving in time is lost considerably due to increase in time while executing the sequential part. Also the sequential part includes a large no of computations at each time step and writing the data to an output file at some specified t开发者_JAVA技巧ime.

All the processors are made to run during the execution of sequential part and the data is gathered to the root processor after the parallel computation and only the root processor is allowed to write the file.

Therefore can anyone suggest what is the efficient way to calculate the serial part (large no of operations + write the file) of the parallel code ? I would also like to clarify on any of the point if required.

Thanks in advance.

First of all, do file writing from separate thread (or process in MPI terms), so other threads can use your cores for computations.

Then, check why your parallel version is much slower than sequential. Often this means you creates too small tasks so communication between threads (synchronization) eats your performance. Think if tasks can be combined into chunks and complete chunks processed in parallel.

And, of course, use any profiler that is good for multithreading environment.

[EDIT]

sequential part = part of your logic that cannot be (and is not) paralleled, do you mean the same? sequential part on multicore can work a bit slower, probably because of OS dispatcher or something like this. It's weird that you see noticable difference.

Disk is sequential by its nature, so writing to disk from many threads don't give any benefits, but can lead to the situation when many threads try to do this simultaneously and waits for each other instead of doing something useful.

BTW, what MPI implementation do you use?

Your problem description is too high-level, provide some pseudo-code or something like this, this can help us to help you.

继续阅读：mpi parallel-processing

Efficient way to execute the sequential part(large no of operations + writing file) of a parallel code?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？