
Locking output file for shell script invoked multiple times in parallel

I have close to a million files over which I want to run a shell script and append the result to a single file.

For example, suppose I just want to run wc on the files. So that it runs fast I can parallelize it with xargs, but I do not want the scripts to step over each other when writing the output. It is probably better to write to a few separate files rather than one and then cat them later, but I still want the number of such temporary output files to be significantly smaller than the number of input files. Is there a way to get the kind of locking I want, or is it always ensured by default?
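Here is roughly what I have in mind, as a sketch (the /data/files path, counts.txt and the batch sizes are placeholders):

    # Each xargs batch writes to its own part file, named after the PID of the
    # sh that runs it, so parallel workers never touch the same output file.
    outdir=$(mktemp -d) || exit 1

    find /data/files -type f -print0 |
        xargs -0 -n 1000 -P 8 sh -c 'wc "$@" >> "$0/part.$$"' "$outdir"

    # With -n 1000, a million inputs leave about a thousand part files,
    # which are then merged and removed in one go.
    cat "$outdir"/part.* > counts.txt
    rm -rf "$outdir"

That is workable, but the bookkeeping of part files is exactly what I would like to avoid.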

Is there any utility that will recursively cat files together, two at a time, in parallel?

I can write a script to do that, but then I have to deal with the temporaries and clean them up, so I was wondering whether there is a utility that already does this.
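For reference, the sort of script I mean would look something like this (only a sketch; merge_tree.sh and the file naming are made up), and it is precisely this temporary-file handling that I would rather not maintain:

    #!/bin/sh
    # Hypothetical merge_tree.sh: repeatedly cat pairs of partial files in
    # parallel until a single file remains, printed to stdout.
    set -eu
    scratch=$(mktemp -d) || exit 1
    trap 'rm -rf "$scratch"' EXIT        # remove temporaries on any exit

    cp -- "$@" "$scratch"/               # work on copies, keep the inputs

    round=0
    while [ "$(ls "$scratch" | wc -l)" -gt 1 ]; do
        round=$((round + 1))
        i=0
        prev=""
        for f in "$scratch"/*; do
            if [ -z "$prev" ]; then
                prev=$f                  # first half of the next pair
            else
                i=$((i + 1))
                # merge the pair in the background, then drop the pieces
                { cat "$prev" "$f" > "$scratch/merged.$round.$i" &&
                  rm -f -- "$prev" "$f"; } &
                prev=""
            fi
        done
        wait                             # finish this round before the next
    done

    cat "$scratch"/*                     # the single remaining file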


GNU parallel claims that it:

makes sure output from the commands is the same output as you would get had you run the commands sequentially

If that's the case, then I presume it should be safe to simply pipe the output to your file and let parallel handle the intermediate data.

Use the -k option to maintain the order of the output.
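For example (untested on my end, just going by the manual; /data/files and counts.txt stand in for your paths):

    # parallel buffers each job's output and releases it only once the job
    # finishes, so appending everything to a single file should be safe.
    # -0 matches find's -print0 and -k keeps the output in input order.
    find /data/files -type f -print0 | parallel -0 -k wc > counts.txt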

Update: (non-Perl solution)

Another alternative would be prll, which is implemented as shell functions with some C extensions. It is less feature-rich than GNU parallel but should do the job for basic use cases.

The feature listing claims:

Does internal buffering and locking to prevent mangling/interleaving of output from separate jobs.

so it should meet your needs as long as the order of the output is not important.

However, note the following statement on that page:

prll generates a lot of status information on STDERR which makes it harder to use the STDERR output of the job directly as input for another program.
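Going by its examples, usage would be roughly the following (again untested; the paths are placeholders), with STDERR redirected so the status chatter stays out of your results:

    # -s runs the given snippet once per argument, which the snippet sees as $1.
    # Note that a glob over a million files may exceed the argument length
    # limit, in which case the input has to be fed in smaller chunks.
    prll -s 'wc "$1"' /data/files/* > counts.txt 2> prll.log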


Disclaimer: I've tried neither of the tools and am merely quoting from their respective docs.

