Doing parallel processing in bash?
I have thousands of PNG files that I'd like to make smaller with pngcrush. I have a simple find .. -exec job, but it's sequential. My machine has plenty of resources, and I'd like to run this in parallel.
The operation to be performed on every png is:
pngcrush input output && mv output input
Ideally I can specify the maximum number of parallel operations.
Is there a way to do this with bash and/or other shell helpers? I'm on Ubuntu or Debian.
You can use xargs to run multiple processes in parallel:
find /path -type f -name '*.png' -print0 | xargs -0 -n 1 -P <nr_procs> sh -c 'pngcrush "$1" "temp.$$" && mv "temp.$$" "$1"' sh
xargs will read the list of files produced by find (separated by NUL characters, hence -0) and run the provided command (sh -c '...' sh) with one parameter at a time (-n 1). The trailing sh fills $0, so the file name arrives as $1. xargs will run up to <nr_procs> processes in parallel (-P <nr_procs>).
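As a slightly more defensive variant of the same idea, here is a minimal sketch wrapped in a function. The name crush_pngs, the default of 4 jobs, and the .tmp.$$ suffix are my own assumptions, not anything pngcrush or xargs prescribes. It writes the temporary file next to the input (so mv stays on the same filesystem) and removes it if pngcrush fails:

```shell
#!/bin/sh
# Sketch only: crush_pngs is a hypothetical helper, not part of any package.
# Usage: crush_pngs DIR [MAX_JOBS]
crush_pngs() {
    dir=$1
    njobs=${2:-4}   # assumed default: at most 4 parallel jobs
    find "$dir" -type f -name '*.png' -print0 |
        xargs -0 -n 1 -P "$njobs" sh -c '
            tmp="$1.tmp.$$"     # $$ is the PID of this sh, so unique per running worker
            if pngcrush "$1" "$tmp" >/dev/null 2>&1; then
                mv "$tmp" "$1"  # replace the original only on success
            else
                rm -f "$tmp"    # leave the original untouched on failure
                exit 1
            fi
        ' sh
}
```

Calling crush_pngs /path 8 would then process the tree with at most eight pngcrush processes at a time. Note that -0 and -P are extensions found in GNU and BSD xargs, which is what Ubuntu and Debian ship.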
You can use custom find/xargs solutions (see Bart Sas' answer), but when things become more complex you have at least two powerful options:
- parallel (from package moreutils)
- GNU parallel
With GNU Parallel (http://www.gnu.org/software/parallel/) it can be done like this:
find /path -print0 | parallel -0 pngcrush {} {.}.temp '&&' mv {.}.temp {}
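Since the question asks for a cap on the number of simultaneous operations, here is a hedged sketch that adds GNU Parallel's -j option and restricts find to PNG files. The function name crush_pngs_parallel, the default of 4 jobs, and the .tmp suffix are assumptions made for illustration, and it assumes the installed parallel is GNU Parallel rather than the moreutils one:

```shell
#!/bin/sh
# Sketch only: crush_pngs_parallel is a hypothetical helper name.
# Usage: crush_pngs_parallel DIR [MAX_JOBS]
crush_pngs_parallel() {
    dir=$1
    njobs=${2:-4}   # assumed default: at most 4 parallel jobs
    find "$dir" -type f -name '*.png' -print0 |
        # -0 reads NUL-separated names, -j caps the number of concurrent jobs;
        # GNU Parallel shell-quotes each {} substitution for us.
        parallel -0 -j "$njobs" 'pngcrush {} {}.tmp >/dev/null 2>&1 && mv {}.tmp {}'
}
```

Without -j, GNU Parallel defaults to one job per CPU core, which is often already a reasonable choice.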
Learn more:
- Watch the intro video for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
- Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.