How to limit number of threads/sub-processes used in a function in bash
My question is: how can I change this code so that it uses only 4 threads/sub-processes?
TESTS="a b c d e"
for f in $TESTS; do
t=$[ ( $RANDOM % 5 ) + 1 ]
sleep $t && echo $f $t &
done
wait
Interesting question. I tried to use xargs for this and I found a way.
Try this:
seq 10 | xargs -I{} --max-procs=4 bash -c "echo start {}; sleep 3; echo done {}"
--max-procs=4
will ensure that no more than four subprocesses are running at a time.
The output will look like this:
start 2
start 3
start 1
start 4
done 2
done 3
done 1
done 4
start 6
start 5
start 7
start 8
done 6
done 5
start 9
done 8
done 7
start 10
done 9
done 10
Note that the jobs may not execute in the order you submit them. As you can see, 2 started before 1.
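Applied to the loop from the original question, this could look like the following sketch. It uses -P (the short form of --max-procs) and passes each item as a positional argument instead of splicing {} into the command string, which avoids quoting problems:

```shell
# Run at most 4 of the sleep-and-echo jobs at a time with xargs.
TESTS="a b c d e"
printf '%s\n' $TESTS | xargs -P 4 -I{} bash -c '
  t=$(( RANDOM % 5 + 1 ))     # random 1..5 second sleep
  sleep "$t" && echo "$1 $t"
' _ {}
```

Here `_` fills bash's $0 slot, so the substituted item arrives safely as $1.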
Quick and dirty solution: insert this line somewhere inside your for
loop:
while [ $(jobs | wc -l) -ge 4 ] ; do sleep 1 ; done
(assumes you don't already have other background jobs running in the same shell)
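Inserted into the question's loop, the complete sketch looks like this (using jobs -r so that only running jobs are counted, since finished jobs can otherwise inflate the count):

```shell
TESTS="a b c d e"
for f in $TESTS; do
  # Block while 4 or more background jobs are still running.
  while [ "$(jobs -r | wc -l)" -ge 4 ]; do sleep 1; done
  t=$(( RANDOM % 5 + 1 ))
  sleep "$t" && echo "$f $t" &
done
wait   # wait for the remaining jobs to finish
```

The one-second polling interval is arbitrary; it trades responsiveness for CPU usage.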
I have found another solution for this question using parallel (part of the moreutils package).
parallel -j 4 -i bash -c "echo start {}; sleep 2; echo done {};" -- $(seq 10)
-j 4
stands for -j maxjobs
-i
substitutes each argument into the command wherever {} appears
--
delimits your arguments
The output of this command will be:
start 3
start 4
start 1
start 2
done 4
done 2
done 3
done 1
start 5
start 6
start 7
start 8
done 5
done 6
start 9
done 7
start 10
done 8
done 9
done 10
You can do something like this by using the jobs
builtin:
for f in $TESTS; do
running=($(jobs -rp))
while [ ${#running[@]} -ge 4 ] ; do
sleep 1 # this is not optimal, but you can't use wait here
running=($(jobs -rp))
done
t=$[ ( $RANDOM % 5 ) + 1 ]
sleep $t && echo $f $t &
done
wait
GNU Parallel is designed for this kind of task:
TESTS="a b c d e"
for f in $TESTS; do
t=$[ ( $RANDOM % 5 ) + 1 ]
sem -j4 "sleep $t && echo $f $t"
done
sem --wait
Watch the intro videos to learn more:
http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
This tested script runs 5 jobs at a time and starts a new job as soon as one finishes (thanks to the kill of the sleep 10.9 when we get a SIGCHLD). A simpler version of this could use direct polling (change the sleep 10.9 to sleep 1 and drop the trap).
#!/usr/bin/bash
set -o monitor
trap "pkill -P $$ -f 'sleep 10\.9' >&/dev/null" SIGCHLD
totaljobs=15
numjobs=5
worktime=10
curjobs=0
declare -A pidlist
dojob()
{
slot=$1
time=$(echo "$RANDOM * 10 / 32768" | bc -l)
echo Starting job $slot with args $time
sleep $time &
pidlist[$slot]=`jobs -p %%`
curjobs=$(($curjobs + 1))
totaljobs=$(($totaljobs - 1))
}
# start
while [ $curjobs -lt $numjobs -a $totaljobs -gt 0 ]
do
dojob $curjobs
done
# Poll for jobs to die, restarting while we have them
while [ $totaljobs -gt 0 ]
do
for ((i=0;$i < $curjobs;i++))
do
if ! kill -0 ${pidlist[$i]} >&/dev/null
then
dojob $i
break
fi
done
sleep 10.9 >&/dev/null
done
wait
This is my "parallel" unzip loop using bash on AIX:
for z in *.zip ; do
7za x "$z" >/dev/null &
while [ $(jobs -p|wc -l) -ge 4 ] ; do
wait -n
done
done
Notes:
- jobs -p (bash builtin) lists jobs of the immediate parent shell
- wait -n (bash builtin, available since bash 4.3) waits for any one background process to finish
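The same wait -n idea maps directly onto the original question; here is a sketch (requires bash 4.3 or newer):

```shell
TESTS="a b c d e"
for f in $TESTS; do
  t=$(( RANDOM % 5 + 1 ))
  sleep "$t" && echo "$f $t" &
  # Once 4 jobs are in flight, wait for any one of them to finish.
  while [ "$(jobs -rp | wc -l)" -ge 4 ]; do
    wait -n
  done
done
wait   # drain the remaining jobs
```

Unlike the sleep-based polling loops above, wait -n returns the moment a slot frees up, with no polling delay.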