Multi-threaded BASH programming - generalized method?
Ok, I was running POV-Ray on all the demos, but POV's still single-threaded and wouldn't utilize more than one core. So, I started thinking about a solution in BASH.
I wrote a general function that takes a list of commands and runs them in the designated number of sub-shells. This actually works but I don't like the way it handles accessing the next command in a thread-safe multi-process way:
- It takes, as an argument, a file with commands (1 per line),
- To get the "next" command, each process ("thread") will:
- Waits until it can create a lock file, with: ln $CMDFILE $LOCKFILE
- Read the command from the file,
- Modifies $CMDFILE by removing the first line,
- Removes the $LOCKFILE.
Is there a cleaner way to do this? I couldn't get the sub-shells to read a single line from a FIFO correctly.
Incidentally, the point of this is to enhance what I can do on a BASH command line, and not to find non-bash solutions. I tend to perform a lot of complicated tasks from the command line and want another tool in the toolbox.
Meanwhile, here's the function that handles getting the next line from the file. As you can see, it modifies an on-disk file each time it reads/removes a line. That's what seems hackish, but I'm not coming up with anything better, since FIFO's didn't work w/o setvbuf() in bash.
#
# Get/remove the first line from FILE, using LOCK as a semaphore (with
# short sleep for collisions). Returns the text on standard output,
# returns zero on success, non-zero when fil开发者_StackOverflowe is empty.
#
parallel__nextLine()
{
local line rest file=$1 lock=$2
# Wait for lock...
until ln "${file}" "${lock}" 2>/dev/null
do sleep 1
[ -s "${file}" ] || return $?
done
# Open, read one "line" save "rest" back to the file:
exec 3<"$file"
read line <&3 ; rest=$(cat<&3)
exec 3<&-
# After last line, make sure file is empty:
( [ -z "$rest" ] || echo "$rest" ) > "${file}"
# Remove lock and 'return' the line read:
rm -f "${lock}"
[ -n "$line" ] && echo "$line"
}
#adjust these as required
args_per_proc=1 #1 is fine for long running tasks
procs_in_parallel=4
xargs -n$args_per_proc -P$procs_in_parallel povray < list
Note the nproc
command coming soon to coreutils will auto determine
the number of available processing units which can then be passed to -P
If you need real thread safety, I would recommend to migrate to a better scripting system.
With python, for example, you can create real threads with safe synchronization using semaphores/queues.
sorry to bump this after so long, but I pieced together a fairly good solution for this IMO
It doesnt work perfectly, but it will limit the script to a certain number of child tasks running, and then wait for all the rest at the end.
#!/bin/bash
pids=()
thread() {
local this
while [ ${#} -gt 6 ]; do
this=${1}
wait "$this"
shift
done
pids=($1 $2 $3 $4 $5 $6)
}
for i in 1 2 3 4 5 6 7 8 9 10
do
sleep 5 &
pids=( ${pids[@]-} $(echo $!) )
thread ${pids[@]}
done
for pid in ${pids[@]}
do
wait "$pid"
done
it seems to work great for what I'm doing (handling parallel uploading of a bunch of files at once) and keeps it from breaking my server, while still making sure all the files get uploaded before it finishes the script
I believe you're actually forking processes here, and not threading. I would recommend looking for threading support in a different scripting language like perl
, python
, or ruby
.
精彩评论