How can I share a simple scalar (a counter variable) between forks in Perl?
I've been writing a program that forks many times, and each of those forks may itself fork into smaller parts.
Each of the lowest-level children ultimately runs a complex calculation and outputs its results to what I hope is a uniquely named file.
The IDs need to be unique so that when all the children are finished, the parent can reap them and then collect the data.
As an example to help make this more concrete, each of the children will produce a file $unique_id.storable containing the data that the respective child has processed.
When the parent finds that all the children are finished, it uses Storable to read the files back into a hash, using the (hopefully unique) $unique_id as the key.
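For reference, a minimal sketch of that store-and-collect round trip with Storable ($unique_id and %results here are placeholders for whatever the real children compute):

use strict;
use warnings;
use Storable qw(store retrieve);

# Child side: write this child's results under its unique ID.
my $unique_id = 42;                     # placeholder for a real unique ID
my %results   = ( answer => 'data' );   # placeholder for computed data
store(\%results, "$unique_id.storable");

# Parent side: once every child has finished, gather each result
# file into one hash keyed by the ID embedded in the filename.
my %collected;
for my $file (glob '*.storable') {
    (my $id = $file) =~ s/\.storable\z//;
    $collected{$id} = retrieve($file);
}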
The problem presents itself when two children are spawned nearly simultaneously. Right now, each of these children runs its own independent counter, so multiple children may all create a similarly named $unique_id, even though the data in those files are indeed unique.
How can I share a counter variable, a mere scalar, between forks?
I realize that questions of interprocess communication are rather common on the interwebs, but I notice many solutions address the general problem of sharing arbitrary amounts of data between processes. I merely need to share a single scalar, so I wonder if my problem can be handled in a simpler fashion. Ideally, quite ideally really, I would prefer a solution that does not involve a "non-standard" module. I see that IPC::Shareable is sometimes recommended, but I wonder if that may be overkill for my problem, and it is one of those "non-standard" modules anyway.
Would it be wise to make my $unique_id the PID? Is it possible that the parent program, running over the course of, say, one week on a heavily used machine, might see PIDs reused, so that uniqueness isn't guaranteed?
I'd appreciate any advice people can lend.
Why don't you pass the ID down? (See the sketch after this outline.) The root process spawns
1
2
...
These, in turn spawn
1.1
1.2
...
2.1
2.2
...
...
and so on.
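A minimal sketch of that idea, assuming a fixed fan-out of two children per level (spawn_tree and its parameters are illustrative, not from the question):

use strict;
use warnings;

# Fork a subtree of workers, passing a dotted hierarchical ID down.
# Each child's ID extends its parent's, so "1.2" is unique across
# the whole tree without any shared counter.
sub spawn_tree {
    my ($id, $depth) = @_;
    if ($depth == 0) {
        # Lowest-level worker: run the real calculation here and
        # write its output to "$id.storable" (or similar).
        print "worker $id running its calculation\n";
        return;
    }
    my @kids;
    for my $n (1 .. 2) {                 # two children per level, for illustration
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                 # child: recurse with the extended ID
            spawn_tree("$id.$n", $depth - 1);
            exit 0;
        }
        push @kids, $pid;
    }
    waitpid($_, 0) for @kids;            # wait for this level's children
}

# Top level: workers end up with IDs 1.1, 1.2, 2.1, 2.2.
spawn_tree($_, 1) for (1, 2);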
I'd probably use a slightly different approach: you could handle everything via the filename...
As far as unique PIDs go, yes, it's possible that after a week or so your PIDs will recycle, so they're not guaranteed to be unique. However, you could append the date/time to the filename to ensure uniqueness.
To allow the parent to keep track of all the result files it needs to harvest you could simply generate a unique job ID in the parent, then keep this constant down through the tree of children. You could use this job ID as a prefix for the result file, so in the end the parent simply reads all the files with the appropriate prefix.
The filenames will end up looking a little cumbersome, but they're just temp files, right?
The resulting filenames would look something like:
<job_id>_<pid>_<created_time>.storable
Then the parent just looks for all the files matching <job_id>_*.storable.
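A sketch of how those names might be generated, assuming the job ID is simply the parent's PID plus its start time (any scheme unique per run would do):

use strict;
use warnings;

# Generated once in the parent; every child inherits it across fork.
my $job_id = $$ . '_' . time();    # e.g. "12345_1700000000"

# In each lowest-level child: job ID + child PID + creation time.
my $result_file = sprintf '%s_%d_%d.storable', $job_id, $$, time();

# In the parent, once the children are done:
my @result_files = glob "${job_id}_*.storable";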
You could use the PID and be confident that it's unique by having the parent reap each child only after it has processed that child's output.
# WNOWAIT comes from POSIX (platform support for it varies).
use POSIX qw(WNOWAIT);

# Wait for a child to terminate, but don't reap it yet.
my $pid = waitpid(-1, WNOWAIT);

# Collect data from the file for child $pid
...

# Reap the child.
waitpid($pid, 0);
But it seems to me that if you could do this, you could use pipes for communication instead of temporary files.
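For instance, a minimal sketch of the pipe-based variant, using Storable's freeze/thaw to serialize each child's result over its own pipe (the three squaring workers are just placeholders):

use strict;
use warnings;
use Storable qw(freeze thaw);

# One pipe per child: the child serializes its result and writes it
# to the pipe; the parent reads it back, so no temp files are needed.
my %readers;                             # pid => read end of that child's pipe
for my $n (1 .. 3) {
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                     # child
        close $reader;
        binmode $writer;                 # freeze() output is binary
        my %result = ( worker => $n, answer => $n * $n );
        print {$writer} freeze(\%result);
        close $writer;
        exit 0;
    }
    close $writer;                       # parent keeps only the read end
    binmode $reader;
    $readers{$pid} = $reader;
}

# Collect each child's output, then reap it.
my %collected;
for my $pid (keys %readers) {
    my $fh   = $readers{$pid};
    my $data = do { local $/; <$fh> };   # slurp everything the child wrote
    close $fh;
    $collected{$pid} = thaw($data);
    waitpid($pid, 0);
}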