
Perl - Can you run threads across multiple machines?

I was wondering if it is possible to run threads in Perl across multiple machines. I'm working in a clustered environment and need to run some of my processes in parallel, but am unable to use MPI. If threading cannot be used across machines, are there any other alternatives I should look at that would let me do something similar without requiring special modules?


Threads (and forks) in Perl are tied to the same machine as the parent thread / process, so there is no built-in cross-machine threading or forking. That said, you can use the AnyEvent::MP / Coro::MP modules (message-passing extensions to the AnyEvent asynchronous event-loop framework and the Coro cooperative-threading framework, respectively), which let you create a network of nodes performing different tasks on one or more machines. See AnyEvent::MP::Intro for details.
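To give a feel for the model, here is a minimal sketch of two AnyEvent::MP ports exchanging tagged messages. This runs both ports in a single process; in a real deployment the `configure` call would join a network of seed nodes, and the two ports could live on different machines. The `ping`/`pong` tags and payload are made up for illustration.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::MP;

configure;    # join the (here: purely local) network of nodes

my $done = AnyEvent->condvar;

# A "service" port that answers ping messages.
my $service = port;
rcv $service, ping => sub {
    my ($reply_port, $payload) = @_;
    snd $reply_port, pong => uc $payload;
};

# A "client" port that receives the answer.
my $client = port;
rcv $client, pong => sub { $done->send($_[0]) };

snd $service, ping => $client, 'hello';
print $done->recv, "\n";
```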

As for alternatives not requiring special modules (by which, I guess, you mean modules outside the core Perl distribution), you could conceivably write a daemon for your tasks and have the machines communicate over TCP or UDP using the core IO::Socket modules. Anything beyond that would probably require at least a few modules not installed with Perl, but available from CPAN.
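A minimal sketch of that approach, using only core modules: a forked "worker daemon" listening on TCP and a "requester" that sends it a task. The line-based "square &lt;n&gt;" protocol and port 7777 are invented for illustration; in real use the daemon would run on another machine and the requester would connect to that host instead of localhost.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;    # core module

my $port = 7777;    # hypothetical port; pick any free one
my $reply;

my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the worker daemon. Accepts one-line tasks, replies with the result.
    my $server = IO::Socket::INET->new(
        LocalPort => $port,
        Proto     => 'tcp',
        Listen    => 5,
        ReuseAddr => 1,
    ) or die "listen: $!";
    while (my $client = $server->accept) {
        my $line = <$client>;
        print $client $1 * $1, "\n" if $line =~ /^square\s+(\d+)/;
        close $client;
    }
    exit;
}

# Parent: the requester.
sleep 1;    # crude: give the daemon time to start listening
my $sock = IO::Socket::INET->new(
    PeerAddr => 'localhost',
    PeerPort => $port,
    Proto    => 'tcp',
) or die "connect: $!";
print $sock "square 7\n";
$reply = <$sock>;
close $sock;
chomp $reply;
print "worker replied: $reply\n";    # worker replied: 49

kill 'TERM', $pid;
waitpid $pid, 0;
```

You would of course need to define your own protocol, error handling, and a way to start the daemons on each node, which is exactly the plumbing that the CPAN job-queue modules take care of for you.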


Have a look at Gearman, a multi-machine job queue manager. It does require special modules; I'm answering "just in case" you can in fact use additional modules/infrastructure.

There are Perl bindings, Gearman::XS, which I have used successfully in projects where specific tasks need to be done in an environment where either requester or worker processes may reside on multiple machines. It also works well for multiple worker processes on one machine and a single requester (example: a certain web scraper which requests all links from a page, parsed by any available worker, but wants to keep control of the results).

The way it works is that you create a "worker" Perl program containing a number of subroutines which perform the actions you'd like to run in a distributed fashion. You launch those worker programs on whichever machines you want, as many times as you want, and have them connect to one (or more) master gearman "managers" (job servers). You then create a requester (gearman client) Perl program which performs the requests. This can also run on any machine; it contacts the master gearman manager to request that a number of the workers' specific actions be completed. Whichever worker picks up a job does it, and your requester gets the result back.
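A hedged sketch of both sides, assuming Gearman::XS is installed and a gearmand job server is reachable on localhost:4730 (the default port); the function name reverse_string is made up. The worker:

```perl
# worker.pl -- register a function with the job server and serve jobs forever
use strict;
use warnings;
use Gearman::XS qw(:constants);
use Gearman::XS::Worker;

my $worker = Gearman::XS::Worker->new;
$worker->add_server('localhost', 4730);    # the gearman "manager"

# Any machine running this worker can now serve 'reverse_string' requests.
$worker->add_function('reverse_string', 0, sub {
    my ($job) = @_;
    return scalar reverse $job->workload;
}, {});

$worker->work while 1;    # block, waiting for jobs
```

And the requester, which can run on any machine that can reach the same job server:

```perl
# requester.pl -- submit a job and wait for the result
use strict;
use warnings;
use Gearman::XS qw(:constants);
use Gearman::XS::Client;

my $client = Gearman::XS::Client->new;
$client->add_server('localhost', 4730);

my ($ret, $result) = $client->do('reverse_string', 'hello world');
print "$result\n" if $ret == GEARMAN_SUCCESS;
```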

If your requesters don't need a result back but "just" need a task to happen, instead have a look at TheSchwartz, which has a similar architecture but, IIRC, does not provide a facility for getting messages from the workers back to the requesters.


Check GRID::Machine.


I stumbled upon GNU parallel a week or two ago. In its simplest form it runs on a single machine, helping reduce wall-clock time by letting ordinary programs take advantage of multiple cores (it can also distribute jobs to other machines over ssh via its --sshlogin option). May help speed up whatever you're doing.
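For example (the script name process.pl and the data/ layout are invented here):

```shell
# Run process.pl over many input files, 8 jobs at a time on the local cores;
# {} is replaced by each input line in turn
ls data/*.txt | parallel -j 8 perl process.pl {}
```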

