
Batch processing with dependencies in Python

I'm looking for the best way to create a job scheduler for different types of jobs. Jobs are threaded, and some jobs need to finish before the next step in the process can run. This is currently all managed through a database table...which I think is fine. But if there's a better way to manage dependencies, I'm all ears.
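The core of this pattern — run whatever has no unfinished dependencies, in parallel, then unblock the rest — can be sketched with the stdlib alone. The job names and `deps` dict below are illustrative, not from the question; `graphlib` needs Python 3.9+:

```python
# Minimal sketch: dependency-ordered, threaded job execution using only
# the stdlib. Each batch of ready jobs runs in parallel; for simplicity
# the sketch waits for a whole batch before releasing dependents.
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

def run_jobs(deps, actions, max_workers=4):
    """deps maps job -> set of jobs it depends on; actions maps job -> callable."""
    ts = TopologicalSorter(deps)
    ts.prepare()  # also raises CycleError on circular dependencies
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while ts.is_active():
            ready = ts.get_ready()  # all jobs whose dependencies are done
            futures = {pool.submit(actions[job]): job for job in ready}
            for fut, job in futures.items():
                fut.result()        # re-raises any exception from the job
                ts.done(job)        # unblocks jobs that depended on this one

# Illustrative three-step pipeline: load -> clean -> report
results = []
deps = {"load": set(), "clean": {"load"}, "report": {"clean"}}
actions = {name: (lambda n=name: results.append(n)) for name in deps}
run_jobs(deps, actions)
```

A database table can still be the source of truth: read the rows into the `deps` dict at startup and mark rows complete inside each job's callable.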

Preferably I'd like to do this in Python. I see there's the Parallel Python module, which looks great, but I'm concerned about this dependency issue between jobs.

Can someone recommend anything that does what I need to do or how to go about doing this?

Much thanks!

D

UPDATE: This is to be done over a cluster of servers, each with a limited set of available workers...1 per port. Does Celery or SCons support this?


Luigi looks very interesting. It allows you to create workflows - sets of related jobs whose dependencies are managed by Luigi. It also has a simple web interface that provides a dependency graph.
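A sketch of how a two-step Luigi workflow looks (the task names and file names are made up for illustration). Dependencies are declared in `requires()`, and a task is considered done when its `output()` target exists:

```python
# Hypothetical two-task Luigi workflow: Report depends on Extract.
import luigi

class Extract(luigi.Task):
    def output(self):
        return luigi.LocalTarget("extract.txt")

    def run(self):
        with self.output().open("w") as f:
            f.write("raw data\n")

class Report(luigi.Task):
    def requires(self):
        return Extract()  # Luigi runs Extract first

    def output(self):
        return luigi.LocalTarget("report.txt")

    def run(self):
        with self.input().open() as src, self.output().open("w") as dst:
            dst.write("report: " + src.read())

if __name__ == "__main__":
    # local_scheduler runs everything in-process, without the luigid daemon
    luigi.build([Report()], local_scheduler=True)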


I've had a lot of success with Celery


SCons can be helpful for this.

It's biased toward software construction (compiling, linking, etc.), but you can easily define new result classes, new commands, and new source classes so that it will process your data (and dependencies) properly.
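For example, a custom step can be wired in through a `Builder`. This is a hypothetical `SConstruct` fragment - the command and file names are made up - but it shows the idea: SCons re-runs a step only when its sources change, which is how it tracks dependencies:

```python
# Hypothetical SConstruct: a custom Builder that runs a data-processing
# script. SCons infers the dependency chain from the source/target pairs.
env = Environment()
process = Builder(action='python process.py $SOURCE $TARGET')
env.Append(BUILDERS={'Process': process})

env.Process('cleaned.data', 'raw.data')    # cleaned.data depends on raw.data
env.Process('report.txt', 'cleaned.data')  # runs only after cleaned.data is built
```

Note that `Environment` and `Builder` are globals SCons injects when it executes the `SConstruct` file; this fragment isn't a standalone Python script.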

Based on the update, you probably need something like BuildBot, too.

