Batch Job Dependencies Using Open Source/Free Software
I run a large data warehouse plant where we have a lot of nightly jobs running concurrently; however, many of them depend on an extract or data load process finishing before they can start. We currently use an 'expensive scheduling system' to schedule these.
Is there any way to set up job dependencies using an open source or free Unix/Linux tool such as cron?
Moving to an open source solution would be great and save us lots!
Regards Matt
Cfengine can be made to do something like this. You can set it up as a cron replacement, running arbitrary commands at scheduled times, and you can set up "classes" so that certain actions are performed only if certain classes are enabled. Classes can be anything from "this is a Linux system" to "it's currently between 5 and 10 minutes after the hour" to "system load is above value x" to "this arbitrary shell command that I just specified returned true", so you could set up your classes to indicate your job dependencies.
I doubt that this would be as powerful as a dedicated scheduling system (dependencies would have to be set up manually by configuring classes, and running jobs concurrently would require extra scripting or configuration work), but it is free and open source.
Version 2 of Cfengine was not particularly pleasant to work with (in the words of Seth Vidal, "it's [sic] syntax kills kittens"). I haven't used Cfengine 3. Puppet has similar design goals to Cfengine and may be easier to work with.
I asked a similar question last year (maybe Serverfault would be a better place these days?). There doesn't seem to be a simple, install-and-go solution unfortunately.
Cron doesn't handle this natively. Can the process that loads the data write out a status file upon completion? This would allow subsequent jobs to check the status file before doing their real work. Obviously, this isn't an ideal solution (too many points of failure, I suspect), but perhaps it's good enough for what you're trying to accomplish.
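Here is a minimal sketch of the status-file idea in Python. The directory layout, the job names, and the function names (`marker_path`, `mark_done`, `wait_for`) are all made up for illustration; adapt them to whatever your jobs actually look like.

```python
import time
from pathlib import Path


def marker_path(status_dir, job_name, run_date):
    """Return the path of the per-day status file for a job (hypothetical naming scheme)."""
    return Path(status_dir) / f"{job_name}.{run_date.isoformat()}.done"


def mark_done(status_dir, job_name, run_date):
    """Called by the upstream job as its very last step, only after it succeeded."""
    path = marker_path(status_dir, job_name, run_date)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()
    return path


def wait_for(status_dir, dependencies, run_date, timeout=3600, poll=30):
    """Poll until every dependency has a status file.

    Returns True once all markers exist, or False if `timeout` seconds
    pass first, so the caller can skip its work and raise an alert.
    """
    deadline = time.monotonic() + timeout
    while True:
        missing = [dep for dep in dependencies
                   if not marker_path(status_dir, dep, run_date).exists()]
        if not missing:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

The load job would call `mark_done(status_dir, "extract", date.today())` when it finishes, and each dependent job started by cron would begin with `wait_for(status_dir, ["extract"], date.today())` and exit non-zero if it returns `False`. Putting the date in the file name is a deliberate choice: it stops last night's marker from satisfying tonight's run, though it does assume the whole batch starts and finishes on the same calendar day.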
Schedulix is an open source workload automation solution for Linux: http://www.schedulix.org