Rails Queue Management

2023-03-10 19:35

I am building a job that will fetch and re-validate information from a remote website. I already have it implemented with a queue that works kinda like this: a text file is read, sliced up into 5k increments, and handed off to thread processors; each processor quits when it is done, and a new worker is generated.

I am looking into resque, but I have a general design question about problems like this. If I have a job that could potentially be 5-20M units of work, what is the best practice for storing the queue? For instance, I could chunk the work up, store the chunks, and create a job for each chunk, or I could have 5-20M individual line items in the queue. With individual items, it seems like there is a lot of overhead in the work being fetched/regenerated. But there is also decent overhead, and more coding, in chunking the work.

Based on what we've done and seen, a good approach is to chunk the work at runtime rather than ahead of time. In other words, use a master/slave pattern that is event- or time-driven, with the master slicing the work/data space into granular tasks/chunks when the job is queued and run.
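As a rough illustration of slicing at runtime, here is a minimal sketch in plain Ruby. The `Chunk` struct, the `slice_into_chunks` helper, and the in-memory `Queue` are all assumptions made for the example; they stand in for whatever queue backend is actually used (resque, IronMQ, etc.) and are not part of any of those libraries.

```ruby
# Hypothetical chunk descriptor: it identifies a slice of the input by
# position, so the queue holds small descriptors instead of millions of items.
Chunk = Struct.new(:offset, :length)

# Master step: turn one coarse job (total_units of work) into chunk
# descriptors at the moment the job is run, not ahead of time.
def slice_into_chunks(total_units, chunk_size)
  raise ArgumentError, "chunk_size must be positive" unless chunk_size.positive?

  (0...total_units).step(chunk_size).map do |offset|
    # The last chunk may be shorter than chunk_size.
    Chunk.new(offset, [chunk_size, total_units - offset].min)
  end
end

# In-memory stand-in for a real queue backend (an assumption for this sketch).
queue = Queue.new
slice_into_chunks(12_000, 5_000).each { |chunk| queue << chunk }
```

The schedule then only needs to show the one coarse job, while workers pull `Chunk` descriptors and read their own slice of the input.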

The reason for this is that viewing jobs in the schedule is much easier at a coarse-grained level. At this level, the jobs correspond to the units that you're tracking (a webpage, a user profile, or streaming data from a sensor, for example).

We often see work sliced at a fine-grained level, with each worker then handling a reasonable collection of tasks. We've found that having each worker process multiple tasks (roughly 20-1000, depending on the type/length of task) provides a good balance between:

optimizing setup (establishing a database connection for example)
providing good introspection into the jobs
making retries and exception handling more manageable
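To make the trade-off in the list above concrete, here is a minimal sketch of a worker that pays the setup cost once per batch and handles exceptions per task. The `process_batch` method and its `connect:` and `handler:` parameters are hypothetical names chosen for the example, not an API from resque or any other library.

```ruby
# Hypothetical batch worker: one setup per batch, error handling per task.
def process_batch(tasks, connect:, handler:)
  connection = connect.call # setup (e.g. a database connection) happens once
  result = { done: [], failed: [] }

  tasks.each do |task|
    begin
      handler.call(connection, task)
      result[:done] << task
    rescue StandardError => e
      # A failing task is recorded for retry; the rest of the batch continues.
      result[:failed] << { task: task, error: e.message }
    end
  end

  result
end

# Usage with stand-in lambdas: task 3 fails, the others succeed.
outcome = process_batch(
  (1..5).to_a,
  connect: -> { :fake_connection },
  handler: ->(_conn, task) { raise "bad record" if task == 3 }
)
```

Only the entries in `outcome[:failed]` need to be re-queued, so a retry touches a small part of the work space.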

You'd want the processing time for each worker to be measured in minutes, rather than having long-running tasks, so that you have more visibility into worker performance and so that retries only affect a limited amount of the work space. Making use of a NoSQL solution (esp. database-as-a-service ones like MongoHQ or MongoLab) can allow you to easily keep track of and manage the chunking and in-process work.
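One way to picture the tracking side is a small state record per chunk. The sketch below keeps those records in an in-memory hash purely as a stand-in for a document store; `ChunkTracker` and its method names are assumptions for illustration, and a real store would need an atomic claim operation, which this version does not provide.

```ruby
# Hypothetical tracker: an in-memory stand-in for a NoSQL collection that
# holds one small status record per chunk. Not thread-safe as written.
class ChunkTracker
  STATES = %i[pending in_progress done failed].freeze

  def initialize
    @chunks = {}
  end

  def register(chunk_id)
    @chunks[chunk_id] = { state: :pending, attempts: 0 }
  end

  # Hand out the next pending chunk, or nil if none is left.
  def claim
    id, record = @chunks.find { |_id, rec| rec[:state] == :pending }
    return nil unless id

    record[:state] = :in_progress
    record[:attempts] += 1
    id
  end

  def finish(chunk_id, ok:)
    @chunks.fetch(chunk_id)[:state] = ok ? :done : :failed
  end

  # Put failed chunks back so only that part of the work is redone.
  def retry_failed
    @chunks.each_value { |rec| rec[:state] = :pending if rec[:state] == :failed }
  end

  # Per-state counts, useful for watching progress of the coarse job.
  def counts
    STATES.map { |s| [s, @chunks.count { |_id, rec| rec[:state] == s }] }.to_h
  end
end
```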

Another recommendation is to create workers that are independent of your application environment. This means writing each worker to be reasonably self-contained, as well as using callbacks, database flags, and other asynchronous approaches. It may be a bit more work, but just like an MVC application design, it gives you much greater agility and allows the work to be distributed over elastic worker systems.
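A minimal sketch of what "self-contained" could look like: the worker receives everything it needs in a JSON payload and reports back through a callback instead of touching application models. `StandaloneWorker`, the payload keys, and the `valid?` check are hypothetical placeholders; a real worker would do the remote fetch and re-validation at that point.

```ruby
require "json"

# Hypothetical self-contained worker: no dependency on the Rails app.
# Input arrives as a JSON payload; output leaves through a callback.
class StandaloneWorker
  def initialize(payload_json, on_complete:)
    @payload = JSON.parse(payload_json)
    @on_complete = on_complete
  end

  def run
    validated = @payload.fetch("items").select { |item| valid?(item) }
    # The callback could POST to the app or set a database flag; here it is
    # whatever callable the caller passed in.
    @on_complete.call(
      "chunk_id" => @payload.fetch("chunk_id"),
      "valid_count" => validated.size
    )
  end

  private

  # Placeholder check standing in for the real re-validation logic.
  def valid?(item)
    !item.to_s.strip.empty?
  end
end
```

Because the worker only depends on its payload and a callable, the same class can be run in-process during development or shipped to a separate worker system.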

(Full disclosure: I'm on the team at Iron.io, maker of IronMQ, IronWorker, and IronCache.)

