Question about MongoDB capped collections + tailable cursors
I'm building a queueing system that passes messages from one process to another via a stack implemented in MongoDB with capped collections and tailable cursors.
The receiving process loops indefinitely, looking for new documents in the capped collection, and when it finds one it performs an operation.
My question is: if I run multiple receiving processes, is there a way to guarantee that a new document will be read only once, by exactly one of the processes using a tailable cursor? The goal is to avoid the operation being performed twice when two receiving processes are looking for new messages in the queue. I'm relatively new to MongoDB programming, so I'm still getting a feel for all of its features.
The MongoDB documentation contains a thorough description of ways to achieve an atomic update. You cannot ensure that only one process receives the new document, but you can perform an atomic update after receiving it to ensure that only one process acts on it.
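As a rough illustration of the "atomic update after receiving" idea, here is a minimal sketch in Python. It assumes `collection` is a pymongo `Collection` and that the producer inserts every message with a hypothetical boolean field `claimed` set to `False`; neither the field name nor the helper `try_claim` comes from the question. A boolean flag is used so the update does not change the document's size, since some MongoDB versions reject size-changing updates in capped collections.

```python
def try_claim(collection, doc_id):
    """Return True if this process won the claim on the document, else False.

    `collection` is expected to behave like a pymongo Collection.
    `claimed` is a hypothetical flag that the producer inserts as False.
    """
    # find_one_and_update matches and modifies in a single atomic step, so
    # only one caller can flip `claimed` from False to True for a given _id.
    previous = collection.find_one_and_update(
        {"_id": doc_id, "claimed": False},
        {"$set": {"claimed": True}},
    )
    # None means no unclaimed document matched: another reader got there
    # first (or the document does not exist).
    return previous is not None
```

Each receiving process would call `try_claim(collection, doc["_id"])` for every document its tailable cursor yields and perform the operation only when the call returns `True`.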
I have recently been looking into this problem and I would be interested to know if there are other ways to have multiple readers (consumers) without relying on atomic updates.
This is what I have come up with: divide your logic into two "modules". The first module is responsible for fetching new documents from the tailable cursor. The second module is responsible for processing an arbitrary document. This way, a single consumer (the first module) fetches the documents and hands each one to one of several document workers (the second module), so no document is read by more than one consumer.
Both modules can be implemented in different processes and even in different languages. For example, a Node.js app could be fetching the documents and sending them to a pool of scripts written in Python ready to process documents concurrently.
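The two-module split could look roughly like the sketch below. To keep it self-contained, it uses threads and an in-process queue as stand-ins for separate worker processes, and `documents` is any iterable standing in for the tailable cursor; the names `dispatch`, `handle`, and `num_workers` are made up for the illustration.

```python
import queue
import threading


def dispatch(documents, handle, num_workers=4):
    """Read documents in one place and give each to exactly one worker.

    `documents` is any iterable (a stand-in for the tailable cursor) and
    `handle` is the per-document operation. Returns the workers' results.
    """
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()
    stop = object()  # sentinel telling a worker to exit

    def worker():
        while True:
            doc = tasks.get()  # each queued item goes to one worker only
            if doc is stop:
                break
            outcome = handle(doc)
            with results_lock:
                results.append(outcome)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # Module one: the single reader. Only this loop touches the cursor.
    for doc in documents:
        tasks.put(doc)

    # A real tailable cursor would not normally end; this shutdown path
    # only matters for finite inputs such as the usage example.
    for _ in workers:
        tasks.put(stop)
    for w in workers:
        w.join()
    return results
```

For example, `dispatch([{"_id": i} for i in range(20)], lambda d: d["_id"], num_workers=3)` processes every document exactly once, although the order of the results depends on thread scheduling. With workers in other processes or languages, the in-process queue would be replaced by whatever transport connects them.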