RabbitMQ as a proxy between a data store and a producer?
I have some code that produces lots of data that should be stored in the database. The problem is that the database can't keep up with the rate at which the data is produced. So I am wondering whether some kind of queuing mechanism would help in this situation - I am thinking in particular of RabbitMQ, and whether it is feasible to have the data stored in its queues until some consumer gets the data out and pushes it to the database. Also, I am not particularly interested in whether the data made it to the database or not, because pretty soon the same data will be updated anyway.
@hyperboreean It may sound a bit glib, but perhaps what you really need is a cache such as Redis or Memcached?
Technically you could use RabbitMQ with consumers updating the DB, but you'd need to implement a "queue cleaning" mechanism or your queues will keep growing for as long as your input rate exceeds what the database can handle. As the queues grow, the data in them becomes stale - the update that was just submitted is still sitting behind older messages in the queue. Think of it like a store that has one checker: sure, you could form separate lines, but that just means you have multiple long lines and still only one checker. You are still bound by the rate at which the checker can process your customers.
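One hedged way to approximate that "queue cleaning" is RabbitMQ's per-queue message TTL, so stale updates simply expire instead of piling up. The sketch below assumes a local broker with default credentials, an illustrative queue name `readings`, a hypothetical `write_to_db` helper, and the Python `pika` client:

```python
import json
import pika

# Connect to a local RabbitMQ broker (assumed: localhost with default settings).
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# Declare a queue whose messages expire after 30 seconds (x-message-ttl is in ms).
# Expired messages are dropped by the broker, which keeps the backlog bounded
# when producers outpace the database-writing consumer.
channel.queue_declare(
    queue="readings",
    durable=True,
    arguments={"x-message-ttl": 30_000},
)

# Producer side: publish one update.
channel.basic_publish(
    exchange="",
    routing_key="readings",
    body=json.dumps({"sensor": 42, "value": 3.14}),
)

# Consumer side: pull messages and write them to the database at its own pace.
def handle(ch, method, properties, body):
    record = json.loads(body)
    # write_to_db(record)  # hypothetical DB insert/update goes here
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="readings", on_message_callback=handle)
channel.start_consuming()
```

The trade-off is that expired updates are silently lost, which seems acceptable here since the question states the same data will soon be written again anyway.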
From the all-too-brief description it sounds like your data is really transient, and a cache (or other NoSQL-like arrangement) may be a better fit. If you do need to persist the data eventually, you could have a separate process that pulls the current data from the cache and loads it into your DB. Then you would only be limited by how long the extract takes versus how often you can actually load the data into the DB.
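A minimal sketch of that arrangement, assuming a local Redis server, the `redis-py` client, an illustrative hash key `latest_readings`, and a hypothetical `bulk_load_into_db` helper: producers overwrite the latest value per key, and a separate flusher snapshots the hash and batch-loads it into the database at whatever rate the DB can sustain.

```python
import time
import redis  # assumes the redis-py client and a local Redis server

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Producer side: overwrite the latest value per key; older values are simply
# replaced, which matches "I don't care if an individual update is lost".
def record(sensor_id: str, value: float) -> None:
    r.hset("latest_readings", sensor_id, value)

# Separate flusher process: periodically snapshot the current values and
# load them into the database in one batch.
def flush_loop(interval_seconds: float = 5.0) -> None:
    while True:
        snapshot = r.hgetall("latest_readings")  # {sensor_id: value}
        if snapshot:
            bulk_load_into_db(snapshot)  # hypothetical batch insert/upsert
        time.sleep(interval_seconds)

def bulk_load_into_db(snapshot: dict) -> None:
    # Placeholder: replace with an executemany()/COPY-style batch write.
    pass
```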
Databases are supposed to handle inserts very fast, without locking, since inserts concern data that does not yet exist in the store. If you are only doing inserts and serialization to the database is your bottleneck, that problem will still exist with RabbitMQ, because a database insert should perform no worse than an outbound message to RabbitMQ. In that scenario RabbitMQ will not solve your problem. Updates, on the other hand, generally lock the row being updated, and you may be hitting concurrency issues with locks and waits. So overall, first try to understand why your database persistence is a bottleneck.
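One quick way to test whether the insert path itself is the bottleneck (as opposed to per-row commits or locking) is a toy timing comparison. The sketch below uses the standard-library sqlite3 module purely as a stand-in for whatever database is actually in use:

```python
import sqlite3
import time

rows = [(i, f"value-{i}") for i in range(10_000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, payload TEXT)")

# Row-by-row with a commit per statement: pays transaction overhead 10,000 times.
start = time.perf_counter()
for row in rows:
    conn.execute("INSERT INTO readings VALUES (?, ?)", row)
    conn.commit()
print("per-row commits:", time.perf_counter() - start)

# One batch in a single transaction: typically much faster.
start = time.perf_counter()
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
conn.commit()
print("single batched transaction:", time.perf_counter() - start)
```

If batching closes the gap, the fix is in how the writes are grouped, not in adding a message broker.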
Finally, if your data store is NoSQL, the writes themselves might not be the slow part; in that case, measure which one actually receives data faster (the NoSQL store vs. RabbitMQ).
If your data producers run on multiple threads, then you have a concurrency problem when writing to the persistence store. In that case RabbitMQ should handle concurrency better than your persistence store, since it is designed for high concurrency. This depends on which data store you are using.
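A rough sketch of that funnel pattern, assuming the Python `pika` client, a local broker, an illustrative queue name `writes`, and a hypothetical `write_to_db` call: each producer thread publishes over its own connection (pika connections are not thread-safe), and a single consumer serializes the actual database writes.

```python
import json
import threading
import pika

QUEUE = "writes"  # illustrative queue name

def producer(thread_id: int, n_messages: int) -> None:
    # Each thread opens its own connection instead of sharing one.
    conn = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    ch = conn.channel()
    ch.queue_declare(queue=QUEUE, durable=True)
    for i in range(n_messages):
        ch.basic_publish(
            exchange="",
            routing_key=QUEUE,
            body=json.dumps({"thread": thread_id, "seq": i}),
        )
    conn.close()

def consumer() -> None:
    # A single consumer means the database sees one writer, not N competing threads.
    conn = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    ch = conn.channel()
    ch.queue_declare(queue=QUEUE, durable=True)

    def handle(channel, method, properties, body):
        record = json.loads(body)
        # write_to_db(record)  # hypothetical single-writer DB call
        channel.basic_ack(delivery_tag=method.delivery_tag)

    ch.basic_consume(queue=QUEUE, on_message_callback=handle)
    ch.start_consuming()

if __name__ == "__main__":
    threads = [threading.Thread(target=producer, args=(t, 100)) for t in range(4)]
    for t in threads:
        t.start()
    consumer()  # runs until interrupted
```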