Which, if any, of the NoSQL databases can provide a stream of *changes* to a query result set?

Could anyone point me at some examples?

Firstly, I believe that none of the SQL databases provide this functionality - am I correct?

I need to be able to specify arbitrary, simple queries, whose equivalent in SQL might be written:

SELECT * FROM accounts WHERE balance < 0 and balance > -1000;

I want an initial result set:

id: 100, name: Fred, balance: -10
id: 103, name: Mary, balance: -200

but then I want a stream of changes to follow, forever, until I stop them:

meta: remove, id: 100
meta: add,    id: 104, name: Alice, balance: -300
meta: remove, id: 103
meta: modify, id: 104, name: Alice, balance: -400
meta: modify, id: 104, name: Alison, balance: -400
meta: add,    id: 101, name: Clive, balance: -200
meta: modify, id: 104, name: Alison, balance: -100
...

Note: I'm not talking about streaming large result sets. I'm looking for a soft-realtime stream of changes.

Also, it needs to scale out, if possible.

Thanks,

Chris.


CouchDB has a changes feed. Basically it's a block chain, or a history of every change in the database since inception. You can get the feed via JSON, JSONP, long polling or as a continuous stream and write applications that respond to changes in the database.

Here's the changes feed from my blog

To learn more check out this section of the CouchDB guide
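
As a rough illustration of consuming the continuous feed, here is a minimal Python sketch. It assumes a CouchDB server reachable at a base URL such as http://localhost:5984 and a database named accounts; both are placeholders, not details from this answer. The helper names changes_url, parse_feed and follow are made up for the example. The continuous feed sends one JSON object per line, with blank lines as heartbeats.

```python
import json
from typing import Any, Dict, Iterable, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen


def changes_url(base_url: str, db: str, since: str = "now") -> str:
    """Build the URL of the continuous _changes feed for one database."""
    query = urlencode({
        "feed": "continuous",    # keep the connection open and stream changes
        "include_docs": "true",  # attach the changed document to each row
        "since": since,          # "now" skips history; use a sequence to resume
        "heartbeat": 10000,      # blank keep-alive line every 10 seconds
    })
    return f"{base_url}/{db}/_changes?{query}"


def parse_feed(lines: Iterable[bytes]) -> Iterator[Dict[str, Any]]:
    """Decode one change per line, skipping blank heartbeat lines."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        yield json.loads(line)


def follow(base_url: str, db: str, since: str = "now") -> Iterator[Dict[str, Any]]:
    """Yield changes until the caller stops iterating or the server closes."""
    with urlopen(changes_url(base_url, db, since)) as response:
        yield from parse_feed(response)
```

Something like `for change in follow("http://localhost:5984", "accounts"): print(change["id"], change.get("deleted", False))` would then print each changed document id as it arrives. Deciding whether a change enters or leaves a particular query result set is still up to the client (or a filter function on the server).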


Although an answer has been accepted, there is another answer that gets to the heart of the assumptions underneath your question.

What is the business concern that you have related to getting a list of changes to the data? What if, instead of merely getting the list of changes to the data, you received a set of events that told you why and how the data changed?

This concept is one of the fundamental reasons behind "CQRS" as an architecture. Basically you store all events that caused a change to your data, e.g. FundsDeposited, FundsWithdrawn, etc. and you gain the ability to "replay" those events and discover not just how your data changed over time, but why.

Once you go down that road, you gain the ability to store events as a stream and you are no longer limited to a small handful of storage engines. Instead you could literally use any storage engine and it would get the job done.
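
To make the replay idea concrete, here is a small Python sketch under stated assumptions: events are plain tuples of (event type, account id, amount), the event names are the FundsDeposited and FundsWithdrawn examples above, and the query is the question's "balance < 0 and balance > -1000". The function names are hypothetical and nothing here is tied to a particular storage engine.

```python
from typing import Dict, Iterable, Iterator, Tuple

Event = Tuple[str, int, int]  # (event type, account id, amount)


def in_range(balance: int) -> bool:
    """The question's query: balance < 0 and balance > -1000."""
    return -1000 < balance < 0


def replay(events: Iterable[Event], balances: Dict[int, int]) -> Iterator[dict]:
    """Apply events in order and yield add/modify/remove for the query."""
    for kind, account_id, amount in events:
        before = balances.get(account_id, 0)  # unknown accounts start at 0
        if kind == "FundsDeposited":
            after = before + amount
        elif kind == "FundsWithdrawn":
            after = before - amount
        else:
            raise ValueError(f"unknown event type: {kind}")
        balances[account_id] = after

        was_in, now_in = in_range(before), in_range(after)
        if now_in and not was_in:
            yield {"meta": "add", "id": account_id, "balance": after}
        elif now_in and was_in:
            yield {"meta": "modify", "id": account_id, "balance": after}
        elif was_in:
            yield {"meta": "remove", "id": account_id}
```

Feeding the same function from a live event stream instead of a stored list gives the soft-realtime behaviour asked for, and because the input is just an ordered sequence of events, where those events are stored matters much less.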


Not sure if this is exactly the kind of thing you are looking for, but thought it possibly relevant enough to warrant a mention!

If you use replication in MongoDB, all write operations are stored in an oplog (operation log). Every insert/update/delete is recorded there so that it can be replayed on the secondary nodes. It's a capped collection, so it cycles round and overwrites itself (you can set its size). In theory, this oplog could be used as a way to retrieve a stream of changes. I haven't tried it myself, but possibly you could poll that oplog.
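
For what it's worth, here is an untested-against-production Python sketch of that idea. It assumes a replica set (so that local.oplog.rs exists), the pymongo driver, and a hypothetical namespace such as "bank.accounts". It relies on the commonly seen oplog entry shape, where "op" is "i", "u" or "d", "ns" is the namespace, and updates carry the target _id in "o2"; that layout is internal and may differ between server versions.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

OP_NAMES = {"i": "insert", "u": "update", "d": "delete"}


def describe(entry: Dict[str, Any], namespace: str) -> Optional[Tuple[str, Any]]:
    """Map one oplog entry to (operation, _id), or None if not relevant."""
    if entry.get("ns") != namespace:
        return None
    kind = OP_NAMES.get(entry.get("op"))
    if kind is None:
        return None  # skips other entry types such as no-ops and commands
    # Updates carry the target _id in "o2"; inserts and deletes carry it in "o".
    key = entry["o2"] if kind == "update" else entry["o"]
    return kind, key["_id"]


def tail_oplog(client: Any, namespace: str) -> Iterator[Tuple[str, Any]]:
    """Follow the oplog with a tailable cursor (client is a pymongo MongoClient)."""
    from pymongo import CursorType  # lazy import: only needed when tailing

    oplog = client.local["oplog.rs"]
    cursor = oplog.find({"ns": namespace}, cursor_type=CursorType.TAILABLE_AWAIT)
    while cursor.alive:
        for entry in cursor:
            change = describe(entry, namespace)
            if change is not None:
                yield change
```

Note that this only says which document changed and how; working out whether the document now matches a given query would still need a lookup or client-side bookkeeping.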


Only a brainstorming answer:

Let's take MongoDB as an example, and suppose you do not want to access the changes feed as described above. Yes, it sounds crappy compared to the other answers, but it was my first idea before those answers popped up while I was writing ...

Current features related to this question are Capped Collections (http://www.mongodb.org/display/DOCS/Capped+Collections) and maybe Server-side Code Execution (http://www.mongodb.org/display/DOCS/Server-side+Code+Execution).

Capped collections make it easier to write a lot of data but read less of it (like log files); this collection type is made for such cases. The server-side scripts can be used to move a lot of processing out of the app (less app code), but you can skip this point if you want to keep the logic entirely in your app.

I don't know if there are NoSQL DBs with "hooks". I know that's possible in Postgres (SQL).

Currently the streaming logic has to be implemented in the app code AFAIK.

In CouchDB it could be possible with "Views", which are not implemented in MongoDB (if this isn't correct, please give me a link; this is an interesting topic, too!).

I don't know if this is helpful. It's my first attempt at an answer here on SO.


This type of thing should be done in the app, not the database.

Meaning, every time you make a change, it should be recorded as a new record, not as a modification to the existing record. There's a whole lot more intelligence you can add to your app if you do it this way.


As of v3.6, MongoDB provides Change Streams to allow applications to subscribe to a realtime list of changes:

Change streams allow applications to access real-time data changes without the complexity and risk of tailing the oplog. Applications can use change streams to subscribe to all data changes on a collection and immediately react to them.

Change streams can benefit architectures with reliant business systems, informing downstream systems once data changes are durable. For example, change streams can save time for developers when implementing Extract, Transform, and Load (ETL) services, cross-platform synchronization, collaboration functionality, and notification services.

By default, a stream returns changes to all documents in a collection, but you can add an aggregation pipeline to filter it down to only the documents which match your query result set.
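
Here is a minimal Python sketch of how that might look with pymongo, using the question's query (balance < 0 and balance > -1000). The names PIPELINE, in_result_set, classify and stream_changes are made up for the example. One caveat shapes the design: a $match on fullDocument.balance would also drop the events where a document leaves the range or is deleted, so this sketch only filters by operation type on the server and does the range check in the client, tracking which ids are currently in the result set.

```python
from typing import Any, Dict, Iterator, Optional, Set

# Server-side filter: only the operation types that can affect the result set.
PIPELINE = [
    {"$match": {"operationType": {"$in": ["insert", "update", "replace", "delete"]}}}
]


def in_result_set(doc: Optional[Dict[str, Any]]) -> bool:
    """The question's query: balance < 0 and balance > -1000."""
    if doc is None:
        return False
    balance = doc.get("balance")
    return isinstance(balance, (int, float)) and -1000 < balance < 0


def classify(event: Dict[str, Any], members: Set[Any]) -> Optional[Dict[str, Any]]:
    """Turn one change event into add/modify/remove, updating `members`."""
    doc_id = event["documentKey"]["_id"]
    doc = event.get("fullDocument")  # absent for deletes
    was_in = doc_id in members
    now_in = event["operationType"] != "delete" and in_result_set(doc)
    if now_in:
        members.add(doc_id)
        return {"meta": "modify" if was_in else "add", **doc}
    if was_in:
        members.discard(doc_id)
        return {"meta": "remove", "id": doc_id}
    return None  # the document was outside the result set and still is


def stream_changes(collection: Any, members: Set[Any]) -> Iterator[Dict[str, Any]]:
    """`collection` is a pymongo Collection; `members` holds the initial ids."""
    # updateLookup asks the server to attach the current document to updates.
    with collection.watch(PIPELINE, full_document="updateLookup") as stream:
        for event in stream:
            change = classify(event, members)
            if change is not None:
                yield change
```

The caller would first run the ordinary query to get the initial result set, seed `members` with those ids, and then iterate `stream_changes(...)` until it wants to stop.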


If receiving all changes (not only changes to a query result set) is acceptable, then you can create a MongoDB replication slave and receive all changes from the master. I've seen a MongoDB replication slave written even in PHP, so it should not be too hard to implement.


MongoDB implements a tailable cursor, but for capped collections only. See the docs. It may be of use depending on your specific requirements.
