Selective replication with CouchDB
I'm currently evaluating possible solutions to the following problem:
A set of data entries must be synchronized between multiple clients, where each client may only view (or even know about the existence of) a subset of the data. Each client "owns" some of the elements, and the decision about who else can read or modify those elements may only be made by the owner. To complicate the situation further, each element (and each element revision) must have a unique identifier that is the same for all clients.
While the latter sounds like a perfect task for CouchDB (and a document-based data model would fit my needs perfectly), I'm not sure whether the authentication/authorization subsystem of CouchDB can handle these requirements. It should be possible to restrict write access using validation functions, but there doesn't seem to be a way to authorize read access. All solutions I've found for this problem propose routing all CouchDB requests through a proxy (or an application layer) that handles authorization.
So, the question is: Is it possible to implement an authorization layer that filters requests to the database, so that access is granted only to documents the requesting client may read, and still use the replication mechanism of CouchDB? Simplified, this would be some kind of "selective replication" where only some of the documents, and not the whole database, are replicated.
I would also be thankful for pointers to detailed information about how replication works. The CouchDB wiki and even the "Definitive Guide" book are not too specific about that.
This begs for replication filters. You filter outbound replication based on whatever criteria you impose, and give the owner of the target unrestricted access to their own copy.
I haven't had the opportunity to play with replication filters directly, but the idea would be that each doc carries some information about who has access to it, and the filtering mechanism then allows outbound replication of only those documents that the target's owner has access to. Replication from the target back to the master would be unrestricted, allowing the master to remain a rollup copy and potentially multicast changes to overlapping sets of data.
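To make the idea above a bit more concrete, here is a minimal sketch in Python that only builds the JSON documents involved and does not contact a server. It assumes each document carries a `readers` array naming the users allowed to see it; that field, the design document name `access`, the filter name `by_reader` and the database URLs are all made up for illustration. The filter body itself is JavaScript, because that is what CouchDB's default query server executes.

```python
import json

# JavaScript filter function as CouchDB would store it in a design document.
# ASSUMPTION: documents carry a hypothetical "readers" array of user names.
FILTER_JS = """function (doc, req) {
  return !!doc.readers && doc.readers.indexOf(req.query.user) !== -1;
}"""


def build_design_doc():
    """Design document holding the filter (names are hypothetical)."""
    return {"_id": "_design/access", "filters": {"by_reader": FILTER_JS}}


def build_replication_request(source, target, user):
    """Body for POST /_replicate that applies the filter to the source."""
    return {
        "source": source,
        "target": target,
        "filter": "access/by_reader",      # "<design doc>/<filter name>"
        "query_params": {"user": user},    # visible to the filter as req.query
    }


def may_replicate(doc, user):
    """Python mirror of the JavaScript filter, handy for local testing."""
    return user in doc.get("readers", [])


if __name__ == "__main__":
    body = build_replication_request(
        "http://localhost:5984/master", "http://localhost:5984/alice", "alice"
    )
    print(json.dumps(body, indent=2))
```

Two things to keep in mind with a sketch like this: the filter only decides what gets copied and is not an access check on the source database, so the master itself still has to be shielded from direct reads; and deletion tombstones normally no longer carry a `readers` field, so as written they would not pass the filter.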
What you are after is replication filters. According to Chris Anderson, it is a 0.11 feature.
"The current status is that there is an API for filtering the _changes feed. The replicator in 0.10 consumes the changes feed, so the next step is getting the replicator to use the filter API.
There is work in progress on this, so it should be fully ready to go in 0.11."
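The filter API for the `_changes` feed mentioned in the quote is driven by query parameters. As a rough illustration, the helper below builds such a URL; the host, database name, filter name `access/by_reader` and the `user` parameter are assumptions made for the example, not anything taken from the quoted post.

```python
from urllib.parse import urlencode


def filtered_changes_url(base_url, db, filter_name, **query):
    """Build a _changes URL that asks CouchDB to apply a named filter.

    filter_name has the form "<design doc>/<filter name>"; any extra
    keyword arguments are passed along and reach the filter as req.query.
    """
    params = {"filter": filter_name}
    params.update(query)
    return f"{base_url.rstrip('/')}/{db}/_changes?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical server, database and filter names.
    print(filtered_changes_url(
        "http://localhost:5984", "master", "access/by_reader", user="alice"
    ))
```

Fetching that URL with any HTTP client should return only the changes for which the filter function returned true.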
See the original post.
Here is a new link to some documentation about this:
http://blog.couchbase.com/what%E2%80%99s-new-apache-couchdb-011-%E2%80%94-part-three-new-features-replication
Indeed, as others have said, replication filters are the way to go for this. Here is a link with some information on using them.
One caveat I would add is that at scale, replication filters can be extremely slow. More information about this and other nuances of CouchDB can be found in this excellent blog post: "What every developer should know about CouchDB". For large-scale systems, performing replication in the application layer has proven faster and more reliable.
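What "replication in the application layer" looks like depends heavily on the system, but the core step might be sketched as follows: take rows from the source's `_changes` feed (fetched with `include_docs=true`), keep the documents a given user may read, and build a `_bulk_docs` payload for that user's database. The `readers` field is a hypothetical access list, and a real implementation would also need checkpoints, revision histories and conflict handling, all of which this sketch leaves out.

```python
def readable_docs(change_rows, user):
    """Select documents from _changes rows that list `user` as a reader.

    ASSUMPTION: rows come from _changes?include_docs=true and documents
    carry a hypothetical "readers" array.
    """
    docs = []
    for row in change_rows:
        doc = row.get("doc")
        if doc is not None and user in doc.get("readers", []):
            docs.append(doc)
    return docs


def bulk_docs_payload(docs):
    """Body for POST /<target db>/_bulk_docs.

    new_edits=False asks CouchDB to store the documents with the _rev values
    they already have instead of generating new revisions.
    """
    return {"docs": docs, "new_edits": False}


if __name__ == "__main__":
    rows = [
        {"id": "a", "doc": {"_id": "a", "_rev": "1-x", "readers": ["alice"]}},
        {"id": "b", "doc": {"_id": "b", "_rev": "1-y", "readers": ["bob"]}},
    ]
    print(bulk_docs_payload(readable_docs(rows, "alice")))
```

Because the selection runs in your own code, the access rule can use indexes or caches of your choosing rather than invoking a filter function for every change.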