Full Join/Intersection in CouchDB
I have some documents which have 2 sets of attributes: tag and lieu. Here is an example of what they look like:
{
title: "doc1",
tag: ["mountain", "sunny", "forest"],
lieu: ["france", "luxembourg"]
},
{
title: "doc2",
tag: ["sunny", "lake"],
lieu: ["france", "germany"]
},
{
title: "doc3",
tag: ["sunny"],
lieu: ["belgium", "luxembourg", "france"]
}
How can I map/reduce and query my DB to be able to retrieve only the intersection of documents that match these criteria:
- lieu: ["france", "luxembourg"]
- tag: ["sunny"]
Returns: doc1 and doc3
I cannot figure out any format that map/reduce could return so that a single query would be enough. What I am doing now is: I emit every lieu/tag as a key with the related document's id as the value, then reduce so that every key has an array of document ids. From my app I query this view, do the intersection on the app side (keeping only the docs that appear under all 3 keys: luxembourg, france and sunny), and then query CouchDB again with these ids to retrieve the actual docs. I feel that's not the right/best way to do it?
I am using lists to do the intersection job, and it works quite well. But I still need another request to get the documents by their ids. Any idea what I could do differently to retrieve the documents directly?
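For reference, the app-side intersection described above might look roughly like the sketch below. It assumes the view has already been queried once per key and that each result has been reduced to a plain array of document ids; `intersectIds` is a hypothetical name, not something CouchDB provides.

```javascript
// Hypothetical app-side helper: keeps only the ids present in every list.
// `idLists` is assumed to hold one array of document ids per queried key.
function intersectIds(idLists) {
  if (idLists.length === 0) return [];
  return idLists.reduce(function (common, ids) {
    var seen = new Set(ids);
    return common.filter(function (id) {
      return seen.has(id);
    });
  });
}

// Ids emitted under "france", "luxembourg" and "sunny" for the three sample docs.
var matching = intersectIds([
  ["doc1", "doc2", "doc3"], // france
  ["doc1", "doc3"],         // luxembourg
  ["doc1", "doc2", "doc3"]  // sunny
]);
```

With the sample documents, `matching` holds the ids of doc1 and doc3, which then have to be fetched in the second request.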
Thank you!
This is going to be awkward. The basic idea is to build a view whose map function emits every possible combination of tags and countries as the key, with no reduce function. This way, looking for ["france","luxembourg"] would return all documents that emitted that key (and therefore are in the intersection), because a view without a reduce function returns one row per emit, each identifying the emitting document (add include_docs=true to the query to get the document bodies in the same response). This way, you only have to do one request.
This causes a lot of emits to happen, but you can lower that number by sorting the tags both when emitting and when searching (automatically turn ["luxembourg","france"] into ["france","luxembourg"]), and by taking advantage of the ability of CouchDB to query key prefixes (this means that emitting ["belgium","france","luxembourg"] will let you match searches for ["belgium"] and ["belgium","france"]).
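A prefix search on array keys is usually expressed as a startkey/endkey range. The sketch below shows one way the query string could be built, assuming the view is keyed by sorted arrays of strings as described above; `prefixQuery` is a hypothetical helper name, and the view URL it would be appended to depends on your database and design document.

```javascript
// Hypothetical helper: builds the query string for a prefix search on a
// view whose keys are sorted arrays of strings.
function prefixQuery(terms) {
  // Sort the search terms the same way the map function sorts its keys.
  var sorted = terms.slice().sort();
  var params = {
    startkey: JSON.stringify(sorted),
    // In CouchDB's collation an object sorts after any string, so appending {}
    // gives an upper bound that covers every key extending `sorted`.
    endkey: JSON.stringify(sorted.concat([{}])),
    include_docs: "true"
  };
  return Object.keys(params).map(function (name) {
    return name + "=" + encodeURIComponent(params[name]);
  }).join("&");
}

var query = prefixQuery(["luxembourg", "france"]);
```

Note that a short prefix can match several keys emitted by the same document, so the client may need to deduplicate rows by document id.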
In your example above, for the countries, you would only emit:
// doc 1
emit(["luxembourg"],null);
emit(["france","luxembourg"],null);
// doc 2
emit(["germany"],null);
emit(["france","germany"],null);
// doc 3
emit(["luxembourg"],null);
emit(["belgium","luxembourg"],null);
emit(["france","luxembourg"],null);
emit(["belgium","france","luxembourg"],null);
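The emits listed above follow a pattern: after sorting, only the combinations that contain the last (largest) value need to be emitted, because every other combination is a prefix of one of them. A minimal sketch of a map function built on that observation follows. It assumes `lieu` is an array of distinct strings; `prefixMinimalKeys` and `mapLieu` are hypothetical names, and in a real design document the helper would have to be defined inside the map function itself.

```javascript
// Hypothetical helper: returns every sorted combination of `values` that
// contains the largest value. All other sorted combinations are prefixes
// of one of these keys, so a prefix query can still find them.
function prefixMinimalKeys(values) {
  var sorted = values.slice().sort();
  if (sorted.length === 0) return [];
  var last = sorted[sorted.length - 1];
  var rest = sorted.slice(0, -1);
  var keys = [];
  // Each bitmask selects one subset of `rest`; `last` is always appended.
  for (var mask = 0; mask < (1 << rest.length); mask++) {
    var key = [];
    for (var i = 0; i < rest.length; i++) {
      if (mask & (1 << i)) key.push(rest[i]);
    }
    key.push(last);
    keys.push(key);
  }
  return keys;
}

// Sketch of the map function; `emit` is supplied by CouchDB's view server.
function mapLieu(doc) {
  if (!Array.isArray(doc.lieu)) return;
  prefixMinimalKeys(doc.lieu).forEach(function (key) {
    emit(key, null);
  });
}
```

For a document with n countries this produces 2^(n-1) keys, so the approach stays practical only while the arrays are short.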
Anyway, for complex queries like this one, consider looking into a CouchDB-Lucene combination.