Couchdb views and many (thousands) document types
I'm studing CouchDB and I'm picturing a worst case scenario:
for each document t开发者_Python百科ype I need 3 view and this application can generate 10 thousands of document types.
With "document type" I mean the structure of the document.
After insertion of a new document, couchdb make 3*10K calls to view functions searching for right document type.
Is this true? Is there a smart solution than make a database for each doc type?
Document example (assume that none documents have the same structure, in this example data is under different keys):
[
{
"_id":"1251888780.0",
"_rev":"1-582726400f3c9437259adef7888cbac0"
"type":'sensorX',
"value":{"ValueA":"123"}
},
{
"_id":"1251888780.0",
"_rev":"1-37259adef7888cbac06400f3c9458272"
"type":'sensorY',
"value":{"valueB":"456"}
},
{
"_id":"1251888780.0",
"_rev":"1-6400f3c945827237259adef7888cbac0"
"type":'sensorZ',
"value":{"valueC":"789"}
},
]
Views example (in this example only one per doc type)
"views":
{
"sensorX": {
"map": "function(doc) { if (doc.type == 'sensorX') emit(null, doc.valueA) }"
},
"sensorY": {
"map": "function(doc) { if (doc.type == 'sensorY') emit(null, doc.valueB) }"
},
"sensorZ": {
"map": "function(doc) { if (doc.type == 'sensorZ') emit(null, doc.valueC) }"
},
}
The results of the map()
function in CouchDB is cached the first time you request the view for each new document. Let me explain with a quick illustration.
You insert 100 documents to CouchDB
You request the view. Now the 100 documents have the
map()
function run against them and the results cached.You request the view again. The data is read from the indexed view data, no documents have to be re-mapped.
You insert 50 more documents
You request the view. The 50 new documents are mapped and merged into the index with the old 100 documents.
- You request the view again. The data is read from the indexed view data, no documents have to be re-mapped.
I hope that makes sense. If you're concerned about a big load being generated when a user requests a view and lots of new documents have been added you could look at having your import process call the view (to re-map the new documents) and have the user request for the view include stale=ok
.
The CouchDB book is a really good resource for information on CouchDB.
James has a great answer.
It looks like you are asking the question "what are the values of documents of type X?"
I think you can do that with one view:
function(doc) {
// _view/sensor_value
var val_names = { "sensorX": "valueA"
, "sensorY": "valueB"
, "sensorZ": "valueC"
};
var value_name = val_names[doc.type];
if(value_name) {
// e.g. "sensorX" -> "123"
// or "sensorZ" -> "789"
emit(doc.type, doc.value[value_name]);
}
}
Now, to get all values for sensorY
, you query /db/_design/app/_view/sensor_value
with a parameter ?key="sensorX"
. CouchDB will show all values for sensorX, which come from the document's value.valueA
field. (For sensorY, it comes from value.valueB
, etc.)
Future-proofing
If you might have new document types in the future, something more general might be better:
function(doc) {
if(doc.type && doc.value) {
emit(doc.type, doc.value);
}
}
That is very simple, and any document will work if it has a type
and value
field. Next, to get the valueA
, valueB
, etc. from the view, just do that on the client side.
If using the client is impossible, use a _list
function.
function(head, req) {
// _list/sensor_val
//
start({'headers':{'Content-Type':'application/json'}});
// Updating this will *not* cause the map/reduce view to re-build.
var val_names = { "sensorX": "valueA"
, "sensorY": "valueB"
, "sensorZ": "valueC"
};
var row;
var doc_type, val_name, doc_val;
while(row = getRow()) {
doc_type = row.key;
val_name = val_names[doc_type];
doc_val = row.value[val_name];
send("Doc " + row.id + " is type " + doc_type + " and value " + doc_val);
}
}
Obviously use send()
to send whichever format you prefer for the client (such as JSON).
精彩评论