
CouchDB views and many (thousands of) document types

I'm studying CouchDB and I'm picturing a worst-case scenario:

For each document type I need 3 views, and this application can generate 10 thousand document types.

By "document type" I mean the structure of the document.

After a new document is inserted, CouchDB makes 3*10K calls to view functions, searching for the right document type.

Is this true? Is there a smarter solution than making a database for each doc type?

Document example (assume that no two documents have the same structure; in this example the data is under different keys):

[
     {
       "_id":"1251888780.0",
       "_rev":"1-582726400f3c9437259adef7888cbac0",
       "type":"sensorX",
       "value":{"valueA":"123"}
     },
     {
       "_id":"1251888780.0",
       "_rev":"1-37259adef7888cbac06400f3c9458272",
       "type":"sensorY",
       "value":{"valueB":"456"}
     },
     {
       "_id":"1251888780.0",
       "_rev":"1-6400f3c945827237259adef7888cbac0",
       "type":"sensorZ",
       "value":{"valueC":"789"}
     }
   ]

Views example (in this example only one per doc type):

  "views":
  {
    "sensorX": {
      "map": "function(doc) { if (doc.type == 'sensorX') emit(null, doc.value.valueA); }"
    },
    "sensorY": {
      "map": "function(doc) { if (doc.type == 'sensorY') emit(null, doc.value.valueB); }"
    },
    "sensorZ": {
      "map": "function(doc) { if (doc.type == 'sensorZ') emit(null, doc.value.valueC); }"
    }
  }


In CouchDB, the results of the map() function are cached: each new document is mapped once, the first time you request the view after it was added. Let me explain with a quick illustration.

  • You insert 100 documents into CouchDB

  • You request the view. The map() function is run against the 100 documents and the results are cached.

  • You request the view again. The data is read from the indexed view data, no documents have to be re-mapped.

  • You insert 50 more documents

  • You request the view. The 50 new documents are mapped and merged into the index with the old 100 documents.

  • You request the view again. The data is read from the indexed view data, no documents have to be re-mapped.

I hope that makes sense. If you're concerned about the load generated when a user requests a view after lots of new documents have been added, you could have your import process call the view (to map the new documents) and have the user's request for the view include stale=ok.
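The sequence above can be imitated with a small toy model. This is only a sketch of the idea, not how CouchDB implements its indexes: ToyView, query and mapCalls are made-up names, the map function receives emit as a second argument (in CouchDB, emit is a global), and the stale argument only mimics the effect of stale=ok.

```javascript
// Toy model of incremental view indexing (not CouchDB's real implementation).
// Hypothetical names: ToyView, query, mapCalls.
function ToyView(mapFn) {
    this.mapFn = mapFn;     // map function; receives (doc, emit) in this sketch
    this.rows = [];         // cached map results, i.e. the "index"
    this.indexedUpTo = 0;   // number of documents already mapped
    this.mapCalls = 0;      // how many times mapFn has been run
}

ToyView.prototype.query = function (docs, stale) {
    var self = this;
    if (stale !== "ok") {
        // Map only the documents added since the last index update.
        while (this.indexedUpTo < docs.length) {
            var doc = docs[this.indexedUpTo++];
            this.mapCalls++;
            this.mapFn(doc, function (key, value) {
                self.rows.push({ id: doc._id, key: key, value: value });
            });
        }
    }
    // With stale === "ok" the cached rows are returned without re-indexing.
    return this.rows;
};

var docs = [];
function insertDocs(n) {
    for (var i = 0; i < n; i++) {
        docs.push({ _id: "doc" + docs.length, type: "sensorX", value: { valueA: "123" } });
    }
}

var view = new ToyView(function (doc, emit) {
    if (doc.type && doc.value) emit(doc.type, doc.value);
});

insertDocs(100);
view.query(docs);        // maps the 100 new documents
view.query(docs);        // served from the index, nothing is mapped
insertDocs(50);
view.query(docs, "ok");  // stale=ok: returns the 100 cached rows, maps nothing
view.query(docs);        // maps only the 50 new documents
console.log(view.mapCalls);  // 150
```

The map function runs once per document, no matter how often the view is requested.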

The CouchDB book is a really good resource for information on CouchDB.


James has a great answer.

It looks like you are asking the question "what are the values of documents of type X?"

I think you can do that with one view:

function(doc) {
    // _view/sensor_value
    var val_names = { "sensorX": "valueA"
                    , "sensorY": "valueB"
                    , "sensorZ": "valueC"
                    };

    var value_name = val_names[doc.type];
    if(value_name) {
        // e.g. "sensorX" -> "123"
        // or "sensorZ" -> "789"
        emit(doc.type, doc.value[value_name]);
    }
}

Now, to get all values for sensorX, query /db/_design/app/_view/sensor_value with the parameter ?key="sensorX". CouchDB will return all values for sensorX, which come from the document's value.valueA field. (For sensorY, they come from value.valueB, etc.)
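One detail that is easy to miss is that the key parameter must be a JSON value (hence the quotes) and must then be URL-encoded. A small sketch of building that URL follows. The helper name sensorValueUrl and the base URL http://localhost:5984 are assumptions for illustration; the database and design document names are the ones used above.

```javascript
// Build the view URL for one sensor type.
// sensorValueUrl is a hypothetical helper, not part of any CouchDB client.
function sensorValueUrl(baseUrl, sensorType) {
    // JSON.stringify adds the quotes CouchDB expects around a string key;
    // encodeURIComponent makes the result safe to put in a query string.
    var key = encodeURIComponent(JSON.stringify(sensorType));
    return baseUrl + "/db/_design/app/_view/sensor_value?key=" + key;
}

// Assumed base URL; adjust to wherever your CouchDB is listening.
console.log(sensorValueUrl("http://localhost:5984", "sensorX"));
// http://localhost:5984/db/_design/app/_view/sensor_value?key=%22sensorX%22
```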

Future-proofing

If you might have new document types in the future, something more general might be better:

function(doc) {
     if(doc.type && doc.value) {
         emit(doc.type, doc.value);
     }
 }

That is very simple, and any document will work if it has a type field and a value field. Next, to get valueA, valueB, etc. from the view, just do that on the client side.

If doing this on the client is impossible, use a _list function.
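As a rough sketch of the client-side step: a view response is a JSON object whose rows array holds {id, key, value} entries, so the client can pull the reading out of each row. The function name extractValues and the document ids in the sample are made up, and the code assumes each value object holds exactly one field, as in the question's examples.

```javascript
// Pull the single reading out of each row of a view response.
// extractValues is a hypothetical helper; it assumes one field per value object.
function extractValues(viewResponse) {
    return viewResponse.rows.map(function (row) {
        var name = Object.keys(row.value)[0];   // e.g. "valueA"
        return { id: row.id, type: row.key, name: name, value: row.value[name] };
    });
}

// Sample response in the shape CouchDB returns for the general view above
// (the ids are invented for this example).
var sampleResponse = {
    total_rows: 2,
    offset: 0,
    rows: [
        { id: "doc-x", key: "sensorX", value: { valueA: "123" } },
        { id: "doc-y", key: "sensorY", value: { valueB: "456" } }
    ]
};

console.log(extractValues(sampleResponse));
```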

function(head, req) {
    // _list/sensor_val
    //
    // The body sent below is plain text, so declare it as such.
    start({'headers':{'Content-Type':'text/plain'}});

    // Updating this will *not* cause the map/reduce view to re-build.
    var val_names = { "sensorX": "valueA"
                    , "sensorY": "valueB"
                    , "sensorZ": "valueC"
                    };


    var row;
    var doc_type, val_name, doc_val;
    while(row = getRow()) {
        doc_type = row.key;
        val_name = val_names[doc_type];
        doc_val = row.value[val_name];
        // One line per row; the trailing newline keeps the rows separate.
        send("Doc " + row.id + " is type " + doc_type + " and value " + doc_val + "\n");
    }
}

Obviously use send() to send whichever format you prefer for the client (such as JSON).
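For instance, a JSON variant might look like the sketch below. Inside CouchDB, start(), getRow() and send() are supplied by the query server; runList is a made-up local harness that stubs them so the function can be tried outside CouchDB, and sensorValJson is likewise just an illustrative name.

```javascript
// JSON variant of the _list function: sends one JSON array with an object per row.
function sensorValJson(head, req) {
    start({ headers: { "Content-Type": "application/json" } });

    var valNames = { sensorX: "valueA", sensorY: "valueB", sensorZ: "valueC" };
    var row, first = true;

    send("[");
    while ((row = getRow())) {
        var name = valNames[row.key];
        // Unknown types get a null value instead of breaking the output.
        var item = { id: row.id, type: row.key, value: name ? row.value[name] : null };
        send((first ? "" : ",") + JSON.stringify(item));
        first = false;
    }
    send("]");
}

// Hypothetical test harness: stubs the functions CouchDB normally provides
// and returns everything the list function sent, as one string.
function runList(listFn, rows) {
    var out = [];
    var i = 0;
    globalThis.start = function () {};
    globalThis.getRow = function () { return i < rows.length ? rows[i++] : null; };
    globalThis.send = function (chunk) { out.push(chunk); };
    listFn({}, {});
    return out.join("");
}

console.log(runList(sensorValJson, [
    { id: "doc-x", key: "sensorX", value: { valueA: "123" } },
    { id: "doc-z", key: "sensorZ", value: { valueC: "789" } }
]));
```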
