CouchDB views and many (thousands of) document types

2023-03-05 06:54 Question author:

I'm studying CouchDB and I'm picturing a worst-case scenario:

For each document type I need 3 views, and this application can generate 10 thousand document types.

By "document type" I mean the structure of the document.

After a new document is inserted, CouchDB makes 3*10K calls to view functions, searching for the right document type.

Is this true? Is there a smarter solution than making a database for each doc type?

Document example (assume that no two documents have the same structure; in this example the data is under different keys):

[
     {
       "_id":"1251888780.0",
       "_rev":"1-582726400f3c9437259adef7888cbac0",
       "type":"sensorX",
       "value":{"valueA":"123"}
     },
     {
       "_id":"1251888780.0",
       "_rev":"1-37259adef7888cbac06400f3c9458272",
       "type":"sensorY",
       "value":{"valueB":"456"}
     },
     {
       "_id":"1251888780.0",
       "_rev":"1-6400f3c945827237259adef7888cbac0",
       "type":"sensorZ",
       "value":{"valueC":"789"}
     }
   ]

Views example (in this example only one per doc type):

  "views":
  {
    "sensorX": {
      "map": "function(doc) { if (doc.type == 'sensorX')  emit(null, doc.value.valueA) }"
    },
    "sensorY": {
      "map": "function(doc) { if (doc.type == 'sensorY')  emit(null, doc.value.valueB) }"
    },
    "sensorZ": {
      "map": "function(doc) { if (doc.type == 'sensorZ')  emit(null, doc.value.valueC) }"
    }
  }

The results of the map() function in CouchDB are cached the first time you request the view after new documents have been added. Let me explain with a quick illustration.

1. You insert 100 documents into CouchDB.
2. You request the view. Now the 100 documents have the map() function run against them and the results cached.
3. You request the view again. The data is read from the indexed view data, no documents have to be re-mapped.
4. You insert 50 more documents.
5. You request the view. The 50 new documents are mapped and merged into the index with the old 100 documents.
6. You request the view again. The data is read from the indexed view data, no documents have to be re-mapped.
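
The sequence above can be modelled with a toy index in plain JavaScript. This is only a sketch: createView and its bookkeeping are hypothetical names, not CouchDB internals, and the model ignores updated and deleted documents, which real CouchDB also has to handle.

```javascript
// Toy model of incremental view indexing; not CouchDB's actual implementation.
function createView(mapFn) {
  const index = [];     // cached rows: { id, key, value }
  let indexedUpTo = 0;  // number of documents already mapped
  let mapCalls = 0;     // how many times the map function has run

  return {
    // "Request the view": map only documents added since the last request.
    query(docs) {
      for (let i = indexedUpTo; i < docs.length; i++) {
        const doc = docs[i];
        mapCalls++;
        mapFn(doc, (key, value) => index.push({ id: doc._id, key, value }));
      }
      indexedUpTo = docs.length;
      return index.slice();
    },
    get mapCalls() { return mapCalls; }
  };
}

// Hypothetical sample data: every document is a sensorX reading.
function makeDoc(i) {
  return { _id: "doc" + i, type: "sensorX", value: { valueA: i } };
}

const docs = [];
for (let i = 0; i < 100; i++) docs.push(makeDoc(i));

const view = createView((doc, emit) => {
  if (doc.type === "sensorX") emit(doc.type, doc.value.valueA);
});

view.query(docs);                           // first request maps 100 documents
view.query(docs);                           // second request maps nothing new
const callsAfterFirstBatch = view.mapCalls;

for (let i = 100; i < 150; i++) docs.push(makeDoc(i));
const rows = view.query(docs);              // only the 50 new documents are mapped

console.log(callsAfterFirstBatch, view.mapCalls, rows.length);
```

In this model the map function runs once per document per view, no matter how often the view is requested.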

I hope that makes sense. If you're concerned about a big load being generated when a user requests a view after lots of new documents have been added, you could have your import process call the view (to map the new documents) and have the user's request for the view include stale=ok.
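
As a rough sketch of that idea, the two kinds of request differ only in their query string. The viewUrl helper, the host http://localhost:5984, and the db/app/sensor_value names are assumptions for illustration. Using limit=0 for the warm-up request is also an assumption: it is meant to trigger indexing without transferring rows.

```javascript
// Hypothetical helper that builds a CouchDB view URL from its parts.
function viewUrl(base, db, ddoc, view, params) {
  const query = Object.entries(params || {})
    .map(([k, v]) => encodeURIComponent(k) + "=" + encodeURIComponent(v))
    .join("&");
  const path = base + "/" + db + "/_design/" + ddoc + "/_view/" + view;
  return query ? path + "?" + query : path;
}

const base = "http://localhost:5984"; // assumed local CouchDB address

// Issued by the import process after a bulk insert, to bring the index up to date.
const warmUpUrl = viewUrl(base, "db", "app", "sensor_value", { limit: 0 });

// Issued on behalf of the user: accept the index as it is, without waiting for a rebuild.
const userUrl = viewUrl(base, "db", "app", "sensor_value", { stale: "ok" });

console.log(warmUpUrl);
console.log(userUrl);
```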

The CouchDB book is a really good resource for information on CouchDB.

James has a great answer.

It looks like you are asking the question "what are the values of documents of type X?"

I think you can do that with one view:

function(doc) {
    // _view/sensor_value
    var val_names = { "sensorX": "valueA"
                    , "sensorY": "valueB"
                    , "sensorZ": "valueC"
                    };

    var value_name = val_names[doc.type];
    if(value_name) {
        // e.g. "sensorX" -> "123"
        // or "sensorZ" -> "789"
        emit(doc.type, doc.value[value_name]);
    }
}

Now, to get all values for sensorX, you query /db/_design/app/_view/sensor_value with a parameter ?key="sensorX". CouchDB will show all values for sensorX, which come from the document's value.valueA field. (For sensorY, they come from value.valueB, etc.)
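
To see what such a query would return, the map function can be run outside CouchDB against a few sample documents. This is a sketch only: queryByKey is a hypothetical stand-in for the ?key=... lookup, emit is passed in as a parameter because CouchDB's global is not available here, and the _id values are made up.

```javascript
// Same logic as the single-view map function, with emit passed in explicitly.
function mapSensorValue(doc, emit) {
  const valNames = { sensorX: "valueA", sensorY: "valueB", sensorZ: "valueC" };
  const valueName = valNames[doc.type];
  if (valueName) {
    emit(doc.type, doc.value[valueName]);
  }
}

// Hypothetical stand-in for querying the view with ?key="...".
function queryByKey(docs, key) {
  const rows = [];
  for (const doc of docs) {
    mapSensorValue(doc, (k, v) => rows.push({ id: doc._id, key: k, value: v }));
  }
  return rows.filter((row) => row.key === key);
}

// Made-up sample documents; the last one has a type the view does not know.
const sampleDocs = [
  { _id: "a", type: "sensorX", value: { valueA: "123" } },
  { _id: "b", type: "sensorY", value: { valueB: "456" } },
  { _id: "c", type: "sensorZ", value: { valueC: "789" } },
  { _id: "d", type: "other", value: { something: "0" } }
];

const sensorXRows = queryByKey(sampleDocs, "sensorX");
console.log(sensorXRows);
```

Documents whose type is missing from the lookup table emit nothing, so they never appear in the view.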

Future-proofing

If you might have new document types in the future, something more general might be better:

function(doc) {
     if(doc.type && doc.value) {
         emit(doc.type, doc.value);
     }
 }

That is very simple, and any document will work if it has a type and a value field. Next, to get valueA, valueB, etc. from the view, just extract them on the client side.
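
A minimal sketch of that client-side step, assuming the view rows have already been fetched and parsed. The rows array, the valNames table, and the extractValues helper are illustrative names, not part of any CouchDB client library.

```javascript
// Rows shaped like the output of the general view: key is doc.type, value is doc.value.
const viewRows = [
  { id: "a", key: "sensorX", value: { valueA: "123" } },
  { id: "b", key: "sensorY", value: { valueB: "456" } },
  { id: "c", key: "sensorW", value: { valueW: "000" } } // type the client does not know yet
];

// The lookup table lives on the client, so changing it never touches the view index.
const valNames = { sensorX: "valueA", sensorY: "valueB", sensorZ: "valueC" };

// Pick the field of interest out of each row; unknown types yield undefined.
function extractValues(rows, names) {
  return rows.map((row) => {
    const name = names[row.key];
    return {
      id: row.id,
      type: row.key,
      value: name ? row.value[name] : undefined
    };
  });
}

const extracted = extractValues(viewRows, valNames);
console.log(extracted);
```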

If using the client is impossible, use a _list function.

function(head, req) {
    // _list/sensor_val
    //
    // The rows below are sent as plain text, one line per row.
    start({'headers':{'Content-Type':'text/plain'}});

    // Updating this will *not* cause the map/reduce view to re-build.
    var val_names = { "sensorX": "valueA"
                    , "sensorY": "valueB"
                    , "sensorZ": "valueC"
                    };

    var row;
    var doc_type, val_name, doc_val;
    while(row = getRow()) {
        doc_type = row.key;
        val_name = val_names[doc_type];
        doc_val = row.value[val_name];
        send("Doc " + row.id + " is type " + doc_type + " and value " + doc_val + "\n");
    }
}

Obviously use send() to send whichever format you prefer for the client (such as JSON).
