How can Elasticsearch find words in an attachment file of a CouchDB document?

Hi, please give me some guidance. I am using Elasticsearch 0.17.6 and CouchDB 1.1.0.

I have created two documents in CouchDB. Each document has two string fields: name and message. The first one has a text file "test.txt" attached and the second one does not. The JSON generated by CouchDB looks like this:

{
  "_id": "ID1",
  "_rev": "6-e1ab4c5c65b98e9a0d91e5c8fc1629bb",
  "name": "Document1",
  "message": "Evaluate Elastic Search",
  "_attachments":   {
     "test.txt": {
       "content_type": "text/plain",
       "revpos": 5,
       "digest": "md5-REzvAVEZoSV69SLI/vaflQ==",
       "length": 86,
       "stub": true
     }
  }
}

{

 "_id": "ID2",
 "_rev": "2-72142ec18248cedb4dba67305d136aa8",
 "name": "Document2",
 "message": "test Elastic Search"
}

These two documents are in a database called my_test_couch_db

I have used Elasticsearch (ES) to index these documents with the river and mapper-attachments plugins. For any given text, I expect ES to find matches not only in the documents' fields but also in the attached *.txt file. However, this does not work. I have tried many approaches: creating the index manually, mapping (automatically and manually), configuring the river, etc. ES only finds words in the documents' fields; it cannot find the ones in the *.txt attachment files. I followed the instructions on http://www.elasticsearch.org, but that does not work either.

Thanks for your answers.

Here are my commands:

curl -X PUT "localhost:9200/test_idx_1"

curl -X PUT "localhost:9200/test_idx_1/test_mapping_1/_mapping" -d '{
  "test_mapping_1": {
    "properties": {
      "_attachments": {
        "type": "attachment",
        "index": "yes"
      }
    }
  }
}'

curl -XPUT 'http://localhost:9200/_river/test_river_1/_meta' -d '{
  "type": "couchdb",
  "couchdb": {
    "host": "localhost",
    "port": 5984,
    "db": "my_test_couch_db",
    "filter": null
  },
  "index": {
    "index": "test_idx_1",
    "type": "test_mapping_1"
  }
}'

Then I try to search:

curl -XPOST 'http://localhost:9200/my_test_couch_db/my_test_couch_db/_search'

(both documents are found correctly)

curl -XPOST 'http://localhost:9200/my_test_couch_db/my_test_couch_db/_search' -d '{
  "query": {
    "text": {
      "_all": "test"
    }
  }
}'

Here is the output

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.081366636,
    "hits": [
      {
        "_index": "my_test_couch_db",
        "_type": "my_test_couch_db",
        "_id": "ID2",
        "_score": 0.081366636,
        "_source": {
          "message": "test Elastic Search",
          "_rev": "2-72142ec18248cedb4dba67305d136aa8",
          "_id": "ID2",
          "name": "Document2"
        }
      }
    ]
  }
}

As you can see, ES only finds the word "test" in the message field; it does not find this word in the *.txt attachment file.

I tried other queries:

curl -XPOST 'http://localhost:9200/my_test_couch_db/my_test_couch_db/_search' -d '{
  "query": {
    "text": {
      "_attachments": "test"
    }
  }
}'

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

curl -XPOST 'http://localhost:9200/my_test_couch_db/my_test_couch_db/_search' -d '{
  "query": {
    "text": {
      "_attachments.fields.file": "test"
    }
  }
}'

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

These queries return nothing. I tried other mappings, but they do not work either.

Why is that, and how can I solve this problem?


Attachments are not yet loaded by the CouchDB river. I have updated it, but I am still waiting for users to confirm that it works fine.

See https://github.com/dadoonet/elasticsearch-river-couchdb/tree/attachments

You can try it here: https://github.com/downloads/dadoonet/elasticsearch-river-couchdb/elasticsearch-river-couchdb-1.2.0-SNAPSHOT.zip

If it works fine for you, I can create the pull request.
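To illustrate why the stub is not searchable: the CouchDB JSON in the question only carries attachment metadata ("stub": true), not the file content, while the attachment mapping type expects the file content as a base64-encoded string. The following is only a rough sketch of indexing such a document by hand, not the river's actual behavior. The helper name build_attachment_doc, the field name "file", and the index/type names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON document whose "file" field holds the
# base64-encoded content of a file, which is the form an "attachment"
# field expects. No JSON escaping is done, so the name must be a plain string.
build_attachment_doc() {
  # $1 = document name, $2 = path to the file to embed
  encoded=$(base64 < "$2" | tr -d '\n')   # strip line wrapping added by base64
  printf '{"name": "%s", "file": "%s"}' "$1" "$encoded"
}

# Example usage (assumes an index "test_idx_2" whose type "doc" maps the
# "file" field with "type": "attachment"; both names are made up here):
#
#   build_attachment_doc "Document1" test.txt |
#     curl -XPUT 'http://localhost:9200/test_idx_2/doc/ID1' -d @-
```

If a document indexed this way is found by a search on the attachment field, the mapping itself is fine and the missing piece is the river not fetching the attachment bodies.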
