How can Elasticsearch find words in a CouchDB attachment file?
Hi, please give me some guidance. I am using Elasticsearch 0.17.6 and CouchDB 1.1.0.
I have created two documents in CouchDB. Each document has two string fields: name and message. The first one has a text file, "test.txt", attached; the second one has no attachment. The JSON generated by CouchDB looks like this:
{
"_id": "ID1",
"_rev": "6-e1ab4c5c65b98e9a0d91e5c8fc1629bb",
"name": "Document1",
"message": "Evaluate Elastic Search",
"_attachments": {
"test.txt": {
"content_type": "text/plain",
"revpos": 5,
"digest": "md5-REzvAVEZoSV69SLI/vaflQ==",
"length": 86,
"stub": true
}
}
}
{
"_id": "ID2",
"_rev": "2-72142ec18248cedb4dba67305d136aa8",
"name": "Document2",
"message": "test Elastic Search"
}
These two documents are in a database called my_test_couch_db.
I used Elasticsearch (ES) to index these documents with the river and mapper-attachments plugins. For any given text, I expect ES to find matches not only in the documents' fields but also in the attached *.txt file. This does not work. I have tried many approaches: creating the index manually, mapping (automatically and manually), configuring the river, etc., but ES only finds words in the documents' fields, never in the *.txt attachments. I also followed the instructions on http://www.elasticsearch.org, but that does not work either.
Thanks for your answers.
Here are my commands:
curl -X PUT "localhost:9200/test_idx_1"
curl -X PUT "localhost:9200/test_idx_1/test_mapping_1/_mapping" -d '{
"test_mapping_1": {
"properties": {
"_attachments": {
"type": "attachment",
"index": "yes"
}
}
}
}'
curl -XPUT 'http://localhost:9200/_river/test_river_1/_meta' -d '{
"type": "couchdb",
"couchdb": {
"host": "localhost",
"port": 5984,
"db": "my_test_couch_db",
"filter": null
},
"index": {
"index": "test_idx_1",
"type": "test_mapping_1"
}
}'
Then I try to search:
curl -XPOST 'http://localhost:9200/my_test_couch_db/my_test_couch_db/_search'
(both documents are found correctly)
curl -XPOST 'http://localhost:9200/my_test_couch_db/my_test_couch_db/_search' -d '{
"query": {
"text": {
"_all": "test"
}
}
}'
Here is the output
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.081366636,
"hits": [
{
"_index": "my_test_couch_db",
"_type": "my_test_couch_db",
"_id": "ID2",
"_score": 0.081366636,
"_source": {
"message": "test Elastic Search",
"_rev": "2-72142ec18248cedb4dba67305d136aa8",
"_id": "ID2",
"name": "Document2"
}
}
]
}
}
As you can see, ES only finds the word "test" in the message field; it does not find the word in the *.txt attachment file.
I tried other queries:
curl -XPOST 'http://localhost:9200/my_test_couch_db/my_test_couch_db/_search' -d '{
"query": {
"text": {
"_attachments": "test"
}
}
}'
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
curl -XPOST 'http://localhost:9200/my_test_couch_db/my_test_couch_db/_search' -d '{
"query": {
"text": {
"_attachments.fields.file": "test"
}
}
}'
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
Both queries return nothing. I tried other mappings, but they do not work either.
Why is that, and how can I solve this problem?
Attachments are not yet loaded by the CouchDB river. I have updated it, but I am still waiting for users to confirm that it works fine.
See https://github.com/dadoonet/elasticsearch-river-couchdb/tree/attachments. You can try it here: https://github.com/downloads/dadoonet/elasticsearch-river-couchdb/elasticsearch-river-couchdb-1.2.0-SNAPSHOT.zip
If it works fine for you, I can create the pull request.
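For context, here is a rough sketch of why the stub alone gives ES nothing to search. The "_attachments" entry in the question carries only metadata ("stub": true) and no file content, while an attachment-type field is normally fed the file content as a base64 string. The helper names and the target field name "file" are made up for this illustration; this is not the river's actual code.

```python
import base64

# Document 1 from the question: the attachment is a stub (metadata only).
couch_doc = {
    "_id": "ID1",
    "name": "Document1",
    "message": "Evaluate Elastic Search",
    "_attachments": {
        "test.txt": {"content_type": "text/plain", "length": 86, "stub": True}
    },
}


def has_only_stubs(doc):
    """True if the document has attachments and none of them carries data."""
    attachments = doc.get("_attachments", {})
    return bool(attachments) and all(
        meta.get("stub") and "data" not in meta for meta in attachments.values()
    )


def with_inline_attachment(doc, field_name, raw_bytes):
    """Copy the document without the stubs and add the file content as base64.

    field_name is a hypothetical field that would be mapped with type "attachment".
    """
    es_doc = {k: v for k, v in doc.items() if k != "_attachments"}
    es_doc[field_name] = base64.b64encode(raw_bytes).decode("ascii")
    return es_doc


# Placeholder bytes standing in for the real content of test.txt.
es_doc = with_inline_attachment(couch_doc, "file", b"test content")
```

With something along these lines, the indexed document would contain actual text for the attachment mapper to extract, which is roughly what the river has to do once it loads attachments.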