Get a text snippet from a search index generated by Solr and Nutch

I have just configured Nutch and Solr to crawl and index the text of a web site by following the getting started tutorials. Now I am trying to build a search page by modifying the example Velocity templates.

Now to my question: how can I tell Solr to return a relevant text snippet from the content of each hit? I only get the following fields for each hit:

score, boost, digest, id, segment, title, date, tstamp and url.

The content really is indexed, because I can search for words that I know appear only in the full text, but I still don't get the full text back with the hit.


Don't forget: indexed is not the same as stored.

You can search for words in a document if its fields are indexed, even when none of them is stored. To get the content of a specific field back in the results, that field must also have stored="true" in schema.xml.

If your full-text field is stored, then the default field list probably does not include it. You can add it with the fl parameter:
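As a rough way to check this, the sketch below parses a schema excerpt and lists the fields that are indexed but not stored, which are the ones you can search but never get back. The schema excerpt, its field names, and the helper name are hypothetical, and the helper simplifies things by treating a missing attribute as "false" (a real schema can inherit defaults from the field type).

```python
import xml.etree.ElementTree as ET

# Hypothetical schema.xml excerpt; the field names and types are illustrative only.
SAMPLE_SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <field name="content" type="text" indexed="true" stored="false"/>
  </fields>
</schema>
"""


def searchable_but_not_returnable(schema_xml):
    """Return the names of fields that are indexed but not stored.

    Simplification: an attribute that is absent is treated as "false".
    """
    root = ET.fromstring(schema_xml)
    names = []
    for field in root.iter("field"):
        indexed = field.get("indexed", "false") == "true"
        stored = field.get("stored", "false") == "true"
        if indexed and not stored:
            names.append(field.get("name"))
    return names


print(searchable_but_not_returnable(SAMPLE_SCHEMA))  # ['content']
```

If your full-text field shows up in that list, set stored="true" on it and re-index, since documents that are already indexed do not gain stored values retroactively.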

http://<solr-url>:port/select/?......&fl=mytext,*

This example assumes your full text is stored in a field called mytext.

Finally, if you only want a snippet of the text around the matched words (not the whole text), look at the highlighting component of Solr/Lucene.
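For illustration, here is a minimal sketch of building such a request URL with the fl parameter. The base URL (localhost:8983/solr), the query term, and the helper name are assumptions; mytext is the placeholder field name from above.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields):
    """Build a select URL that asks Solr to return the given fields via fl."""
    params = {"q": query, "fl": ",".join(fields)}
    # Keep ',' and '*' unescaped so the URL reads like the hand-written one.
    return base_url.rstrip("/") + "/select?" + urlencode(params, safe=",*")


# Assumed base URL and query term, for illustration only.
url = build_select_url("http://localhost:8983/solr", "nutch", ["mytext", "*"])
print(url)  # http://localhost:8983/solr/select?q=nutch&fl=mytext,*
```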
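A sketch of what a highlighting request might look like, using the standard hl, hl.fl, hl.snippets and hl.fragsize parameters and reading snippets out of the "highlighting" section of a JSON response. The base URL, the field name mytext, the helper names, and the sample response are assumptions made up for illustration, not output from a real server. Note that the highlighted field generally needs to be stored as well.

```python
import json
from urllib.parse import urlencode


def build_highlight_url(base_url, query, field, snippets=1, fragsize=150):
    """Build a select URL that requests highlighted snippets for one field."""
    params = {
        "q": query,
        "wt": "json",            # ask for a JSON response
        "hl": "true",            # turn highlighting on
        "hl.fl": field,          # field to take snippets from
        "hl.snippets": snippets,  # max snippets per document
        "hl.fragsize": fragsize,  # approximate snippet length in characters
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def snippets_by_id(response_text, field):
    """Map each document id to its list of snippets for the given field."""
    data = json.loads(response_text)
    highlighting = data.get("highlighting", {})
    return {doc_id: fields.get(field, []) for doc_id, fields in highlighting.items()}


# Hand-written sample in the shape of a Solr JSON response (not real output).
SAMPLE_RESPONSE = json.dumps({
    "response": {"numFound": 1, "docs": [{"id": "http://example.com/"}]},
    "highlighting": {
        "http://example.com/": {"mytext": ["a page about <em>nutch</em> crawling"]}
    },
})

print(build_highlight_url("http://localhost:8983/solr", "nutch", "mytext"))
print(snippets_by_id(SAMPLE_RESPONSE, "mytext"))
```

In a Velocity template the same data should be reachable from the response object, but the exact accessor depends on the template setup, so it is not shown here.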
