Can we implement Solr auto-suggest on a field that is indexed but not stored?

I am supposed to implement Google-like auto-suggest/auto-complete using Solr. I have two questions about this:

  1. Is it possible to only index, but not store, the field on which autocomplete (or the TermsComponent) is supposed to run?

  2. Can we use multiple fields as the source of data for auto-suggest, and if so, can these fields be indexed only and not stored?

I would be grateful if anyone who has tried such an implementation could help me out.

Thanks, Saif


In Solr 4.0 there is a new component called Suggester. It uses the spellcheck component to build suggestions based on your existing index.

Suggester - Solr Wiki

I'm still tweaking my field type for the Suggester component but here is what I have so far which seems to be working quite well.

    <fieldtype name="textSuggest" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.PatternReplaceFilterFactory"
               pattern="(^[^A-Za-z0-9]*|[^A-Za-z0-9]*$)" replacement=""  replace="all" />
            <filter class="solr.LengthFilterFactory" min="2" max="60" />
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false" />
            <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true" outputUnigramIfNoNgram="true" />
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false" />
            <filter class="solr.ShingleFilterFactory" maxShingleSize="99" outputUnigrams="false" outputUnigramIfNoNgram="true" />
        </analyzer>
    </fieldtype>
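
For context, here is a rough sketch of how a field of this type might be wired into the Suggester in schema.xml and solrconfig.xml. It is modeled on the example from the Solr Wiki page, not on my actual setup: the field name `suggest`, the handler path `/suggest`, and the parameter values are assumptions you would adapt. The Suggester builds its dictionary from indexed terms, so the field is declared as indexed but not stored here.

```xml
<!-- schema.xml: hypothetical field using the textSuggest type above.
     Indexed only; the Suggester reads terms from the index. -->
<field name="suggest" type="textSuggest" indexed="true" stored="false" multiValued="true"/>

<!-- solrconfig.xml: Suggester registered through the spellcheck component. -->
<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <!-- assumed field name; point this at your own indexed field -->
    <str name="field">suggest</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

<!-- Handler path and defaults are illustrative. -->
<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">5</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
```

With something like this in place, a request such as `/suggest?q=so` should return suggestions in the spellcheck section of the response.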


You are looking for the Solr TermsComponent which can be queried to return the terms present in an indexed field, along with their frequencies. Specifically, you want the terms.prefix parameter, which will return all terms that start with the prefix you specify.
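
As a minimal sketch, assuming a stock solrconfig.xml, the TermsComponent can be exposed through its own handler like this. The handler path `/terms` follows the Solr example configuration; the field name `name` in the sample request is only an assumption. Because the component reads the term dictionary of the index, the field needs to be indexed but does not need to be stored.

```xml
<!-- solrconfig.xml: register the TermsComponent and a handler that uses it. -->
<searchComponent name="terms" class="solr.TermsComponent"/>

<requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <bool name="terms">true</bool>
  </lst>
  <arr name="components">
    <str>terms</str>
  </arr>
</requestHandler>

<!-- Sample request (field name is hypothetical):
     /terms?terms.fl=name&terms.prefix=so&terms.limit=10 -->
```

The response lists each matching term together with its document frequency, which you can use to rank the suggestions.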


In Solr, for n-gram based autocomplete, you have to define the n-gram field you are searching against as stored so that its values are returned.

Also, I don't think Solr offers a way to fetch data from multiple fields and build a single n-gram from them. A simpler way is to create one field, copy the data from all the other fields you want to use into it, and then apply n-gram tokenizing to that field.
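
A sketch of what that could look like in schema.xml follows. All names here (`text_autocomplete`, `autocomplete`, and the source fields `name` and `title`) are hypothetical, and the gram sizes are arbitrary; `stored="true"` follows the advice above rather than being a hard requirement of the analysis chain.

```xml
<!-- Hypothetical field type: edge n-grams at index time, so that "so", "sol",
     "solr" all match a value starting with "solr"; no n-grams at query time. -->
<fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- One combined field; multiValued because several sources are copied into it. -->
<field name="autocomplete" type="text_autocomplete" indexed="true" stored="true" multiValued="true"/>

<!-- Source field names are assumptions; list every field you want to suggest from. -->
<copyField source="name" dest="autocomplete"/>
<copyField source="title" dest="autocomplete"/>
```

A query such as `q=autocomplete:so` would then match documents whose copied values start with "so".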


Solr lets you use a copyField destination as the indexed field for autocomplete. Here is an example from my own work on Solr auto-suggest/auto-complete: <copyField source="name" dest="text"/>. That way, Solr only has to index the field you use for autocomplete.

On the other hand, you cannot retrieve a field unless it is stored, and here I mean the original fields, not the copied ones. What I propose is to copy the searchable field (e.g. name) and then retrieve all the other fields based on the query itself. You need to create a custom search handler and request handler.

I'll edit this with the full solution later on.

You can use this article to learn more about the subject and then extend your solution: http://solr.pl/en/2010/10/18/solr-and-autocomplete-part-1/
