Problem with faceted search

Asked 2022-12-19 07:31

I’m doing some faceted searches but have a few problems. I don’t get the desired results when there are several words in the faceted search field.

Example: “animal” field with the following entries:

        A horse

        Black horse

        Black horse

The faceted search returns "horse(3)" as the top result, whereas I would like to get back "Black horse(2)".

Here is the schema.xml. The search field is BUSQUEDA, and the faceted field is SUPERFICIE. I think I have tried most of the possible combinations of the defined types for these two fields, but it still doesn't work.

<?xml version="1.0" encoding="UTF-8" ?>
        <schema name="example" version="1.2">
         <types>

     <fieldType name="string" class="solr.StrField"/>

    <fieldType name="facet_texPersonal" class="solr.StrField" sortMissingLast="true" omitNorms="true">
           <analyzer>
            <tokenizer class="solr.KeywordTokenizerFactory"/>
           </analyzer>
          </fieldType>

          <fieldType name="facet_tex" class="solr.TextField" sortMissingLast="true" omitNorms="true">
           <analyzer>
            <tokenizer class="solr.KeywordTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory" />
            <filter class="solr.TrimFilterFactory" />
           </analyzer>
          </fieldType>

          <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
           <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
             enablePositionIncrements="true"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" 
             catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
           </analyzer>
           <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" 
             enablePositionIncrements="true"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" 
             catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
           </analyzer>
          </fieldType>

          <fieldType name="textTight" class="solr.TextField" positionIncrementGap="100" >
            <analyzer>
           <tokenizer class="solr.WhitespaceTokenizerFactory"/>
           <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
           <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
           <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0"        catenateWords="1" catenateNumbers="1" catenateAll="0"/>
           <filter class="solr.LowerCaseFilterFactory"/>
           <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
           <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
            </analyzer>
          </fieldType>

          <fieldType name="textMultidioma" class="solr.TextField" positionIncrementGap="100">
           <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" 
              enablePositionIncrements="true" />
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" 
              catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
            <filter class="solr.LowerCaseFilterFactory"/>
           </analyzer>
           <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" 
             catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
            <filter class="solr.LowerCaseFilterFactory"/>
           </analyzer>
          </fieldType>

         </types>

         <fields>
          <field name="BUSQUEDA" type="facet_tex" indexed="true" stored="true"/>
          <field name="SUPERFICIE" type="facet_tex" indexed="true" stored="true"/>
          <field name="NOMBRE" type="string" indexed="true" stored="true"/>
         </fields>
         <uniqueKey>NOMBRE</uniqueKey>
         <defaultSearchField>BUSQUEDA</defaultSearchField></schema>

Any suggestions?

Thanks a bunch in advance!

You have to facet on a non-tokenized field (field class solr.StrField, or using solr.KeywordTokenizerFactory). This thread explains it in detail.

We had multi-word faceted fields working for a project that I worked on previously. Here is (part of) the schema.xml relating to this:

<schema name="example" version="1.2">
 <types>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true" />
    ...
 </types>  
 <fields>
  <field name="grant_type" type="string" indexed="true" stored="true" />
  ...
 </fields>
</schema>

As Mauricio has pointed out, the facet field has to be non-tokenized (not split into separate words). In the config above we are using the 'solr.StrField' (non-tokenized) field type.
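If the same data also needs to be searchable word by word, one common arrangement is to keep a tokenized field for searching and copy its value into a separate untokenized field that is used only for faceting. The fragment below is a minimal sketch of that idea applied to the question's schema, not a tested configuration: the field name SUPERFICIE_facet is hypothetical, and the choice of the "text" type for the search field is an assumption.

```xml
<!-- Sketch only: SUPERFICIE_facet is a hypothetical field name, not part of the original schema -->
<types>
  <!-- Untokenized type: the whole value ("Black horse") is indexed as a single term -->
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
  ...
</types>
<fields>
  <!-- Tokenized field used for searching (assumes the "text" type from the question's schema) -->
  <field name="SUPERFICIE" type="text" indexed="true" stored="true"/>
  <!-- Untokenized copy used only for faceting -->
  <field name="SUPERFICIE_facet" type="string" indexed="true" stored="false"/>
</fields>
<!-- Copies the raw, unanalyzed input value into the facet field at index time -->
<copyField source="SUPERFICIE" dest="SUPERFICIE_facet"/>
```

With a layout like this, the facet request would point at the untokenized copy (facet.field=SUPERFICIE_facet). Existing documents have to be reindexed after the schema change, because the facet counts come from the terms already stored in the index.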

Further hints for faceted field types (not converting to lowercase, not stripping out punctuation, etc.) can be found on the Solr Faceting Overview page.
