Sorting a field that contains special characters in Solr

2023-04-07 04:10 Asked by:

I am new to Solr indexing. I want to sort on a location field that holds many different values. Some of the values start with special characters, for example 'sAmerica, #'Japan, %India, and so on.

When I sort on this field, I do not want special characters such as 's, '#, !, ~ and so on to be considered. I want a sort that ignores these characters and returns America in 1st position, %India in 2nd, and #'Japan in 3rd.

How can I make this possible? I am using PatternReplaceFilterFactory, but I don't know much about it.

  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.WordDelimiterFilterFactory" catenateWords="1" />
    <filter class="solr.PatternReplaceFilterFactory" pattern="'s" replacement="" replace="all" />
  </analyzer>
</fieldType>

If you want to ignore the special characters, try the following field type.
It lowercases the value and catenates the words, dropping all special characters.

    <fieldType name="string_sort" class="solr.TextField" positionIncrementGap="1">
        <analyzer type="index">
            <tokenizer class="solr.KeywordTokenizerFactory" />
            <filter class="solr.LowerCaseFilterFactory" />
            <filter class="solr.WordDelimiterFilterFactory" catenateWords="1" />
        </analyzer>
    </fieldType>

However, this will not work for 'sAmerica, because s is not a special character.

<filter class="solr.PatternReplaceFilterFactory" pattern="'s" replacement="" replace="all" />

If 's is a fixed pattern, strip it with the filter above, placed before the word delimiter filter.

Edit: are you using this config?

<fieldType name="string_sort" class="solr.TextField" positionIncrementGap="1">
    <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.PatternReplaceFilterFactory" pattern="'s" replacement="" replace="all" />
        <filter class="solr.WordDelimiterFilterFactory" catenateWords="1" />
    </analyzer>
</fieldType>
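
To sort on the analyzed value while still returning the original text, one common arrangement is a sort-only copy of the field. This is a minimal sketch, not a definitive setup: the field names location and location_sort are hypothetical, and the string type for the original field is an assumption.

```xml
<!-- Sketch only: "location" and "location_sort" are hypothetical field names,
     and the "string" type for the original field is an assumption. -->
<field name="location" type="string" indexed="true" stored="true" />
<!-- Sort-only copy that goes through the string_sort analysis chain. -->
<field name="location_sort" type="string_sort" indexed="true" stored="false" />
<copyField source="location" dest="location_sort" />
```

A query would then sort with sort=location_sort asc. Sorting works on the indexed terms, so the filters have to be part of the index-time analyzer (or an analyzer without a type attribute), not only a query-time one.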

I tested this field type through analysis, and it produces the following tokens:

KT - 'sAlgarve
LCF - 'salgarve
PRF - algarve
WDF - algarve

Can you check it through the analysis as well?
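
One caveat with the word delimiter approach: Solr sorts reliably only on a field whose analyzer produces a single term per document, and WordDelimiterFilterFactory can emit several tokens for a multi-word value. A possible alternative, sketched here under stated assumptions, keeps a single token by relying only on PatternReplaceFilterFactory. The type name string_sort_single is hypothetical, the ^'s anchor assumes the 's prefix only matters at the start of a value, and [^a-z0-9] assumes ASCII-only values.

```xml
<!-- Sketch only: "string_sort_single" is a hypothetical type name. -->
<fieldType name="string_sort_single" class="solr.TextField" positionIncrementGap="1">
    <analyzer>
        <!-- Keep the whole value as one token. -->
        <tokenizer class="solr.KeywordTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
        <!-- Assumption: the 's prefix only needs removing at the start of the value. -->
        <filter class="solr.PatternReplaceFilterFactory" pattern="^'s" replacement="" replace="all" />
        <!-- Assumption: values are ASCII; this drops everything except a-z and 0-9, including spaces. -->
        <filter class="solr.PatternReplaceFilterFactory" pattern="[^a-z0-9]" replacement="" replace="all" />
    </analyzer>
</fieldType>
```

With this chain, 'sAmerica becomes america, %India becomes india, and #'Japan becomes japan, which gives the order asked for in the question.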
