How to do partial beginning matches in Solr?

I'm trying to search for partial beginning matches on a big list of last names. So Wein* should find Weinberg, Weinkamm, etc.

I could do this by creating a special field, and adding

<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50" preserveOriginal="1"/>

to its type specification in schema.xml. When I add the line above only to the index analyzer and leave the filter out of the query analyzer, I can then search with just special_field:Wein and get the expected results.
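
For illustration, the setup described above might look roughly like the following in schema.xml. This is only a sketch: the type name text_prefix is hypothetical, and the choice of KeywordTokenizerFactory plus LowerCaseFilterFactory is an assumption (the question does not say which tokenizer or other filters are used). Only the EdgeNGramFilterFactory line and the field name special_field come from the question.

```xml
<!-- Hypothetical field type; the name "text_prefix" is an assumption. -->
<fieldType name="text_prefix" class="solr.TextField">
  <!-- Index time: each last name is expanded into its leading n-grams,
       e.g. Weinberg -> w, we, wei, wein, ... -->
  <analyzer type="index">
    <!-- Assumed: treat the whole last name as a single token. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Assumed: lowercase so that matching is case-insensitive. -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50"/>
  </analyzer>
  <!-- Query time: no n-gram filter, so "Wein" stays one term ("wein")
       and matches the n-gram stored at index time. -->
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="special_field" type="text_prefix" indexed="true" stored="false"/>
```

With something like this in place, a plain query such as special_field:Wein needs no wildcard, because the prefix itself is already a term in the index.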

Now I see that Solr also has a *-syntax. What's the connection between EdgeNGramFilterFactory and the *-syntax?

Am I doing things correctly or is there a better, more regular way?

Thanks!


Or just do a simple wild card match:

name:Pe*


I don't recommend the Wein* query. It is implemented internally as a PrefixQuery, which rewrites the original query to include every term that has the prefix "Wein". Depending on how large your index is (that is, how many terms it contains), this query rewriting can become a bottleneck.

The EdgeNGramFilter at index time is a better approach. This solution will use more space, but queries will be processed much faster.


Note: I also asked this question in the Lucene forum where I got a good answer: http://lucene.472066.n3.nabble.com/How-to-do-partial-beginning-matches-td781147.html
