Search for partial words using Solr
I'm trying to search for a partial word using Solr, but I can't get it to work.
I'm using this in my schema.xml
file.
<fieldType name="text" class="solr.TextField" omitNorms="false">
<analyzer type="index">
<tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="15" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter cl开发者_JS百科ass="solr.PorterStemFilterFactory"/>
<filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="1" splitOnNumerics="1" splitOnCaseChange="1" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="1"/>
</analyzer>
</fieldType>
Searching for die h
won't work, but die hard
returns some results.
I've reindexed the database after the above configuration was added.
Here is the url and output when searching for die hard
. The debugger is turned on.
Here is the url and output when searching for die h
. The debugger is turned on.
I'm using Solr 3.3. Here is the rest of the schema.xml
file.
The query you've shared is searching the "title_text" field, but the schema you posted above defines the "text" field. Assuming this was just an oversight, and the title_text field is defined as in your post, I think a probable issue is that the NGramTokenizer is configured with minGramSize="3", and you are expecting to match using a single-character token.
You could try changing minGramSize to 1, but this will inevitably lead to some very inefficient indexes; and I wonder whether you really are keen on having "e" match every movie with an e in the title?
精彩评论