Django-Haystack with Solr contains search
I am using Haystack within a project using Solr as the backend. I want to be able to perform a contains search, similar to the Django .filter(something__contains="...").
The __startswith option does not suit our needs because, as the name suggests, it looks for words that start with the string.
I tried to use something like *keyword*, but Solr does not allow the * to be used as the first character.
Thanks.
To get "contains" functionality, you can use the following as the index analyzer:

    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="100" side="back"/>
    <filter class="solr.LowerCaseFilterFactory"/>
This will create n-grams, anchored at the end of the word, for every whitespace-separated word in your field. For example:
"Index this!" => x, ex, dex, ndex, index, !, s!, is!, his!, this!
As you can see, this will expand your index greatly, but if you now enter a query like "nde*", it will match "ndex", giving you a hit.
Use this approach carefully to make sure that your index doesn't get too large. If you increase minGramSize or decrease maxGramSize, the index will not expand as much, but the "contains" functionality will be reduced. For instance, setting minGramSize="3" will require that you have at least 3 characters in your contains query.
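To make the example above concrete, here is a rough pure-Python sketch of what that analyzer chain produces. It is only an emulation for illustration, not Solr code: the function names back_edge_ngrams and prefix_query_hits are made up for this sketch, and lowercasing is applied before the n-gram step for simplicity, which gives the same terms.

```python
def back_edge_ngrams(text, min_gram=1, max_gram=100):
    """Emulate whitespace tokenizing, back-edge n-grams and lowercasing."""
    grams = []
    for token in text.split():
        token = token.lower()
        longest = min(max_gram, len(token))
        # Take suffixes of increasing length, anchored at the end of the token.
        for size in range(min_gram, longest + 1):
            grams.append(token[-size:])
    return grams


def prefix_query_hits(grams, query):
    """Emulate a trailing-wildcard query such as "nde*" against the grams."""
    prefix = query.rstrip("*").lower()
    return [gram for gram in grams if gram.startswith(prefix)]


grams = back_edge_ngrams("Index this!")
print(grams)
print(prefix_query_hits(grams, "nde*"))

# With a larger minimum size, no gram shorter than 3 characters is indexed,
# so a short query that used to hit the gram "ex" now finds nothing.
print(prefix_query_hits(back_edge_ngrams("Index this!", min_gram=3), "ex*"))
```

The first print shows the same terms as the "Index this!" example, and the "nde*" query hits only the gram "ndex".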
You can achieve the same behavior without having to touch the solr schema. In your index, make your text field an EdgeNgramField instead of a CharField. Under the hood this will generate a similar schema to what lindstromhenrik suggested.
I am using an expression like: .filter(something__startswith='...') .filter_or(name=''+s'...'), as it seems Solr does not like expressions like '...*', but it works when combined with an OR.
None of the answers here do a real substring search (*keyword*). They don't find a keyword that is part of a bigger string (that is, neither a prefix nor a suffix).
Using EdgeNGramFilterFactory or the EdgeNgramField in the indexes can only do a "startswith" or an "endswith" type of filtering.
The solution is to use an NgramField like this:

    from haystack import indexes

    class MyIndex(indexes.SearchIndex, indexes.Indexable):
        ...
        field_to_index = indexes.NgramField(model_attr='field_name')
        ...
This is very elegant, because you don't need to manually add anything to the schema.xml.
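To illustrate the difference between the two field types, here is a small pure-Python sketch comparing edge n-grams with full n-grams. It is an emulation under assumptions, not Haystack or Solr code: the helper names edge_ngrams and full_ngrams and the gram size limits are chosen for illustration, and the query term is compared exactly against the indexed grams, with no wildcard.

```python
def edge_ngrams(token, min_gram=1, max_gram=100, side="front"):
    """Prefixes (side="front") or suffixes (side="back") of a token."""
    longest = min(max_gram, len(token))
    if side == "front":
        return [token[:size] for size in range(min_gram, longest + 1)]
    return [token[-size:] for size in range(min_gram, longest + 1)]


def full_ngrams(token, min_gram=1, max_gram=100):
    """Every contiguous substring of the token within the size limits."""
    longest = min(max_gram, len(token))
    return [
        token[start:start + size]
        for size in range(min_gram, longest + 1)
        for start in range(len(token) - size + 1)
    ]


word = "index"
# "nde" sits in the middle of the word, so it is neither a prefix nor a suffix.
print("nde" in edge_ngrams(word, side="front"))  # prefixes only
print("nde" in edge_ngrams(word, side="back"))   # suffixes only
print("nde" in full_ngrams(word))                # all substrings
```

Only the full n-gram list contains the middle fragment as a term of its own, at the cost of indexing many more terms per word.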