How do I handle word forms in Sphinx search?

I have a Sphinx server that indexes a MySQL database for a Django app. Search is working fine, but my content includes medical words and phrases. So, for example, I need a search for "dvt" to also match "deep venous thrombosis" and even "deep vein thrombosis". I looked through the documentation and see options for "wordforms" and "morphology". Which of these (or something else) should I use? Also, what will work in the other direction, i.e., so that a search for "deep venous thrombosis" or "deep vein thrombosis" also matches "dvt"?

Also, I would appreciate some advice on how to set these up, since I'm new to Sphinx in general.


You will need to provide your own list of word/term synonyms to be used in query expansion.

Since Sphinx does not currently support synonym expansion in queries, you'll need to massage the query based on your list of synonyms before submitting it to the search engine.

So, using your example:

  • User queries for: 'dvt remediation procedures'.

  • Server receives query and checks each term against its list of synonyms.

  • Server finds a match and adds 'deep vein thrombosis' to query.

  • Server submits newly expanded query 'dvt deep vein thrombosis remediation procedures' to search engine.
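The steps above could be sketched roughly as follows in Python (the language of the Django app). This is only an illustration: the `SYNONYMS` table and the `expand_query` helper are hypothetical names, not part of Sphinx or Django, and the synonym list itself is something you would have to supply.

```python
# Hypothetical synonym table; in practice, load this from your own curated list.
SYNONYMS = {
    "dvt": ["deep vein thrombosis"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Return the query with known synonyms inserted after each matching term."""
    expanded = []
    for term in query.split():
        expanded.append(term)
        # Look the term up case-insensitively; unknown terms add nothing.
        expanded.extend(synonyms.get(term.lower(), []))
    return " ".join(expanded)


# Expand the user's query before handing it to the search engine.
print(expand_query("dvt remediation procedures"))
```

This sketch only looks up single words, so going the other way (from "deep vein thrombosis" to "dvt") would need phrase detection on top of it. Also, depending on the match mode you use, simply appending terms may require all of them to be present in a document, so the added terms may need to be combined as alternatives instead.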

Finally, if the stemmer built into Sphinx is doing its job, you shouldn't have to support both 'venous' and 'vein' as separate terms, since both should stem to the same term. If that is not the case, you might need to do additional pre-stemming to handle words specific to your corpus (medical terms).
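If pre-stemming does turn out to be necessary, one possible approach is a small normalization pass that maps domain-specific variants to a single canonical form. The `CANONICAL` map and `normalize_terms` function below are hypothetical, and the same pass would have to be applied both to the text being indexed and to incoming queries for the terms to line up.

```python
# Hypothetical map from domain-specific variants to one canonical form.
CANONICAL = {
    "venous": "vein",
    "veins": "vein",
}


def normalize_terms(text, canonical=CANONICAL):
    """Lowercase the text and replace each known variant with its canonical form."""
    return " ".join(canonical.get(word, word) for word in text.lower().split())


print(normalize_terms("Deep venous thrombosis"))
```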
