Django Haystack substring search

I have recently added search capabilities to my Django-powered site to allow employers to search for employees using keywords. When the user initially uploads their resume, I turn it into text, get rid of stop words, and then add the text to a TextField for that user. I used Django-Haystack with the Whoosh search backend.

Three things:

1) Aside from extra features which I'll probably not use, is there any concrete advantage to switching to Solr or Xapian?

2) In turning the resume into text, I essentially index the PDF myself. I know both Xapian and Solr support PDF indexing, but from the looks of it Haystack does not. Any tips on how to get around this? Or should I keep indexing it myself? If so, should I be doing more than simply providing a text file of keywords?

3) Whoosh only returns a result if the search term matches a keyword exactly. If a user has 'mathematics' as his keyword, and I search 'math', I want that user to appear. I couldn't definitively tell whether Xapian or Solr support this. Thoughts?

Thanks for any suggestions. I'm going to continue digging into this myself for the time being.


Unfortunately, I don't know enough to answer your other questions. For point 3, however, Whoosh actually does support this.

You would have to use the autocomplete function of SearchQuerySet.

Detailed here: http://docs.haystacksearch.org/dev/autocomplete.html

I'm currently using Whoosh and getting partial matches this way myself.
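For anyone wondering why this makes 'math' match 'mathematics': Haystack's autocomplete relies on an EdgeNgramField (or NgramField) in the search index, which stores the leading fragments of each word alongside the word itself. The snippet below is only a rough plain-Python sketch of that idea, not Haystack's or Whoosh's actual code. The function names (edge_ngrams, build_index, search) and the size limits are made up for illustration.

```python
def edge_ngrams(word, min_size=2, max_size=15):
    """Return the prefixes of `word` that are min_size..max_size characters long."""
    word = word.lower()
    upper = min(len(word), max_size)
    return [word[:n] for n in range(min_size, upper + 1)]


def build_index(documents):
    """Map every edge n-gram to the set of document ids that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.split():
            for gram in edge_ngrams(word):
                index.setdefault(gram, set()).add(doc_id)
    return index


def search(index, query):
    """Return ids of documents where every query term is a prefix of some word."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Look up each term as-is; a hit means some indexed word starts with it.
    results = [index.get(term, set()) for term in terms]
    return set.intersection(*results)


# Hypothetical resume keywords, keyed by user.
resumes = {
    "alice": "mathematics physics",
    "bob": "marketing sales",
}
index = build_index(resumes)
print(search(index, "math"))    # alice only
print(search(index, "matics"))  # no match: edge n-grams are prefixes only
```

Because only prefixes are stored, a search for 'matics' finds nothing. If you need matches in the middle of a word, that is what the plain n-gram variant of the field is for, at the cost of a larger index.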
