How to improve search results?
I'm using Sphinx to find images by description, but I'm having trouble with short, very common words like 'the', which carry no useful meaning for these searches. These words make the search results inaccurate. Is there an option to remove them all? In the past I used MySQL's full-text search, which does this by default.
If there isn't an option for this, could you give me a list of such words in English? I'm not sure which ones I should include.
Note: I don't want to simply remove every word that is 3 characters or shorter. Some images need to be described with words of 2 or 3 characters, e.g. 'BMW car'.
MySQL supports stopwords for its full-text index (and provides a stopword file in the source distribution): http://dev.mysql.com/doc/refman/5.5/en/fulltext-stopwords.html
Maybe that file is a good starting point.
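Sphinx also has a `stopwords` option in the index definition that points to a plain-text file of words to ignore, so a list like MySQL's could be fed to it. As a rough sketch of the idea, the Python below filters by an explicit word list rather than by word length, so short but meaningful words such as 'BMW' survive. The word list, function names, and file path here are illustrative assumptions, not MySQL's actual list or any part of the Sphinx API.

```python
# Tiny illustrative stopword list (an assumption; replace it with the
# words from the MySQL stopword file or your own list).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "for", "with"}


def strip_stopwords(query, stopwords=STOPWORDS):
    """Drop stopwords from a query, keeping all other words whatever their length."""
    kept = [word for word in query.split() if word.lower() not in stopwords]
    return " ".join(kept)


def write_stopword_file(path, stopwords=STOPWORDS):
    """Write the list one word per line, e.g. for use as a stopword file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(sorted(stopwords)) + "\n")


if __name__ == "__main__":
    print(strip_stopwords("the BMW car in the garage"))  # BMW car garage
    write_stopword_file("stopwords.txt")  # hypothetical output path
```

Filtering by list instead of by length is what keeps 2 and 3 character terms searchable; the cost is that the list has to be maintained by hand.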