Free text (natural language) query parsing with Solr

I'm trying to build a query parsing algorithm for a local search site that can classify a free text search query (single input text box) into the various types of search supported on the site.

For example, the user could type "chinese restaurants near xyz". How should I go about breaking it down to cuisine:"chinese", locality:"xyz", given that

- there could be spelling mistakes
- keywords may match in different columns e.g. a restaurant may have "chinese" in its name

This is not really a natural language parsing problem, since we're searching within a very limited set of possibilities.

My initial thought is to dump all values of a particular type from the database into one field per type, and match the user's query against all of those fields. Then, based on the score (and a predefined confidence level), divide the query into the 3-4 search fields such as name/cuisine/locality.
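To make the idea concrete, here is a minimal sketch in plain Python, without Solr. The vocabularies, the stopword list, and the 0.8 cut-off are all made-up placeholders, and `difflib` similarity stands in for whatever score the search engine would return.

```python
from difflib import SequenceMatcher

# Hypothetical per-field vocabularies; in practice these would be
# dumped from the database, one list per searchable type.
FIELD_VALUES = {
    "cuisine": ["chinese", "italian", "mexican"],
    "locality": ["xyz", "downtown", "riverside"],
    "name": ["chinese garden", "luigi's"],
}
# Assumed filler words that carry no field information.
STOPWORDS = {"restaurant", "restaurants", "near", "in", "at"}
# Assumed confidence level; tune it against real queries.
CONFIDENCE = 0.8


def best_score(token, values):
    """Highest similarity between the token and any value of one field."""
    return max(SequenceMatcher(None, token, v).ratio() for v in values)


def classify(query):
    """Assign each query token to the best-scoring field, if confident enough."""
    result = {}
    for token in query.lower().split():
        if token in STOPWORDS:
            continue
        scores = {f: best_score(token, vals) for f, vals in FIELD_VALUES.items()}
        field = max(scores, key=scores.get)
        if scores[field] >= CONFIDENCE:
            result.setdefault(field, []).append(token)
    return result


print(classify("chinese restaurants near xyz"))
```

Tokens that clear the cut-off in no field are simply dropped here; a real implementation would more likely fall back to a plain full-text search for them.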

Is there a better/standard way of doing this?


For spelling mistakes, you need a dictionary/thesaurus to correct against. This can be part of your pre-processing and normalization step.
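As a rough illustration of dictionary-based normalization, the sketch below corrects each token against a small word list using `difflib.get_close_matches` from the Python standard library. The word list and the 0.75 cut-off are assumptions for the example; a real dictionary would be built from the indexed values.

```python
from difflib import get_close_matches

# Hypothetical dictionary of known terms.
DICTIONARY = ["chinese", "italian", "mexican", "restaurants", "near"]


def normalize(query, cutoff=0.75):
    """Replace each token with its closest dictionary entry, if one is close enough."""
    corrected = []
    for token in query.lower().split():
        match = get_close_matches(token, DICTIONARY, n=1, cutoff=cutoff)
        # Unknown tokens (e.g. a locality not in the dictionary) pass through unchanged.
        corrected.append(match[0] if match else token)
    return " ".join(corrected)


print(normalize("chineese resturants near xyz"))
```

Running the corrected string through the query classifier afterwards keeps the two concerns separate.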

To query multiple columns, you can OR the clauses together: cuisine:chinese OR restaurant_name:chinese

You can weight one of the two with a boost (the default is 1, so a value below 1 lowers that clause's weight): cuisine:chinese^0.8 OR restaurant_name:chinese
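The query strings above can be assembled programmatically. This is only a sketch: `multi_field_query` is a hypothetical helper, and it assumes the term is a single word with no special query characters, so no escaping is done.

```python
from urllib.parse import urlencode


def multi_field_query(term, boosts):
    """Build 'field:term^boost OR ...' from a mapping of field name to boost."""
    clauses = []
    for field, boost in boosts.items():
        clause = f"{field}:{term}"
        # A boost of 1.0 is the default, so it is left implicit.
        if boost != 1.0:
            clause += f"^{boost}"
        clauses.append(clause)
    return " OR ".join(clauses)


q = multi_field_query("chinese", {"cuisine": 0.8, "restaurant_name": 1.0})
# URL-encode the query for use as the q parameter of a search request.
params = urlencode({"q": q})
print(q)
print(params)
```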
