Fuzzy matching API in a long list of queries
I have an application which lets people run predefined queries. However, the list of queries is too long to browse. Hence, the current approach is to let users enter a word in the search box and then show them the likely matches from the list of queries (very much like Google's "Did you mean" feature).
Is there a Java API available for this? I should be able to supply the list of queries. The API should provide a fuzzy match capability, so that incorrect spellings do not matter. (That is why an exact String matching algorithm is not sufficient.)
The magic word here may be "regular expression" -- anything you can model as a finite state machine can be done with regular expressions.
Failing that, you might look into "digital search trees" or "tries".
Some of the APIs I can suggest are:
- Patricia Trie
- Trie
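To make the trie suggestion concrete, here is a minimal sketch of a trie that holds the predefined queries and returns every query starting with what the user has typed. The class name `QueryTrie` and its methods are made up for illustration and do not come from the libraries linked above. Note that a plain trie only gives prefix matching; tolerance for misspellings would have to be layered on top.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not a library class: a character trie over the query list.
final class QueryTrie {
    private static final class Node {
        // TreeMap keeps children sorted, so results come back in alphabetical order.
        final Map<Character, Node> children = new TreeMap<>();
        boolean terminal; // true if a stored query ends at this node
    }

    private final Node root = new Node();

    /** Stores a query, lower-cased so lookups are case-insensitive. */
    void insert(String query) {
        Node node = root;
        for (char c : query.toLowerCase(Locale.ROOT).toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.terminal = true;
    }

    /** Returns all stored queries that start with the given prefix. */
    List<String> withPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        String p = prefix.toLowerCase(Locale.ROOT);
        Node node = root;
        for (char c : p.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return out; // nothing stored under this prefix
            }
        }
        collect(node, new StringBuilder(p), out);
        return out;
    }

    // Depth-first walk that rebuilds each stored query from the path taken.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.terminal) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());
            collect(e.getValue(), path, out);
            path.deleteCharAt(path.length() - 1);
        }
    }
}
```

Typical use would be to call `insert` once per predefined query at startup and `withPrefix` on each keystroke in the search box.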
Similar SO Questions:
- How does the Google "Did you mean" algorithm work?
- Where can I learn about Google's "Did you mean"?
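Perhaps a phonetic algorithm such as Soundex or a derivative would work? It maps words that sound alike to the same short code, so a misspelling often still hits the right entry. http://en.wikipedia.org/wiki/Soundex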
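For reference, a rough sketch of classic American Soundex in plain Java, so you can see what the code looks like before reaching for a library. The class name `SoundexSketch` is invented for this example, and the sketch only handles the letters A-Z (anything else is skipped). Two words are treated as a phonetic match when their codes are equal.

```java
import java.util.Locale;

// Hypothetical helper, not a library class: American Soundex for A-Z input.
final class SoundexSketch {
    // Digit for each letter A..Z; '0' marks letters that produce no digit
    // (vowels plus H, W and Y).
    private static final String CODES = "01230120022455012623010202";

    /** Returns the four-character Soundex code, or "" if the word has no letters. */
    static String encode(String word) {
        String s = word.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        char lastCode = '0';
        for (int i = 0; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') {
                continue; // assumption: ignore anything that is not a plain letter
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);      // the first letter is kept as-is
                lastCode = code;
            } else if (c == 'H' || c == 'W') {
                // H and W do not separate two letters that share a code
            } else {
                if (code != '0' && code != lastCode) {
                    out.append(code);
                }
                lastCode = code;    // a vowel resets this, so a repeated code counts again
            }
        }
        while (out.length() > 0 && out.length() < 4) {
            out.append('0');        // pad short codes with zeros
        }
        return out.toString();
    }

    /** True when both words share the same Soundex code. */
    static boolean soundsAlike(String a, String b) {
        return encode(a).equals(encode(b));
    }
}
```

With this, `SoundexSketch.encode("Robert")` and `SoundexSketch.encode("Rupert")` both give `R163`. Soundex works per word, so for multi-word queries you would probably encode each word separately.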
Found these Java implementations of Peter Norvig's spelling correction algorithm. They are a bit dated, but good for getting started.
- Spelling Corrector
- jSpellCorrect
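Since you already have a fixed list of queries, a simpler variation may be enough: instead of generating candidate edits the way Norvig's corrector does, compute the Levenshtein (edit) distance between the user's input and every query, and show the closest ones. The sketch below is not taken from either project above; `FuzzyQueryMatcher`, `suggest` and `maxDistance` are names made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not a library class: rank queries by edit distance.
final class FuzzyQueryMatcher {

    /** Levenshtein distance: minimum number of single-character inserts, deletes and substitutions. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the queries within maxDistance edits of the input, closest first. */
    static List<String> suggest(String input, List<String> queries, int maxDistance) {
        String needle = input.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String q : queries) {
            if (distance(needle, q.toLowerCase(Locale.ROOT)) <= maxDistance) {
                hits.add(q);
            }
        }
        hits.sort(Comparator.comparingInt(
                (String q) -> distance(needle, q.toLowerCase(Locale.ROOT))));
        return hits;
    }
}
```

For example, `FuzzyQueryMatcher.suggest("sales by regoin", queries, 2)` would pick up "sales by region" if it is in the list. This scans the whole list on every call, which should be fine for a few thousand queries; beyond that, an index (for example a trie or a BK-tree) is worth looking at.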