Suggesting other queries for archaic spellings (eg Google's Did You Mean)
My client has a database of real estate records spanning 400 years. They are interested to have alternate spelling suggestions made to users based o开发者_运维百科n their data.
I'm assuming in a case like this it would be down to a table that has a rows for "Martin" and suggestions of "Martyn" and "Martine" etc.
Does anyone know of a third party solution which would index their data?
EDIT
Per magicmike's answer SOUNDEX()
rocks this!
SELECT `Last` FROM `Names`
WHERE SOUNDEX(`Last`) RLIKE SOUNDEX ('Martin')
GROUP BY `Last`
yields this nice little list:
Martens
Martenstein
Martin
Martina
Martindale
Martine
Martineau
Martines
Martinet
Martinez
Martini
Martino
Martinstein
Mertens
Miradona
Moret and Marchand
Mortimer
Morton
Google's "did you mean" is pretty interesting: How does the Google "Did you mean?" Algorithm work?
In the past, I've managed to implement something similar with SOUNDEX which can approximate that functionality.
Martin, Martyn, and Martine yield the same output from SOUNDEX
.
You could include all the results from the SOUNDEX
match, or select the results they asked for and then select distinct name from table where SOUNDEX(name) = SOUNDEX(search_var)
as your 'suggestions'.
As an optimization, you can pre-calculate SOUNDEX on search fields and persist that as an indexed column to avoid table scans.
It's not as sophisticated as Google's Did You Mean, but you can get reasonably close very quickly.
Index the data? You can take a look at Apache Lucene
精彩评论