Will the Levenshtein distance algorithm work well for non-English language strings too? Up开发者_如何学Godate: Would this work automatically in a language like Java when comparing Asian characters?O
I\'m using django-haystack at the moment with apache-solr as the backend. Problem is I cannot get the app to perform the search functionality I\'m looking for
I have a database of ~150\'000 words and a pattern (any single word) and I want to get all words from the database which has Damerau-Levenshtein distance between it and the pattern less than given num
We\'re looking to provide a fuzzy search on an electrical materials database (i.e. conduit, cable, etc.).The problem is that, because of a lack of consistency across all material types, we could not s
I am trying to use Lucene to search for names in a database. However, some of the names contain words like \"NOT\" and \"OR\" and even \"-\" minus symbols. I still want the different tokens inside the
I have a site that needs to search thru about 20-30k records, which are mostly movie and TV show names. The site runs php/mysql with memcache.
I want to find possible candidate duplicate records in a large database matching on fields like COMPANYNAME and ADDRESSLINE1
I\'m wondering if there is some kind of way to do fuzzy string matching in PHP. Looking for a word in a long string, finding a potential match even if its mis-spelled; something that would find it if