Is there a faster (less precise) algorithm than Levenshtein for string distance?
I want to run the Levenshtein, but WAY faster because it's real 开发者_如何转开发time application that I'm building. It can terminate once the distance is greater than 10.
Judging from comments, people seem to be pretty happy with Sift3.
http://sift.codeplex.com
The Levenshtein distance metric allows addition, deletion or substitution operations. If you're looking for a faster but less precise metric you can use the longest common subsequence (allows only addition and deletion) or even the Hamming distance (allows only substitution).
However, I recommend that you try to optimize your Levenshtein distance algorithm instead as it gives the best results.
If you want to compare UTF-8 contents use sift4
:
https://siderite.dev/blog/super-fast-and-accurate-string-distance.html
Also I prepared a jsPerf which shows the performance difference between those libraries: http://jsperf.com/levenshtein-perf
精彩评论