Fast Levenshtein distance in R?
Is there a package that contains a Levenshtein distance function implemented in C or Fortran? I have many strings to compare, and stringMatch from MiscPsycho is too slow for this.
stringdist in the stringdist package does this too, and under certain conditions it is even faster than levenshteinDist (1).
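For anyone who wants to try it, here is a minimal sketch of the stringdist approach. It assumes the package is installed from CRAN; the example words and the variable names (words, d, m) are made up for illustration and are not from the original post.

```r
# Assumes: install.packages("stringdist") has been run beforehand
library(stringdist)

# Hypothetical sample data for illustration
words <- c("kitten", "sitting", "mitten")

# Distance between two strings; method = "lv" selects Levenshtein distance
d <- stringdist("kitten", "sitting", method = "lv")

# All pairwise distances in one call, returned as a matrix
m <- stringdistmatrix(words, words, method = "lv")
```

With many strings, the matrix form is usually the one to reach for, since the loop over pairs then runs in compiled code rather than in R.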
levenshteinDist (from the RecordLinkage package) calls compiled C code. Give it a try.
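A small sketch of what that looks like, assuming RecordLinkage is installed from CRAN. The vectors a and b are hypothetical sample data, kept the same length so that the strings are compared element by element.

```r
# Assumes: install.packages("RecordLinkage") has been run beforehand
library(RecordLinkage)

# Single pair of strings
d <- levenshteinDist("kitten", "sitting")

# Two character vectors of equal length, compared element by element
a <- c("kitten", "kitten", "flaw")
b <- c("mitten", "sitting", "lawn")
ds <- levenshteinDist(a, b)
```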
You could try stringDist from Biostrings as well.
You could also use levenshtein_distance() from the textTinyR package. I got 'calloc' memory errors with all other packages when it came to larger character vectors of around 30k characters. Only textTinyR worked for me!