Algorithm / Library for measuring degree of equality of strings

Is there an algorithm that, given two strings, yields the degree of equality between them, applying metrics that can be provided externally? For example, the two strings "Plant code" and "PlantCode" could be 0.8 equal, "Plant code" and "Plant" could be 0.6 equal, and "Truck no" and "shipment details" could be 0.6 equal (using an externally provided synonym dictionary). The numbers are made up, but I hope they get the point across. Does such an algorithm exist? I'd prefer a library over implementing it myself. Any help would be greatly appreciated. Thanks.


Try the SimMetrics library. It provides a wide range of similarity metrics.


Maybe the google-diff-match-patch library can help: it implements Myers' diff algorithm, which is generally considered to be the best general-purpose diff.


There's also the Levenshtein distance algorithm and its example Java implementation. It does not let you supply external metrics, though.
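For illustration, here is a minimal Python sketch of Levenshtein distance turned into a 0..1 similarity score. It is not taken from any of the libraries mentioned above. The function names (`weighted_levenshtein`, `similarity`) and the `sub_cost` parameter are made up for this example. The `sub_cost` hook is one possible way to feed in an external character-level metric, and the sketch assumes it returns values between 0 and 1 so the score stays in range.

```python
from typing import Callable


def default_cost(a: str, b: str) -> float:
    """Unit substitution cost: 0 for identical characters, 1 otherwise."""
    return 0.0 if a == b else 1.0


def weighted_levenshtein(
    s: str,
    t: str,
    sub_cost: Callable[[str, str], float] = default_cost,
) -> float:
    """Edit distance with unit insert/delete cost and a pluggable
    substitution cost (hypothetical hook, assumed to return 0..1)."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = [float(j) for j in range(len(t) + 1)]
    for i, cs in enumerate(s, start=1):
        curr = [float(i)]
        for j, ct in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1.0,                  # delete cs
                curr[j - 1] + 1.0,              # insert ct
                prev[j - 1] + sub_cost(cs, ct),  # substitute cs -> ct
            ))
        prev = curr
    return prev[-1]


def similarity(
    s: str,
    t: str,
    sub_cost: Callable[[str, str], float] = default_cost,
) -> float:
    """Normalize the distance by the longer length to get a 0..1 score."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - weighted_levenshtein(s, t, sub_cost) / longest


if __name__ == "__main__":
    print(similarity("Plant code", "PlantCode"))
    # Externally supplied metric: treat case differences as free.
    ignore_case = lambda a, b: 0.0 if a.lower() == b.lower() else 1.0
    print(similarity("Plant code", "PlantCode", ignore_case))
```

With unit costs, "Plant code" and "PlantCode" are two edits apart (drop the space, change the case of one letter), so the score is 1 - 2/10. Passing a case-insensitive `sub_cost` removes one of those edits. A synonym dictionary works on whole words rather than characters, so it would need a token-level comparison on top of this, which the sketch does not cover.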
