Fuzzy Matching with threshold filter C#
I need to implement something like this:
string textToSearch = "Extreme Golf: The Showdown";
string textToSearchFor = "Golf Extreme Showdown";
int fuzzyMatchScoreThreshold = 80; // On a 0 to 100 scale
bool searchSuccessful = IsFuzzyMatch(textToSearch, textToSearchFor, fuzzyMatchScoreThreshold);
if (searchSuccessful == true)
{
// we have a match.
}
Here's the function stub written in C#:
public bool IsFuzzyMatch (string textToSearch, string textToSearchFor, int fuzzyMatchScoreThreshold)
{
bool isMatch = false;
// do fuzzy logic here and set isMatch to true if successful match.
return isMatch;
}
But I have no idea how to implement the logic in the IsFuzzyMatch method. Any ideas? Is there perhaps a ready-made solution for this?
I like a combination of the Dice Coefficient, Levenshtein Distance, Longest Common Subsequence, and at times the Double Metaphone. Each of the first three gives you a numeric score that you can compare against a threshold. I prefer to combine them in some way. YMMV.
I've just published a blog post, "Four Functions for Finding Fuzzy String Matches in C# Extensions", that has a C# implementation of each of these.
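To give a feel for the first of those measures, here is a minimal sketch of a Dice coefficient over character bigrams. It is not the code from the blog post; the class name DiceSketch, the helper Bigrams, and the choice to lowercase both inputs are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, not taken from the blog post mentioned above.
public static class DiceSketch
{
    // Splits a string into overlapping two-character pieces (bigrams).
    private static List<string> Bigrams(string text)
    {
        var result = new List<string>();
        for (int i = 0; i < text.Length - 1; i++)
        {
            result.Add(text.Substring(i, 2));
        }
        return result;
    }

    // Dice coefficient: 2 * (shared bigrams) / (bigrams in a + bigrams in b).
    // Returns a value between 0.0 (nothing shared) and 1.0 (same bigrams).
    public static double DiceCoefficient(string a, string b)
    {
        // Assumption: comparison is case-insensitive.
        List<string> first = Bigrams(a.ToLowerInvariant());
        List<string> second = Bigrams(b.ToLowerInvariant());

        int total = first.Count + second.Count;
        if (total == 0)
        {
            // Both strings are too short to have bigrams; fall back to equality.
            return string.Equals(a, b, StringComparison.OrdinalIgnoreCase) ? 1.0 : 0.0;
        }

        int shared = 0;
        foreach (string gram in first)
        {
            // Remove the matched bigram so each one is counted at most once.
            if (second.Remove(gram))
            {
                shared++;
            }
        }
        return 2.0 * shared / total;
    }
}
```

Multiplying the result by 100 puts it on the 0 to 100 scale from the question. Because bigrams ignore where in the string they occur, this measure tends to be more forgiving of reordered words than an edit distance is.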
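You need the Levenshtein distance algorithm, which counts the minimum number of insert, delete, and substitute operations needed to turn one string into another. In the simplest approach, divide the Levenshtein distance by the length of the string to get a normalized difference, convert that to a 0 to 100 similarity score, and compare it against your fuzzyMatchScoreThreshold.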
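A minimal sketch of that idea, filling in the IsFuzzyMatch stub from the question, might look like the following. The class name FuzzyMatcher, the helper SimilarityScore, the choice to divide by the longer string's length, and the case-insensitive comparison are all assumptions rather than part of the original answer.

```csharp
using System;

// Hypothetical wrapper class for the sketch.
public static class FuzzyMatcher
{
    // Dynamic-programming Levenshtein distance, keeping only two rows in memory.
    public static int LevenshteinDistance(string a, string b)
    {
        if (a.Length == 0) return b.Length;
        if (b.Length == 0) return a.Length;

        int[] previous = new int[b.Length + 1];
        int[] current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1,   // insert
                             previous[j] + 1),     // delete
                    previous[j - 1] + cost);       // substitute (or keep)
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.Length];
    }

    // Assumption: score = 100 * (1 - distance / length of the longer string).
    public static int SimilarityScore(string a, string b)
    {
        int maxLength = Math.Max(a.Length, b.Length);
        if (maxLength == 0) return 100; // two empty strings are identical
        int distance = LevenshteinDistance(a, b);
        return (int)Math.Round(100.0 * (maxLength - distance) / maxLength);
    }

    public static bool IsFuzzyMatch(string textToSearch, string textToSearchFor,
                                    int fuzzyMatchScoreThreshold)
    {
        // Assumption: matching should ignore case.
        int score = SimilarityScore(textToSearch.ToLowerInvariant(),
                                    textToSearchFor.ToLowerInvariant());
        return score >= fuzzyMatchScoreThreshold;
    }
}
```

Keep in mind that a plain edit distance treats reordered words as many separate edits, so the pair from the question ("Extreme Golf: The Showdown" and "Golf Extreme Showdown") scores below 80 with this sketch. If word order should not matter, splitting into words and sorting them before comparing, or using a token-based measure, is likely a better fit.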