开发者

Find out how much percent one string contains in another

I need to find out how much percentage or chars does one string contains into another string. I've tried Levenshtein Distance but that algorithm returns how much char's are needed to be change for the strings to be equal. Can some one help? I need it in c# but tha开发者_如何学运维t's not so important.

The answer code: public double LongestCommonSubsequence(string s1, string s2) { //if either string is empty, the length must be 0 if (String.IsNullOrEmpty(s1) || String.IsNullOrEmpty(s2)) return 0;

    int[,] num = new int[s1.Length, s2.Length];  //2D array
    char letter1;
    char letter2;

    //Actual algorithm
    for (int i = 0; i < s1.Length; i++)
    {
        letter1 = s1[i];
        for (int j = 0; j < s2.Length; j++)
        {
            letter2 = s2[j];

            if (letter1 == letter2)
            {
                if ((i == 0) || (j == 0))
                    num[i, j] = 1;
                else
                    num[i, j] = 1 + num[i - 1, j - 1];
            }
            else
            {
                if ((i == 0) && (j == 0))
                    num[i, j] = 0;
                else if ((i == 0) && !(j == 0))   //First ith element
                    num[i, j] = Math.Max(0, num[i, j - 1]);
                else if (!(i == 0) && (j == 0))   //First jth element
                    num[i, j] = Math.Max(num[i - 1, j], 0);
                else // if (!(i == 0) && !(j == 0))
                    num[i, j] = Math.Max(num[i - 1, j], num[i, j - 1]);
            }
        }//end j
    }//end i
    return (s2.Length - (double)num[s1.Length - 1, s2.Length - 1]) / s1.Length * 100; 
} //end LongestCommonSubsequence


It sounds like you might want the longest common subsequence which is the basis for diff algorithms. Unfortunately this problem is NP-hard which means there is no efficient (polynomial time) solution. The Wikipedia page has some suggestions.


Uhh... can't you just use the number of characters that need to change then?

(length(destination)-changed_character_count)/ length(source)

EDIT: based on the revised question, treat both strings as sets, compute the set intersection, and base the percentage off the size of that set and the source string as a set.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜