开发者

The length of longest common subsequence of two strings

I am trying to write a function which computes the length of the longest common subsequence of the two input strings str1 and str2.

This is what I have right now,

(define LCS
  (lambda (str1 str2)
    (if (OR (equal? str1 "") (equal? str2 ""))
        0
        (if (equal? (string-contains str1 (string-ref str2 0)) #t) 
            (+ 1 (LCS (substring str1 1 (string-length str1)) 
                      (substring str2 1 (string-length str2))))
            (LCS (substring str1 1 (string-length str1)) 
                 (substring str2 1 (string-length str2)))))))

Where string-conta开发者_Go百科ins returns true if a string has a certain character in it. Right now it seems like it works, but I think there might be a bug.


Your code is totally on the right track if you don't mind a relatively slow algorithm; there's a more sophisticated approach to the problem, with dynamic programming, if you need it to be faster.

Right now, the bug in your code is that you're moving down the length of both strings simultaneously with each recursive call. It would fail, for example (I think, I haven't tried it) with the following two strings: (LCS "scheme" "emesch") Reason being is that the matching substrings don't start at the same position in the first and second string.

I suggest that you split up the recursion into two recursive calls: one where you remove a character off the front of only the first string, and one where you remove a character off the front of only the second. Then, you take the best result from either of those calls. In that fashion, you can be sure that you'll find substrings no matter where they lie in the other string.


The two strings are scanned in parallel. If the current two characters are identical, the length of the longest common subsequence increases by one, and the scan continues at the next character in each string; otherwise, there are two possibilities to consider recursively, after deleting the current character from either one input sequence or the other, and the length of the longest common subsequence is simply the greater of the two possibilities.

A more complete explanation and an implementation in Scheme are available at my blog.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜