find the longest sequence S that is a subsequence of A,B,C string
Give a polynomial time algorithm that takes three strings, A, B and C, as input, and returns the longest s开发者_开发问答equence S that is a subsequence of A, B, and C.
Let dp[i, j, k] = longest common subsequence of prefixes A[1..i], B[1..j], C[1..k]
We have:
dp[i, j, k] = dp[i - 1, j - 1, k - 1] + 1 if A[i] = B[j] = C[k]
max(dp[i - 1, j, k], dp[i, j - 1, k], dp[i, j, k - 1]) otherwise
Similar to the 2d case, except you have 3 dimensions. Complexity is O(len A * len B * len C)
.
Here's a solution in Python for an arbitrary number of sequences. You could use it to test your solution for 2D, 3D cases. It closely follows Wikipedia's algorithm:
#!/usr/bin/env python
import functools
from itertools import starmap
@memoize
def lcs(*seqs):
"""Find longest common subsequence of `seqs` sequences.
Complexity: O(len(seqs)*min(seqs, key=len)*reduce(mul,map(len,seqs)))
"""
if not all(seqs): return () # at least one sequence is empty
heads, tails = zip(*[(seq[0], seq[1:]) for seq in seqs])
if all(heads[0] == h for h in heads): # all seqs start with the same element
return (heads[0],) + lcs(*tails)
return max(starmap(lcs, (seqs[:i]+(tails[i],)+seqs[i+1:]
for i in xrange(len(seqs)))), key=len)
def memoize(func):
cache = {}
@functools.wraps(func)
def wrapper(*args):
try: return cache[args]
except KeyError:
r = cache[args] = func(*args)
return r
return wrapper
Note: without memoization it is an exponential algorithm (wolfram alpha):
$ RSolve[{a[n] == K a[n-1] + K, a[0] = K}, a[n], n]
a(n) = (K^(n + 1) - 1) K/(K - 1)
where K
== len(seqs)
and n
== max(map(len, seqs))
Examples
>>> lcs("agcat", "gac")
('g', 'a')
>>> lcs("banana", "atana")
('a', 'a', 'n', 'a')
>>> lcs("abc", "acb")
('a', 'c')
>>> lcs("XMJYAUZ", "MZJAWXU")
('M', 'J', 'A', 'U')
>>> lcs("XMJYAUZ")
('X', 'M', 'J', 'Y', 'A', 'U', 'Z')
>>> lcs("XMJYAUZ", "MZJAWXU", "AMBCJDEFAGHI")
('M', 'J', 'A')
>>> lcs("XMJYAUZ", "MZJAWXU", "AMBCJDEFAGUHI", "ZYXJAQRU")
('J', 'A', 'U')
>>> lcs() #doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
ValueError:
>>> lcs(*"abecd acbed".split())
('a', 'b', 'e', 'd')
>>> lcs("acd", lcs("abecd", "acbed"))
('a', 'd')
>>> lcs(*"abecd acbed acd".split())
('a', 'c', 'd')
All you have to do is Google "longest subsequence".
This is the top link: http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
If you have any particular problem understanding it then please ask here, preferably with a more specific question.
精彩评论