开发者

efficient longest common subsequence algorithm library?

I'm looking for a (space) efficient implementation of an LCS algorithm for use in a C++ program. Inputs are two random access sequences of integers.

I'm currently using the dynamic programming approach from th开发者_StackOverflow社区e wikipedia page about LCS. However, that has O(mn) behaviour in memory and time and dies on me with out of memory errors for larger inputs.

I have read about Hirschberg's algorithm, which improves memory usage considerably, Hunt-Szymanski and Masek and Paterson. Since it isn't trivial to implement these I'd prefer to try them on my data with an existing implementation. Does anyone know of such a library? I'd imagine since text diff tools are pretty common, there ought to be some open source libraries around?


When searching for things like that, try scholar.google.com. It is much better for finding scholarly works. It turned up http://www.biotec.icb.ufmg.br/cabi/artigos/seminarios2/subsequence_algorithm.pdf this document, a "survey of longest common subsequences algorithms".


Not C++ but Python but I think usable.

http://wordaligned.org/articles/longest-common-subsequence


Hirschberg's Algorithm embeds a javascript implementation : almost C.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜