String/Sequence Pattern Mining

I've been trying to find an answer to this question for a week, and I would appreciate it if anyone could help. I have a list of strings (originally a list of sequences, which can be viewed as a list of strings), and I'd like to find a pattern (which is itself a string) within the strings of this list. Is there a Java library I can use, or a tool (like Weka, which doesn't do this) that can help?


Sounds like you want to find the longest common subsequence of those strings. This is a well-known algorithmic problem that is commonly solved using dynamic programming. See here for various implementations in multiple languages.
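To make the dynamic programming idea concrete, here is a minimal Java sketch for the two-string case only. The class and method names (Lcs, lcs) are made up for illustration and do not come from any library. Extending this to a whole list of strings is a harder problem and is not covered by this sketch.

```java
// Hypothetical helper, not part of any library: longest common
// subsequence of two strings using the classic DP table.
final class Lcs {
    static String lcs(String a, String b) {
        int n = a.length(), m = b.length();
        // dp[i][j] = LCS length of the first i chars of a and the first j chars of b
        int[][] dp = new int[n + 1][m + 1];
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        // Walk back through the table to recover one optimal subsequence.
        StringBuilder out = new StringBuilder();
        int i = n, j = m;
        while (i > 0 && j > 0) {
            if (a.charAt(i - 1) == b.charAt(j - 1)) {
                out.append(a.charAt(i - 1));
                i--;
                j--;
            } else if (dp[i - 1][j] >= dp[i][j - 1]) {
                i--;
            } else {
                j--;
            }
        }
        return out.reverse().toString();
    }
}
```

Calling Lcs.lcs("AGGTAB", "GXTXAYB") returns "GTAB". The table costs O(n*m) time and memory, which is fine for short strings but may need a more memory-frugal variant for long sequences.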


If you want to find patterns that occur frequently in a set of sequences, then you could try "sequential pattern mining" or "sequential rule mining" algorithms.

There are several implementations of these algorithms in my SPMF Java open-source data mining library.
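As a rough illustration of what "frequent pattern" means here, the sketch below counts, for a fixed pattern length, how many input strings contain each contiguous substring, and keeps those that reach a minimum support. This is a deliberately naive stand-in and not the SPMF API: the names FrequentSubstrings, mine, length and minSupport are assumptions for the example, and real sequential pattern mining algorithms also handle patterns with gaps and varying lengths.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, not part of SPMF: naive frequent-substring counting.
final class FrequentSubstrings {
    // Returns each substring of the given length together with its support,
    // i.e. the number of input strings that contain it at least once.
    static Map<String, Integer> mine(List<String> sequences, int length, int minSupport) {
        Map<String, Integer> support = new TreeMap<>();
        for (String s : sequences) {
            // Collect distinct substrings so a string counts once per pattern.
            Set<String> seen = new HashSet<>();
            for (int i = 0; i + length <= s.length(); i++) {
                seen.add(s.substring(i, i + length));
            }
            for (String sub : seen) {
                support.merge(sub, 1, Integer::sum);
            }
        }
        // Drop patterns that occur in fewer than minSupport strings.
        support.values().removeIf(count -> count < minSupport);
        return support;
    }
}
```

For the inputs "abcd", "xbcy" and "bcz" with length 2 and minSupport 2, the only surviving pattern is "bc", with support 3. A dedicated library is the better choice once the patterns may contain gaps or the data set grows large.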
