Does an RE2-like regular expression library exist for Java?
Has anyone come across a Java version of Google's regular expression library RE2, or a Java library with similar capabilities and good performance? The performance requirement is matching time that is linear in the length of the regular expression and the length of the input text.
Clarification
Most regular expression implementations use a backtracking algorithm to match the input text, and hence can take super-linear (in the worst case exponential) time on some simple-looking regular expressions like (.*).(.*).(.*).(.*). RE2 is a library from Google that solves this problem by using an algorithm, based on automata theory, whose running time grows linearly with the input size. The questioner wants to know whether there are libraries for Java that are based on this algorithm.
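To make the "linear in the input" idea concrete, here is a rough sketch of state-set simulation, the general automata technique the clarification refers to. This is not RE2's code or API: the class name StateSetMatcher is made up for illustration, and it only handles literals, '.', and a '*' after a single character, matching the whole string. It tracks every pattern position that could be active at once, so each input character costs at most one pass over the pattern and nothing is ever retried.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy matcher (hypothetical, not RE2): literals, '.', and 'x*' only; whole-string match. */
public class StateSetMatcher {
    private final char[] symbols;    // symbol at each pattern position ('.' = any character)
    private final boolean[] starred; // true if that symbol is followed by '*'

    public StateSetMatcher(String pattern) {
        List<Character> syms = new ArrayList<>();
        List<Boolean> stars = new ArrayList<>();
        for (int i = 0; i < pattern.length(); i++) {
            boolean star = i + 1 < pattern.length() && pattern.charAt(i + 1) == '*';
            syms.add(pattern.charAt(i));
            stars.add(star);
            if (star) {
                i++; // skip the '*' that belongs to this symbol
            }
        }
        symbols = new char[syms.size()];
        starred = new boolean[syms.size()];
        for (int i = 0; i < symbols.length; i++) {
            symbols[i] = syms.get(i);
            starred[i] = stars.get(i);
        }
    }

    /** Runs in O(pattern length * text length): no backtracking, no retries. */
    public boolean matches(String text) {
        int n = symbols.length;
        boolean[] current = new boolean[n + 1]; // state s = "about to match item s"
        current[0] = true;
        skipStars(current);
        for (int k = 0; k < text.length(); k++) {
            char c = text.charAt(k);
            boolean[] next = new boolean[n + 1];
            for (int s = 0; s < n; s++) {
                if (current[s] && (symbols[s] == '.' || symbols[s] == c)) {
                    // A starred item loops on itself; a plain item advances.
                    next[starred[s] ? s : s + 1] = true;
                }
            }
            skipStars(next);
            current = next;
        }
        return current[n]; // state n = whole pattern consumed
    }

    /** A starred item may match zero characters, so its successor is active too. */
    private void skipStars(boolean[] states) {
        for (int s = 0; s < symbols.length; s++) {
            if (states[s] && starred[s]) {
                states[s + 1] = true;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(new StateSetMatcher("ab.*d").matches("abccd")); // prints: true
        StringBuilder manyAs = new StringBuilder();
        for (int i = 0; i < 40; i++) {
            manyAs.append('a');
        }
        // Several adjacent stars and no final 'b': still one pass over the input.
        System.out.println(new StateSetMatcher("a*a*a*a*b").matches(manyAs.toString())); // prints: false
    }
}
```

A real engine also has to deal with groups, alternation, character classes and submatch extraction, which this sketch leaves out entirely.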
Google today released a pure-Java port of Go's RE2 implementation. You can find it here:
https://github.com/google/re2j
There is a finite-state automata package for Java here: www.brics.dk/automaton; also see this article. Here is a simple example:
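import dk.brics.automaton.Automaton;
import dk.brics.automaton.RegExp;

RegExp r = new RegExp("ab(c|d)*");   // parse the regular expression
Automaton a = r.toAutomaton();       // compile it to a finite-state automaton
String s = "abcccdc";
System.out.println("Match: " + a.run(s)); // prints: Match: true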
A Google search turned up this:
https://github.com/logentries/re2-java
It says it only supports 64-bit Linux.
Edit: I believe a better answer is now available, the one given by Alan Donovan, since Google themselves have released a port of RE2: https://github.com/google/re2j