How to get good perfomance of Regex in java
Below is example of text:
String id = "A:abc,X:def,F:xyz,A:jkl";
Below is regex:
Pattern p = Pattern.compile("(.*,)?[AC]:[^:]+$");
if(p.matcher(id).matches()) {
System.out.println("Hello world!")
}
When executed above code should print Hello world!.
Does this regex can be modified to 开发者_开发百科gain more performance?
As I can't see your entire code, I can only assume that you do the pattern compilation inside your loop/method/etc. One thing that can improve performance is to compile at the class level and not recompile the pattern each time. Other than that, I don't see much else that you could change.
Pattern p = Pattern.compile(".*[AC]:[^:]+$");
if(p.matcher(id).matches()) {
System.out.println("Hello world!")
}
As you seem to only be interested if it the string ends in A or C followed by a colon and some characters which aren't colons you can just use .*
instead of (.*,)?
(or do you really want to capture the stuff before the last piece?)
If the stuff after the colon is all lower case you could even do
Pattern p = Pattern.compile(".*[AC]:[a-z]+$");
And if you are going to match this multiple times in a row (e.g. loop) be sure to compile the pattern outside of the loop.
e,g
Pattern p = Pattern.compile(".*[AC]:[a-z]+$");
Matcher m = p.matcher(id);
while(....) {
...
// m.matches()
...
// prepare for next loop m.reset(newvaluetocheck);
}
Move Pattern
instantiation to a final static field (erm, constant), in your current code you're recompiling essentially the same Pattern
every single time (no, Pattern
doesn't cache anything!). That should give you some noticeable performance boost right off the bat.
Do you even need to use regualr expressions? It seems there isn't a huge variety in what you are testing.
If you need to use the regex as others have said, compiling it only once makes sense and if you only need to check the last token maybe you could simplify the regex to: [AC]:[^:]{3}$
.
Could you possibly use something along these lines (untested...)?
private boolean isId(String id)
{
char[] chars = id.toCharArray();
boolean valid = false;
int length = chars.length;
if (length >= 5 && chars[length - 4] == ':')
{
char fifthToLast = chars[length - 5];
if (fifthToLast == 'A' || fifthToLast == 'C')
{
valid = true;
for (int i = length - 1; i >= length - 4; i--)
{
if (chars[i] == ':')
{
valid = false;
break;
}
}
}
}
return valid;
}
精彩评论