开发者

How to get good perfomance of Regex in java

Below is example of text:

String id = "A:abc,X:def,F:xyz,A:jkl";

Below is regex:

Pattern p = Pattern.compile("(.*,)?[AC]:[^:]+$");
if(p.matcher(id).matches()) {
  System.out.println("Hello world!")
}

When executed above code should print Hello world!.

Does this regex can be modified to 开发者_开发百科gain more performance?


As I can't see your entire code, I can only assume that you do the pattern compilation inside your loop/method/etc. One thing that can improve performance is to compile at the class level and not recompile the pattern each time. Other than that, I don't see much else that you could change.


Pattern p = Pattern.compile(".*[AC]:[^:]+$");
if(p.matcher(id).matches()) {
  System.out.println("Hello world!")
}

As you seem to only be interested if it the string ends in A or C followed by a colon and some characters which aren't colons you can just use .* instead of (.*,)? (or do you really want to capture the stuff before the last piece?)

If the stuff after the colon is all lower case you could even do

Pattern p = Pattern.compile(".*[AC]:[a-z]+$");

And if you are going to match this multiple times in a row (e.g. loop) be sure to compile the pattern outside of the loop.

e,g

Pattern p = Pattern.compile(".*[AC]:[a-z]+$");
Matcher m = p.matcher(id);
while(....) {
  ...
  // m.matches()
  ...
  // prepare for next loop m.reset(newvaluetocheck);
}


Move Pattern instantiation to a final static field (erm, constant), in your current code you're recompiling essentially the same Pattern every single time (no, Pattern doesn't cache anything!). That should give you some noticeable performance boost right off the bat.


Do you even need to use regualr expressions? It seems there isn't a huge variety in what you are testing.

If you need to use the regex as others have said, compiling it only once makes sense and if you only need to check the last token maybe you could simplify the regex to: [AC]:[^:]{3}$.

Could you possibly use something along these lines (untested...)?

private boolean isId(String id)
    {
        char[] chars = id.toCharArray();
        boolean valid = false;
        int length = chars.length;

        if (length >= 5 && chars[length - 4] == ':')
        {
            char fifthToLast = chars[length - 5];

            if (fifthToLast == 'A' || fifthToLast == 'C')
            {
                valid = true;

                for (int i = length - 1; i >= length - 4; i--)
                {
                    if (chars[i] == ':')
                    {
                        valid = false;
                        break;
                    }
                }
            }
        }

        return valid;
    }
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜