
Java regex from Perl-type regex

I'm trying to extract hours, minutes, seconds, and nanoseconds from a timestamp string in a log file. Here is the input string I am testing with:

 SOME_TEXT,+09:30:01.040910105,SOME_TEXT,SOME_TEXT,SOME_TEXT

In Perl/Python, I would use the following regex to group the fields I am interested in:

 (\d\d)\:(\d\d)\:(\d\d)\.(\d{9})

You can verify that the regex works with the test string at http://regexpal.com if you're curious.

So I tried to write a simple Java program that can extract the fields:

import java.util.regex.*;

public class Driver
{
  static public void main(String[] args)
  {
    String t = new String("SOME_TEXT,+09:30:01.040910105,SOME_TEXT,SOME_TEXT,SOME_TEXT");
    Pattern regex = Pattern.compile("(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9})");
    Matcher matches = regex.matcher(t);
    for (int i=1; i<matches.groupCount(); ++i)
    {
      System.out.println(matches.group(i));
    }
  }
}

My regex did not translate correctly, however. The following exception shows that it did not find any matches:

 Exception in thread "main" java.lang.IllegalStateException: No match found
   at java.util.regex.Matcher.group(Matcher.java:485)
   at Driver.main(Driver.java:12)

How would I properly translate the regex from Perl/Python style to Java?


The regex itself is fine. There are, however, two problems with the code:

  1. you need to call Matcher.find();
  2. you need to fix the for loop (it should use <= instead of <).

Here is the corrected version:

import java.util.regex.*;

public class Driver
{
  static public void main(String[] args)
  {
    String t = "SOME_TEXT,+09:30:01.040910105,SOME_TEXT,SOME_TEXT,SOME_TEXT";
    Pattern regex = Pattern.compile("(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9})");
    Matcher matcher = regex.matcher(t);
    // find() must succeed before group() is called; each call advances to the next match
    while (matcher.find()) {
        // capturing groups are numbered 1..groupCount(), so the bound is inclusive
        for (int i=1; i<=matcher.groupCount(); ++i)
        {
          System.out.println(matcher.group(i));
        }
    }
  }
}

This prints out:

09
30
01
040910105
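
If you want the fields as numbers rather than strings, the same find-then-group approach can be wrapped in a small helper. This is only a sketch: the class name TimestampFields and the method parse are made up for illustration, and it assumes the line contains at most one timestamp you care about (only the first match is used).

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimestampFields {
    // Same pattern as in the question: HH:MM:SS.nnnnnnnnn
    private static final Pattern TIMESTAMP =
        Pattern.compile("(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9})");

    // Hypothetical helper: returns {hours, minutes, seconds, nanoseconds},
    // or null when the line contains no timestamp.
    public static int[] parse(String line) {
        Matcher m = TIMESTAMP.matcher(line);
        if (!m.find()) {
            return null; // group() would throw IllegalStateException here
        }
        int[] fields = new int[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            // parseInt reads base 10, so leading zeros such as "09" are fine
            fields[i - 1] = Integer.parseInt(m.group(i));
        }
        return fields;
    }

    public static void main(String[] args) {
        String t = "SOME_TEXT,+09:30:01.040910105,SOME_TEXT,SOME_TEXT,SOME_TEXT";
        System.out.println(Arrays.toString(parse(t))); // prints [9, 30, 1, 40910105]
    }
}
```

Returning null for "no match" keeps the sketch short; an Optional or an exception may suit real log-parsing code better.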


Java departs from the Perl style and introduces complexity where it isn't needed. If you want Perl-style regular expressions in Java, take a look at MentaRegex. Some examples are below:

The method matches returns a boolean saying whether we have a regex match or not.

matches("Sergio Oliveira Jr.", "/oliveira/i" ) => true

The method match returns an array with the groups matched. So it not only tells you whether you have a match or not but it also returns the groups matched in case you have a match.

match("aa11bb22", "/(\\d+)/g" ) => ["11", "22"]

The method sub allows you to perform substitutions with regex.

sub("aa11bb22", "s/\\d+/00/g" ) => "aa00bb00"

It supports global and case-insensitive regexes.

match("aa11bb22", "/(\\d+)/" ) => ["11"]
match("aa11bb22", "/(\\d+)/g" ) => ["11", "22"]
matches("Sergio Oliveira Jr.", "/oliveira/" ) => false
matches("Sergio Oliveira Jr.", "/oliveira/i" ) => true

It also lets you change the escape character if you don't like seeing so many '\'.

match("aa11bb22", "/(\\d+)/g" ) => ["11", "22"]
match("aa11bb22", "/(#d+)/g", '#' ) => ["11", "22"]


Matcher.matches() matches against the whole string, so if you use it instead of Matcher.find() you have to add .* to the beginning and end of the pattern:

Pattern regex = Pattern.compile(".*(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9}).*");

That should work, together with the correction to your for loop :-) With Matcher.find(), the original pattern works unchanged.
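
To make the whole-string point concrete, here is a small sketch comparing Matcher.matches(), which requires the entire input to match, with Matcher.find(), which searches for the pattern anywhere in the input. The class name MatchVsFind and its two helper methods are invented for this example.

```java
import java.util.regex.Pattern;

public class MatchVsFind {
    // Unanchored pattern from the question, with no leading or trailing .*
    private static final Pattern TIMESTAMP =
        Pattern.compile("(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9})");

    // True only if the entire string is a timestamp.
    public static boolean wholeStringMatches(String s) {
        return TIMESTAMP.matcher(s).matches();
    }

    // True if a timestamp occurs anywhere in the string.
    public static boolean containsMatch(String s) {
        return TIMESTAMP.matcher(s).find();
    }

    public static void main(String[] args) {
        String line = "SOME_TEXT,+09:30:01.040910105,SOME_TEXT";
        System.out.println(wholeStringMatches(line));                 // false
        System.out.println(containsMatch(line));                      // true
        System.out.println(wholeStringMatches("09:30:01.040910105")); // true
    }
}
```

For pulling a timestamp out of a longer log line, find() is usually the simpler choice, since the pattern does not need to be padded with .* on both sides.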


I copied your code and wrapped the loop in if (matches.find()) { ... }, and then it worked. You need that call.

The nanoseconds were also missing from the output. Change the loop condition from < to <=:

for (int i = 1; i <= matches.groupCount(); ++i)
-------------------^
