Java regex from Perl-type regex
I'm trying to extract hours, minutes, seconds, and nanoseconds from a time stamp string in a log file. Here is the input string I am testing with:
SOME_TEXT,+09:30:01.040910105,SOME_TEXT,SOME_TEXT,SOME_TEXT
In Perl/Python, I would use the following regex to group the fields I am interested in:
(\d\d)\:(\d\d)\:(\d\d)\.(\d{9})
You can verify that the regex works with the test string at http://regexpal.com if you're curious.
So I tried to write a simple Java program that can extract the fields:
import java.util.regex.*;
public class Driver
{
    static public void main(String[] args)
    {
        String t = new String("SOME_TEXT,+09:30:01.040910105,SOME_TEXT,SOME_TEXT,SOME_TEXT");
        Pattern regex = Pattern.compile("(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9})");
        Matcher matches = regex.matcher(t);
        for (int i=1; i<matches.groupCount(); ++i)
        {
            System.out.println(matches.group(i));
        }
    }
}
My regex did not translate correctly, however. The following exception shows that it did not find any matches:
Exception in thread "main" java.lang.IllegalStateException: No match found
at java.util.regex.Matcher.group(Matcher.java:485)
at Driver.main(Driver.java:12)
How would I properly translate the regex from Perl/Python style to Java?
The regex itself is fine. There are, however, two problems with the code:
- you need to call Matcher.find();
- you need to fix the for loop (it should use <= instead of <).
Here is the corrected version:
import java.util.regex.*;

public class Driver
{
    static public void main(String[] args)
    {
        String t = "SOME_TEXT,+09:30:01.040910105,SOME_TEXT,SOME_TEXT,SOME_TEXT";
        Pattern regex = Pattern.compile("(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9})");
        Matcher matcher = regex.matcher(t);
        // find() must succeed before group() may be called
        while (matcher.find()) {
            // groups are numbered 1..groupCount(), so the bound is inclusive
            for (int i=1; i<=matcher.groupCount(); ++i)
            {
                System.out.println(matcher.group(i));
            }
        }
    }
}
This prints out:
09
30
01
040910105
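If you want the fields as numbers rather than printed strings, the same find()-then-group() sequence can be wrapped in a small helper. This is only a sketch: the class name TimestampFields and the method parse are made up for illustration, and returning null when nothing matches is just one possible convention.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question or answer.
final class TimestampFields {
    // Same pattern as in the question; compiled once and reused.
    private static final Pattern TIME =
            Pattern.compile("(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9})");

    // Returns {hours, minutes, seconds, nanoseconds}, or null if no time stamp is found.
    static int[] parse(String line) {
        Matcher m = TIME.matcher(line);
        if (!m.find()) {
            // Calling group() here would throw IllegalStateException.
            return null;
        }
        int[] fields = new int[m.groupCount()];
        // Groups are numbered 1..groupCount(), inclusive.
        for (int i = 1; i <= m.groupCount(); i++) {
            fields[i - 1] = Integer.parseInt(m.group(i));
        }
        return fields;
    }

    public static void main(String[] args) {
        int[] f = parse("SOME_TEXT,+09:30:01.040910105,SOME_TEXT,SOME_TEXT,SOME_TEXT");
        System.out.println(Arrays.toString(f)); // prints [9, 30, 1, 40910105]
    }
}
```

Note that Integer.parseInt reads the digits as decimal, so the leading zeros are simply dropped.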
Java breaks with the Perl style, introducing complexity where none is needed. If you want Perl-style regular expressions in Java, take a look at MentaRegex. Below are some examples:
The method matches returns a boolean saying whether we have a regex match or not.
matches("Sergio Oliveira Jr.", "/oliveira/i" ) => true
The method match returns an array with the groups matched. So it not only tells you whether you have a match or not but it also returns the groups matched in case you have a match.
match("aa11bb22", "/(\\d+)/g" ) => ["11", "22"]
The method sub allows you to perform substitutions with regex.
sub("aa11bb22", "s/\\d+/00/g" ) => "aa00bb00"
It supports global and case-insensitive matching.
match("aa11bb22", "/(\\d+)/" ) => ["11"]
match("aa11bb22", "/(\\d+)/g" ) => ["11", "22"]
matches("Sergio Oliveira Jr.", "/oliveira/" ) => false
matches("Sergio Oliveira Jr.", "/oliveira/i" ) => true
Allows you to change the escape character in case you don't like to see so many '\'.
match("aa11bb22", "/(\\d+)/g" ) => ["11", "22"]
match("aa11bb22", "/(#d+)/g", '#' ) => ["11", "22"]
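For comparison, here is a rough sketch of how the first few operations above (case-insensitive test, global match, substitution) might look using only java.util.regex and String.replaceAll from the standard library. The class PlainRegexExamples and its helper methods are hypothetical names for this illustration; they are not part of MentaRegex or the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helpers built only on the standard library.
final class PlainRegexExamples {
    // Case-insensitive "contains" test, comparable to /oliveira/i.
    static boolean containsIgnoreCase(String input, String regex) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input).find();
    }

    // Collects group 1 of every match, comparable to /(\d+)/g.
    static List<String> allFirstGroups(String input, String regex) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(containsIgnoreCase("Sergio Oliveira Jr.", "oliveira")); // true
        System.out.println(allFirstGroups("aa11bb22", "(\\d+)"));                  // [11, 22]
        // Global substitution, comparable to s/\d+/00/g.
        System.out.println("aa11bb22".replaceAll("\\d+", "00"));                   // aa00bb00
    }
}
```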
Note that Matcher.matches() (unlike Matcher.find()) matches against the whole string, so if you use matches() you have to add .* to the beginning and end:
Pattern regex = Pattern.compile(".*(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9}).*");
That should work, provided you call matches() on the matcher before reading the groups and fix the for loop bound (<= instead of <) :-)
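To see the difference between the two calls, here is a small sketch comparing Matcher.matches() and Matcher.find() on the input from the question. The class FindVersusMatches and its method names are invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only Pattern and Matcher come from the JDK.
final class FindVersusMatches {
    static final String INPUT =
            "SOME_TEXT,+09:30:01.040910105,SOME_TEXT,SOME_TEXT,SOME_TEXT";
    static final String CORE = "(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9})";

    // matches() succeeds only if the regex covers the entire input.
    static boolean wholeInputMatches(String regex) {
        return Pattern.compile(regex).matcher(INPUT).matches();
    }

    // find() succeeds if the regex matches anywhere in the input.
    static boolean foundSomewhere(String regex) {
        return Pattern.compile(regex).matcher(INPUT).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches(CORE));               // false
        System.out.println(wholeInputMatches(".*" + CORE + ".*")); // true
        System.out.println(foundSomewhere(CORE));                  // true
    }
}
```

So the .* wrapping is only needed with matches(); with find() the original pattern can stay as it is.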
I copied your code and wrapped the loop in if (matches.find()) { ... }, and then it worked. You need that call.
The nanoseconds were still missing, though, so you should also make this change:
for (int i = 1; i <= matches.groupCount(); ++i)
-------------------^
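The reason for <= is that groupCount() does not count group 0 (the whole match), so the capturing groups run from 1 up to and including groupCount(). A minimal sketch, assuming a shortened version of the input from the question; GroupCountDemo is a made-up class name:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for illustration only.
final class GroupCountDemo {
    // Returns a matcher that has already found the time stamp.
    static Matcher matched() {
        Matcher m = Pattern.compile("(\\d\\d):(\\d\\d):(\\d\\d)\\.(\\d{9})")
                .matcher("SOME_TEXT,+09:30:01.040910105,SOME_TEXT");
        if (!m.find()) {
            throw new IllegalStateException("sample input should contain a time stamp");
        }
        return m;
    }

    public static void main(String[] args) {
        Matcher m = matched();
        System.out.println(m.groupCount());          // 4
        System.out.println(m.group(0));              // 09:30:01.040910105
        System.out.println(m.group(m.groupCount())); // 040910105
    }
}
```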