Java regular expression longest match

I am having a problem with a generic regex that matches (sort of) a typical string of the form

... "field1" "field2" "field3" "field4" ...

What I want to do is, of course, get each of these fields separately. Because the fields can contain any character, I am using a "catch-all" regex of the form

... \"(.*?)\" +\"(.*?)\" +\"(.*?)\" +\"(.*?)\" + ...

The problem is that, instead of producing 4 different groups, Java gives me just one, which merges the 4 above, i.e. I get a single field:

field1" "field2" "field3" "field4

instead of

field1
field2
field3
field4

I have even tried doing things like \"([^\"]*)\" for each of the fields, but the result is the same.

How could I get these 4 fields separately?


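You could try the String.split method for input like this. Element 0 of the resulting array holds whatever precedes the first quote, so the fields start at index 1.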

    String input = "... \"field1\" \"field2\" \"field3\" \"field4\" ...";
    String[] split = input.split("\"\\s*\"?");
    String field1 = split[1];  // field1
    String field2 = split[2];  // field2
    String field3 = split[3];  // field3
    String field4 = split[4];  // field4


Each call to matcher.find() will move to the next match:

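    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String input = "... \"field1\" \"field2\" \"field3\" \"field4\" ...";
    // One quoted field per match; group(1) is the text between the quotes
    Matcher matcher = Pattern.compile("\"(.*?)\"").matcher(input);
    while (matcher.find())
        System.out.println(matcher.group(1));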

or, if you really want to capture all four in one match:

    Matcher matcher = Pattern.compile("\"(.*?)\".*?\"(.*?)\".*?\"(.*?)\".*?\"(.*?)\".*?").matcher(input);
    if (matcher.find()) {
        System.out.println(matcher.group(1));
        System.out.println(matcher.group(2));
        System.out.println(matcher.group(3));
        System.out.println(matcher.group(4));
    }

Both produce the same output, which is:

field1
field2
field3
field4
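
If the number of fields is not known in advance, the find() loop can also collect the fields into a list. The following is only a sketch: the class name QuotedFields and the method extract are made up for illustration, and it uses the negated character class from the question (\"([^\"]*)\") on the assumption that a field never contains a double quote itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for this example.
class QuotedFields {
    // A quote, any run of non-quote characters (captured), then a closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Returns the contents of every double-quoted field, in order of appearance.
    public static List<String> extract(String input) {
        List<String> fields = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            fields.add(m.group(1)); // group(1) excludes the surrounding quotes
        }
        return fields;
    }

    public static void main(String[] args) {
        String input = "... \"field1\" \"field2\" \"field3\" \"field4\" ...";
        for (String field : extract(input)) {
            System.out.println(field);
        }
    }
}
```

Because [^\"]* cannot cross a quote, each match is confined to a single field, and an empty field ("") comes back as an empty string rather than being skipped.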


Are you calling matcher.group(1), matcher.group(2), etc. to get the individual fields? The no-argument matcher.group() is equivalent to matcher.group(0) and returns the whole match, which spans all the fields.
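
To see the difference, here is a small sketch that returns group 0 alongside the numbered groups. The class name GroupDemo and the method groups are invented for this example, and the pattern is the four-field one from the question with the leading and trailing "..." left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is an assumption for this example.
class GroupDemo {
    // Four quoted fields separated by one or more spaces, one capture group per field.
    private static final Pattern FIELDS =
        Pattern.compile("\"(.*?)\" +\"(.*?)\" +\"(.*?)\" +\"(.*?)\"");

    // Returns group 0 (the whole match) followed by capture groups 1..4,
    // or an empty array when the input does not match.
    public static String[] groups(String input) {
        Matcher m = FIELDS.matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "... \"field1\" \"field2\" \"field3\" \"field4\" ...";
        String[] g = groups(input);
        for (int i = 0; i < g.length; i++) {
            System.out.println("group(" + i + ") = " + g[i]);
        }
    }
}
```

Index 0 of the returned array is the entire matched text, from the first quote to the last, while indexes 1 to 4 hold the individual fields without their quotes.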
