Can you use zero-width matching regex in String split?
System.out.println(
Arrays.deepToString(
"abc<def>ghi".split("(?:<)|(?:>)")
)
);
This prints [abc, def, ghi], as if I had split on "<|>". I want it to print [abc, <def>, ghi]. Is there a way to work some regex magic to accomplish what I want here?
Perhaps a simpler example:
System.out.println(
Arrays.deepToString(
"Hello! Oh my!! Good bye!!".split("(?:!+)")
)
);
This prints [Hello, Oh my, Good bye]. I want it to print [Hello!, Oh my!!, Good bye!!].
You need to take a look at zero width matching constructs:
(?=X) X, via zero-width positive lookahead
(?!X) X, via zero-width negative lookahead
(?<=X) X, via zero-width positive lookbehind
(?<!X) X, via zero-width negative lookbehind
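To make the list above concrete, here is a minimal sketch of the two positive lookarounds used on their own with split(). The class and method names are made up for illustration. Because a lookaround matches a position rather than any characters, nothing is removed from the pieces.

```java
import java.util.Arrays;

public class LookaroundSplitDemo {
    // Hypothetical helper: split at the position just before each '<'
    // (zero-width positive lookahead, so the '<' itself is kept).
    static String[] splitBeforeOpen(String s) {
        return s.split("(?=<)");
    }

    // Hypothetical helper: split at the position just after each '>'
    // (zero-width positive lookbehind, so the '>' itself is kept).
    static String[] splitAfterClose(String s) {
        return s.split("(?<=>)");
    }

    public static void main(String[] args) {
        String s = "abc<def>ghi";
        System.out.println(Arrays.toString(splitBeforeOpen(s))); // [abc, <def>ghi]
        System.out.println(Arrays.toString(splitAfterClose(s))); // [abc<def>, ghi]
    }
}
```

Joining the two patterns with | splits at both kinds of position.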
You can use \b (word boundary) as what to look for, since it is zero-width, and use that as the anchor for looking for < and >.
String s = "abc<def>ghi";
String[] bits = s.split("(?<=>)\\b|\\b(?=<)");
for (String bit : bits) {
System.out.println(bit);
}
Output:
abc
<def>
ghi
Now that isn't a general solution. You will probably need to write a custom split method for that.
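As a rough idea of what such a custom split method could look like, here is a sketch that walks the input with a Matcher and keeps both the matched tokens and the text between them. The helper name splitKeepingMatches and the token pattern <[^>]*> are assumptions made for this example and are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitKeepingMatches {
    // Hypothetical helper: returns the text between matches and the
    // matches themselves, in the order they appear in the input.
    static List<String> splitKeepingMatches(String input, Pattern token) {
        List<String> parts = new ArrayList<>();
        Matcher m = token.matcher(input);
        int last = 0; // end of the previous match
        while (m.find()) {
            if (m.start() > last) {
                // Non-empty text between the previous match and this one.
                parts.add(input.substring(last, m.start()));
            }
            if (m.end() > m.start()) {
                // The matched token itself (empty matches are skipped).
                parts.add(m.group());
            }
            last = m.end();
        }
        if (last < input.length()) {
            // Whatever is left after the final match.
            parts.add(input.substring(last));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Assumed token pattern: '<', then anything except '>', then '>'.
        Pattern tag = Pattern.compile("<[^>]*>");
        System.out.println(splitKeepingMatches("abc<def>ghi", tag)); // [abc, <def>, ghi]
    }
}
```

Unlike String.split(), this sketch never produces empty strings, which may or may not be what you want.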
Your second example suggests it's not really split() you're after but a regex matching loop. For example:
String s = "Hello! Oh my!! Good bye!!";
Pattern p = Pattern.compile("(.*?!+)\\s*");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println("[" + m.group(1) + "]");
}
Output:
[Hello!]
[Oh my!!]
[Good bye!!]
Thanks to information from Cine, I think these are the answers I'm looking for:
System.out.println(
Arrays.deepToString(
"abc<def>ghi<x><x>".split("(?=<)|(?<=>)")
)
); // [abc, <def>, ghi, <x>, <x>]
System.out.println(
Arrays.deepToString(
"Hello! Oh my!! Good bye!! IT WORKS!!!".split("(?<=!++)")
)
); // [Hello!, Oh my!!, Good bye!!, IT WORKS!!!]
Now, the second one was honestly discovered by experimenting with all the different quantifiers. Neither the greedy nor the reluctant quantifier works, but the possessive one does.
I'm still not sure why.
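This does not explain the quantifier behavior, but one way to sidestep it is to keep quantifiers out of the lookbehind entirely: split where the previous character is ! and the next character is not !. The sketch below also consumes any whitespace after the run, on the assumption that the leading spaces are unwanted. The class and method names are made up for illustration.

```java
import java.util.Arrays;

public class SplitAfterBangs {
    // Hypothetical helper: split after each run of '!' characters.
    //   (?<=!)  the previous character is '!'
    //   (?!!)   the next character is not '!' (so we are at the end of the run)
    //   \\s*    also swallow whitespace that follows the run (an assumption)
    static String[] splitAfterBangs(String s) {
        return s.split("(?<=!)(?!!)\\s*");
    }

    public static void main(String[] args) {
        String s = "Hello! Oh my!! Good bye!! IT WORKS!!!";
        System.out.println(Arrays.toString(splitAfterBangs(s)));
        // [Hello!, Oh my!!, Good bye!!, IT WORKS!!!]
    }
}
```

The match at the very end of the string would produce a trailing empty string, which split() discards by default.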