
Splitting a string based upon a pattern

I have a string with the pattern (ab)(bc)(ca) or abc. If () is present, I need to do the insertion as follows:

pattern (ab)(bc)(ca)  OP A=ab  B=bc C=ca
pattern abc           OP A=a   B=b  C=c
pattern (abc)b c      OP A=abc B=b  C=c
pattern a (bb) c      OP A=a   B=bb C=c

How can I use regular expressions to split a string like this?


You can use Guava's Splitter class. It can split by many different things.

(Or so I thought until the Question was updated with more info)


Arg, now you added info, and I don't think any Split method will get you there. This will, however:

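import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = " (abc)b c";
// Either two or more letters enclosed in parentheses, or a single letter
Matcher matcher = Pattern.compile("(?<=\\()[a-z]{2,}(?=\\))|[a-z]").matcher(s);
while (matcher.find()){
    System.out.println(matcher.group());
}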

Now if you need the items in an array or Collection, just replace the System.out.println() call with something more sensible.

Output:

abc
b
c

The Pattern explained:

(?<=\\()  // match after an opening parenthesis
[a-z]{2,} // match two or more letters
(?=\\))   // match before closing parenthesis
|         // or
[a-z]     // match a single letter
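As a rough sketch of collecting the matches instead of printing them, the same pattern can feed a List. The class name TokenCollector and the helper tokens() are made up for this illustration and are not part of the original answer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenCollector {
    // Same pattern as in the answer: two or more letters enclosed in
    // parentheses, or one single letter.
    private static final Pattern TOKEN =
            Pattern.compile("(?<=\\()[a-z]{2,}(?=\\))|[a-z]");

    // Hypothetical helper: returns every match in order of appearance.
    public static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String[] tests = {"(ab)(bc)(ca)", "abc", "(abc)b c", "a (bb) c"};
        for (String test : tests) {
            System.out.println(test + " -> " + tokens(test));
        }
    }
}
```

Note that the pattern only recognizes lowercase letters a-z, so digits or uppercase input would be skipped silently.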


Check out String.split(..).


Here's one approach. It doesn't really "split it" in one go, but this is probably what I would have done.

String[] tests = {"(ab)(bc)(ca)", "abc", "(abc)b c", "a (bb) c" };

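// A parenthesized group (matched reluctantly) or any single character,
// with optional surrounding whitespace
Pattern p = Pattern.compile("\\s*(\\(.*?\\)|.)\\s*");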

for (String test : tests) {
    Matcher m = p.matcher(test);

    System.out.println("Test: " + test);
    while (m.find())
        System.out.println("   Part: " + m.group().replaceAll("[() ]", ""));

    System.out.println();
}

Output:

Test: (ab)(bc)(ca)
   Part: ab
   Part: bc
   Part: ca

Test: abc
   Part: a
   Part: b
   Part: c

Test: (abc)b c
   Part: abc
   Part: b
   Part: c

Test: a (bb) c
   Part: a
   Part: bb
   Part: c

Something like this may even do (I may have exploited a property of your example that is not present in your "real" problem. I hate when people do this with my questions, so I apologize in advance if this is the case!):

import java.util.Arrays;

String[] tests = {"(ab)(bc)(ca)", "abc", "(abc)b c", "a (bb) c" };

for (String test : tests) {

    String[] parts = test.length() == 3
        ? test.split("(?<=.)")
        : test.replaceAll("[()]", " ").trim().split("\\s+");

    System.out.printf("Test: %-16s   Parts: %s%n", test, Arrays.toString(parts));
}

Output:

Test: (ab)(bc)(ca)       Parts: [ab, bc, ca]
Test: abc                Parts: [a, b, c]
Test: (abc)b c           Parts: [abc, b, c]
Test: a (bb) c           Parts: [a, bb, c]
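
The length check above does rely on the sample data: an input such as "(a)" also has length 3 and would be split into "(", "a" and ")", while an unparenthesized input of any other length, such as "abcd", would come back as a single part. A possible sketch that avoids the length test uses two capture groups, one for the text inside parentheses and one for a lone character. The class name GroupSplitter and the method parts() are invented for this example, and the handling of stray parentheses (they are simply skipped) is an assumption, since the question doesn't cover it:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupSplitter {
    // Group 1: the text inside one pair of parentheses.
    // Group 2: a single character that is neither whitespace nor a parenthesis.
    private static final Pattern PART =
            Pattern.compile("\\(([^()]*)\\)|([^\\s()])");

    // Hypothetical helper: returns the parts in order of appearance.
    public static List<String> parts(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = PART.matcher(input);
        while (m.find()) {
            // Exactly one of the two groups participates in each match.
            result.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] tests = {"(ab)(bc)(ca)", "abc", "(abc)b c", "a (bb) c", "(a)", "abcd"};
        for (String test : tests) {
            System.out.printf("Test: %-16s   Parts: %s%n", test, parts(test));
        }
    }
}
```

Because the parentheses are consumed by the pattern itself, no replaceAll() clean-up is needed afterwards.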