Exclude strings within parentheses from a regular expression?

I'm looking to split space-delimited strings into a series of search terms. However, in doing so I'd like to ignore spaces within parentheses. For example, I'd like to be able to split the string

a, b, c, search:(1, 2, 3), d

into

[[a] [b] [c] [search:(1, 2, 3)] [d]]

Does anyone know how to do this using regular expressions in Java?

Thanks!


This isn't a full regex, but it'll get you there:

(\([^)]*\)|\S)*

This uses a common trick: treating one long string of characters as if it were a single character. On the right side of the | we match a single non-whitespace character with \S. On the left side we match an opening parenthesis, anything up to the next closing parenthesis, and that closing parenthesis.

The end result is that a balanced set of parentheses is treated as if it were a single character, and so the regex as a whole matches a single word, where a word can contain these parenthesized groups.
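As a rough illustration of the idea, here is a minimal sketch that collects each "word" with Matcher.find(). It is a variation on the pattern above rather than the exact expression: it assumes + instead of * (so empty matches are skipped) and [^\s,] instead of \S (so the separating commas stay out of the tokens). The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenTokens {
    // Assumed variation of (\([^)]*\)|\S)*:
    //   - a parenthesized group counts as one "character" of the word
    //   - otherwise any character that is neither whitespace nor a comma
    private static final Pattern TOKEN =
            Pattern.compile("(?:\\([^)]*\\)|[^\\s,])+");

    // Returns every word found in the input, in order.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("a, b, c, search:(1, 2, 3), d")) {
            System.out.println(token);
        }
    }
}
```

For the string in the question this prints a, b, c, search:(1, 2, 3) and d, one per line.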

(Note that this pattern can't handle nested parentheses: [^)]* stops at the first closing parenthesis it sees. One level of parentheses is the limit.)
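If nested parentheses do matter, one option is to drop the regex and count depth by hand. The following is only a sketch of that alternative, not part of the regex approach: it assumes commas are the separators, trims whitespace around each piece, and ignores a stray closing parenthesis. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class NestedSplit {
    // Splits on commas that sit outside all parentheses, at any nesting depth.
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // number of currently open parentheses
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            }
            if (c == ',' && depth == 0) {
                // Top-level comma: close off the current piece.
                parts.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString().trim());
        return parts;
    }

    public static void main(String[] args) {
        for (String part : split("a, f(1, (2, 3)), d")) {
            System.out.println(part);
        }
    }
}
```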


This problem has another solution that wasn't mentioned, so I'll post it here for completeness. The situation is similar to the question "regex-match a pattern, excluding..." (see the Reference at the end).

We can solve this with a beautifully-simple regex:

\([^)]*\)|(\s*,\s*)

The left side of the alternation | matches complete parenthesized groups. We will ignore these matches. The right side matches commas and their surrounding spaces and captures them to Group 1, and we know they are the right commas because they were not matched by the expression on the left. We will replace these commas with something distinctive, then split.

This program shows how to use the regex (see the results at the bottom of the online demo):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Program {
    public static void main(String[] args) {
        String subject = "a, b, c, search:(1, 2, 3), d";
        Pattern regex = Pattern.compile("\\([^)]*\\)|(\\s*,\\s*)");
        Matcher m = regex.matcher(subject);
        StringBuffer b = new StringBuffer();
        while (m.find()) {
            if (m.group(1) != null) {
                // Group 1 is set: a comma outside parentheses, so mark a split point.
                m.appendReplacement(b, "SplitHere");
            } else {
                // A parenthesized group: copy it back unchanged. quoteReplacement keeps
                // any '$' or '\' in the match from being read as replacement syntax.
                m.appendReplacement(b, Matcher.quoteReplacement(m.group(0)));
            }
        }
        m.appendTail(b);
        String replaced = b.toString();
        String[] splits = replaced.split("SplitHere");
        for (String split : splits) System.out.println(split);
    } // end main
} // end Program
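One caveat with the placeholder approach is that it would misfire if the input itself contained the text SplitHere. A possible variation, sketched here under that concern, uses the same regex but slices the subject directly between Group 1 matches instead of replacing and then splitting. The class and method names are invented for this example. Unlike String.split, this version keeps trailing empty pieces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DirectSplit {
    // Same regex as above: parenthesized groups on the left (ignored),
    // separator commas captured in Group 1 on the right.
    private static final Pattern REGEX =
            Pattern.compile("\\([^)]*\\)|(\\s*,\\s*)");

    static List<String> split(String subject) {
        List<String> pieces = new ArrayList<>();
        Matcher m = REGEX.matcher(subject);
        int start = 0; // index where the current piece begins
        while (m.find()) {
            if (m.group(1) != null) {
                // A real separator: emit the text before it and move past it.
                pieces.add(subject.substring(start, m.start()));
                start = m.end();
            }
            // Otherwise the match was a parenthesized group; leave it in place.
        }
        pieces.add(subject.substring(start));
        return pieces;
    }

    public static void main(String[] args) {
        for (String piece : split("a, b, c, search:(1, 2, 3), d")) {
            System.out.println(piece);
        }
    }
}
```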

Reference

How to match (or replace) a pattern except in situations s1, s2, s3...
