Contextual Regular Expression

I have a list of comma-separated words, and I want to replace each comma with a space:

elements-(a,b,c,d)

becomes:

elements-(a b c d)

The question is how to do this with a regular expression if and only if the list appears in a specific context, e.g. only when it is inside elements-(...).

The following:

There are a number of elements-(a,b,c,d) and a number of other elements-(e,f,g,h)

should become:

There are a number of elements-(a b c d) and a number of other elements-(e f g h)

What would be the correct way to do this with regex?


For contextual regular expressions, you can use zero-width look-around assertions. Look-around assertions are used to assert that something must be true in order for the match to succeed, but they do not consume any characters (hence "zero-width").
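To make "zero-width" concrete, here is a minimal sketch. The `id=42;` input and the `ZeroWidthDemo` class are made up for illustration and are not part of the question:

```csharp
using System;
using System.Text.RegularExpressions;

static class ZeroWidthDemo
{
    // Hypothetical helper: returns the digits that sit between "id=" and ";".
    public static string Extract(string input)
    {
        // The look-behind requires "id=" before the digits and the look-ahead
        // requires ";" after them, but neither becomes part of the match.
        Match m = Regex.Match(input, @"(?<=id=)\d+(?=;)");
        return m.Value;
    }

    public static void Main()
    {
        Console.WriteLine(Extract("id=42;")); // prints "42"
    }
}
```

The match contains only the digits, even though the surrounding text had to be present for it to succeed.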

In your case, you want to use positive look-behind and look-ahead assertions. The .NET regex engine allows variable-length look-behind, so in C# you can do the following:

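    using System.Text.RegularExpressions;

    static string Replace(string text)
    {
        // $2 is the word before the comma; groups 1 and 3 sit inside the look-arounds.
        return Regex.Replace(
            text,
            @"(?<=elements\-\((\w+,)*)(\w+),(?=(\w+,)*\w+\))",
            "$2 "
        );
    }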

There are three basic parts to the pattern here (in order):

  1. (?<=elements\-\((\w+,)*) - this is the positive look-behind assertion. It says that the pattern will only match if it is preceded by the text elements-( and zero or more comma-terminated words.
  2. (\w+), - this is the actual match: a word and the comma after it. The word is captured as group 2 and put back by $2 in the replacement, so in effect only the comma is replaced with a space.
  3. (?=(\w+,)*\w+\)) - this is the positive look-ahead assertion. It says that the pattern will only match if it is followed by one or more comma-separated words and a closing parenthesis.
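The three parts above can be observed by listing what each match actually consumes. This is only a sketch: the `LookAroundDemo` class, the `ConsumedText` helper and the sample input are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LookAroundDemo
{
    // Same pattern as in the answer above.
    const string Pattern = @"(?<=elements\-\((\w+,)*)(\w+),(?=(\w+,)*\w+\))";

    // Returns the text consumed by each match (part 2 of the pattern).
    public static List<string> ConsumedText(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            result.Add(m.Value);
        }
        return result;
    }

    public static void Main()
    {
        // Only the list after "elements-" qualifies; "(x,y)" has no such prefix.
        var consumed = ConsumedText("elements-(a,b,c,d) and (x,y)");
        Console.WriteLine(string.Join(" | ", consumed)); // prints "a, | b, | c,"
    }
}
```

Each match is a single word plus its comma. The last word, d, is never matched because no comma follows it, and the prefix and closing parenthesis are checked but never consumed.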

Alternatively, in C# you can match the whole parenthesized list at once and replace the commas inside it with a match evaluator:

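    static string Replace(string text)
    {
        return Regex.Replace(
            text,
            // Group 1 captures everything between the parentheses.
            @"(?<=elements\-)\(((\w+,)+\w+)\)",
            // Rebuild the list with spaces in place of commas.
            m => string.Format("({0})", m.Groups[1].Value.Replace(',', ' '))
        );
    }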

The basic approach is still the same: a positive look-behind assertion restricts the match to lists prefixed by elements-.

Example output:

"(x,y,z) elements-(a,b) (m,m,m) elements-(c,d,e,f,g,h)"

...becomes...

"(x,y,z) elements-(a b) (m,m,m) elements-(c d e f g h)"
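As a quick check, the two approaches can be run side by side on the example input. This is a sketch only: the class and method names (`ElementsListDemo`, `ReplacePerComma`, `ReplaceWholeList`) are invented here, while the patterns are the ones from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class ElementsListDemo
{
    // First approach: replace each "word," inside elements-(...) with "word ".
    public static string ReplacePerComma(string text) =>
        Regex.Replace(text, @"(?<=elements\-\((\w+,)*)(\w+),(?=(\w+,)*\w+\))", "$2 ");

    // Second approach: match the whole list, then swap the commas in a callback.
    public static string ReplaceWholeList(string text) =>
        Regex.Replace(text, @"(?<=elements\-)\(((\w+,)+\w+)\)",
            m => "(" + m.Groups[1].Value.Replace(',', ' ') + ")");

    public static void Main()
    {
        string input = "(x,y,z) elements-(a,b) (m,m,m) elements-(c,d,e,f,g,h)";

        // Both lines print: (x,y,z) elements-(a b) (m,m,m) elements-(c d e f g h)
        Console.WriteLine(ReplacePerComma(input));
        Console.WriteLine(ReplaceWholeList(input));
    }
}
```

Both patterns only accept \w characters between the commas, so a list written with spaces after the commas, such as elements-(a, b), is left unchanged by either one.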
