What are the zero width elements in a regular expression?
Recently, I have been seeing "zero width elements" in regular expressions. What are they? Can they be treated as ghost data, so that for replacement, they won't be replaced, and for ( ) matching, they won't go into matches[1], matches[2], etc.?
Is there a good tutorial for all their various uses? Have they been here for a long time? Which version of O'Reilly's Regular Expression book was the first to discuss them?
The point of zero-width lookaround assertions is that they check whether a certain regex can or cannot be matched looking forward or backward from the current position, without adding the text they examine to the match. So, yes, they won't count towards the capturing groups, and yes, the text they examine won't be replaced (because it is not part of the match in the first place).
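To make the "checked but not consumed" behaviour concrete, here is a minimal sketch using Python's `re` module (the sample string and variable names are made up for illustration; other regex flavors with lookaround support should behave the same way for this pattern):

```python
import re

text = "price: 100 USD, 200 EUR"  # hypothetical sample input

# (?= USD) is a lookahead: " USD" must follow the digits,
# but it is only checked, not consumed.
m = re.search(r"\d+(?= USD)", text)
matched = m.group(0)   # the match contains the digits only
span = m.span()        # the span ends before " USD"

# Only the matched digits are replaced; " USD" is left untouched.
replaced = re.sub(r"\d+(?= USD)", "N", text)

# (?<=: ) is a lookbehind: ": " must precede the digits,
# and again it is not part of the match.
after_colon = re.findall(r"(?<=: )\d+", text)

print(matched, span, replaced, after_colon)
```

Here `matched` is just `100`, and the replacement leaves ` USD` in place, which is the "ghost data" behaviour the question asks about.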
However, a capturing group placed inside a lookaround assertion still captures, so its contents will go into matches[1], etc.
For example, in C#:
Regex.Replace("ab", "(a)(?=(b))", "$1$2");
will return abb.
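For readers who want to try this outside .NET, the same idea can be sketched with Python's `re` module (this is an equivalent written for illustration, not part of the original C# example; note that Python writes backreferences in the replacement as `\1` rather than `$1`):

```python
import re

pattern = r"(a)(?=(b))"

m = re.search(pattern, "ab")
whole = m.group(0)    # the overall match is only "a"
groups = m.groups()   # but group 2 still captured the "b" seen by the lookahead

# The matched "a" is replaced by group 1 + group 2 ("ab");
# the original "b" was never consumed, so it stays in place.
result = re.sub(pattern, r"\1\2", "ab")

print(whole, groups, result)
```

The overall match is a single character, yet the replacement can still use the text captured inside the lookahead, which is why the result has two `b`s.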
A very good online tutorial about regular expressions in general can be found at http://www.regular-expressions.info (even though it's a little out of date in some areas).
It contains a specific section about zero-width lookaround assertions (and Part II).
And of course they are covered in-depth in both Mastering Regular Expressions and the Regular Expressions Cookbook.