Recursive regex in Perl
I've got strings like these:
| Released = {{start-date|June 14, 1972}}
| Released = {{Start date|1973|03|01|df=y}}
I'd like to replace all | within {{ }} with ^:
| Released = {{start-date^June 14, 1972}}
| Released = {{Start date^1973^03^01^df=y}}
I can't use a plain substring replacement because there are | symbols outside {{ }}, which must be left intact. And because I don't know exactly how many parts the string in {{ }} has, I can't use something like s/{{(.+?)\|(.+?)}}/{{$1^$2}}/.
I suppose I need to use some kind of recursion here?
A simple solution:
s/\|(?=[^{}\n]*}})/^/g
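To see what the lookahead does, here is a minimal sketch that applies it to the two sample lines from the question. The sub name pipes_to_carets_lookahead is made up for illustration, and the braces are escaped as \{ and \} only as a precaution, since newer Perl versions can be strict about unescaped literal braces.

```perl
use strict;
use warnings;

# Hypothetical helper name; wraps the lookahead substitution shown above.
sub pipes_to_carets_lookahead {
    my ($text) = @_;
    # Replace a | only if a closing }} follows on the same line
    # with no { or } in between.
    $text =~ s/\|(?=[^{}\n]*\}\})/^/g;
    return $text;
}

# Sample lines taken from the question.
print pipes_to_carets_lookahead('| Released = {{start-date|June 14, 1972}}'), "\n";
print pipes_to_carets_lookahead('| Released = {{Start date|1973|03|01|df=y}}'), "\n";
```

The leading | stays untouched because a { sits between it and the closing }}. Note that this only checks for a closing }}, not for an opening {{, so a stray line such as |}} would be rewritten to ^}}.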
Even simpler solution, but probably broken in many cases:
s/(?!^)\|/^/gm
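A quick sketch of one case where this breaks: the pattern never looks at the braces at all, it only skips a | at the very start of a line. The sub name and the input line are invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the "even simpler" substitution.
sub pipes_to_carets_not_at_line_start {
    my ($text) = @_;
    # (?!^) only rejects a | that sits at the start of a line (because of /m).
    $text =~ s/(?!^)\|/^/gm;
    return $text;
}

# Made-up input with an extra | after the closing braces.
print pipes_to_carets_not_at_line_start('| Released = {{a|b}} | c'), "\n";
# prints: | Released = {{a^b}} ^ c
```

The | after }} is replaced as well, so this is only safe when the single | outside the braces is the one at the start of the line.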
Here is a bit more robust regex:
s/(?:\G(?!^)(?:(?>[^|]*?}})(?>.*?{{))*|^(?>.*?{{))(?>[^|]*?(?=}}|\|))\K\|(?=.*?}})/^/gs;
Commented:
s/
  (?:
    \G(?!^)                          # inside of a {{}} tag
    (?: (?>[^|]*?}}) (?>.*?{{) )*    # read till we find a | in another tag if none in current
  |
    ^(?>.*?{{)                       # outside of tag, parse till in
  )
  (?> [^|]*? (?=}}|\|) )             # eat till a | or end of tag
  \K                                 # don't include stuff to the left of \K in the match
  \|                                 # the |
  (?=.*?}})                          # just to make sure the tag is closed
/^/gsx;
Input:
|}}
| Re|eased = {{start-date|June 14^, {|1972}|x}}
| Released = {{Start date}|1973|03|01}|df=y|}}
| || {{|}} {{ |
Output:
|}}
| Re|eased = {{start-date^June 14^, {^1972}^x}}
| Released = {{Start date}^1973^03^01}^df=y^}}
| || {{^}} {{ |
Example: http://ideone.com/fbY2W
This may not be the most concise way to do it, but it's the first working method I came up with.
my $new;
for ( split /({{.*?}})/ ) {
    s/\|/^/g if /^{{/;
    $new .= $_;
}
$_ = $new;
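The loop above works on $_. As a rough sketch, the same idea can be wrapped in a sub that takes and returns a string; the name replace_pipes_in_tags is hypothetical, and the braces are escaped here only as a precaution for newer Perl versions.

```perl
use strict;
use warnings;

# Hypothetical helper name; same split-based idea without relying on $_.
sub replace_pipes_in_tags {
    my ($text) = @_;
    my $new = '';
    # The capturing group makes split return the {{...}} tags
    # along with the text between them.
    for my $piece ( split /(\{\{.*?\}\})/, $text ) {
        # Only pieces that start with {{ are tags; everything else is left alone.
        $piece =~ s/\|/^/g if $piece =~ /^\{\{/;
        $new .= $piece;
    }
    return $new;
}

print replace_pipes_in_tags('| Released = {{Start date|1973|03|01|df=y}}'), "\n";
```

Since . does not match a newline by default, a tag that spans several lines would not be picked up; adding the /s modifier to the split pattern is one way to allow that, if it matters for your data.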
s{({{.*?}})}
 {my $x = $1;
  $x =~ tr/|/^/;
  $x
 }ge;
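For reference, a small self-contained sketch of the same /e substitution, written as a sub so it can be tried on individual lines. The sub name replace_pipes_with_eval is made up, and / is used as the delimiter with escaped braces purely as a precaution.

```perl
use strict;
use warnings;

# Hypothetical helper name; the /e flag runs the replacement as Perl code.
sub replace_pipes_with_eval {
    my ($text) = @_;
    $text =~ s/(\{\{.*?\}\})/
        my $tag = $1;        # copy the matched tag, since $1 is read-only
        $tag =~ tr{|}{^};    # swap every | inside the tag
        $tag;                # the last value becomes the replacement
    /ge;
    return $text;
}

print replace_pipes_with_eval('| Released = {{start-date|June 14, 1972}}'), "\n";
print replace_pipes_with_eval('| || {{|}} {{ |'), "\n";
```

An unclosed {{ at the end of the line is left alone, because the pattern needs a closing }} to match.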