Recursive regex in Perl

I've got strings like these:

| Released    = {{start-date|June 14, 1972}}
| Released    = {{Start date|1973|03|01|df=y}} 

I'd like to replace all | within {{ }} with ^

| Released    = {{start-date^June 14, 1972}}
| Released    = {{Start date^1973^03^01^df=y}} 

I can't use a plain substring replacement because there are | symbols outside {{ }} that must be left intact. And because I don't know how many parts the string inside {{ }} has, I can't use something like s/{{(.+?)\|(.+?)}}/{{$1^$2}}/.

I suppose I need to use some kind of recursion here?


A simple solution:

s/\|(?=[^{}\n]*}})/^/g
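
For reference, here is that lookahead substitution as a small runnable sketch, applied to the two sample lines from the question. The sub name pipes_to_carets is just a placeholder I picked, and the closing braces are escaped only for readability. It assumes templates are not nested and contain no stray { or } between the | and the closing }}.

```perl
use strict;
use warnings;

# Replace a | only if }} follows on the same line with no { or } in between.
# Assumption: templates are not nested and hold no lone braces.
sub pipes_to_carets {
    my ($text) = @_;
    $text =~ s/\|(?=[^{}\n]*\}\})/^/g;
    return $text;
}

my @lines = (
    '| Released    = {{start-date|June 14, 1972}}',
    '| Released    = {{Start date|1973|03|01|df=y}}',
);
print pipes_to_carets($_), "\n" for @lines;
# prints:
# | Released    = {{start-date^June 14, 1972}}
# | Released    = {{Start date^1973^03^01^df=y}}
```

The leading | survives because the lookahead runs into {{ before it can reach a }}. A | that sits before a lone brace inside a template, as in {{a|{b}|c}}, is skipped for the same reason.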

An even simpler solution, though it breaks in many cases because it replaces every | that is not at the start of a line:

s/(?!^)\|/^/gm
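
To see where this one goes wrong, here is a quick sketch that runs it on a line from the question's format and on a made-up line with a | outside the braces. The sub name and the second input line are my own illustration, not from the question.

```perl
use strict;
use warnings;

# Placeholder name; wraps the "any | not at the start of a line" substitution.
sub naive_pipes_to_carets {
    my ($text) = @_;
    $text =~ s/(?!^)\|/^/gm;
    return $text;
}

print naive_pipes_to_carets("| Released = {{start-date|June 14, 1972}}\n");
# prints: | Released = {{start-date^June 14, 1972}}

# Hypothetical line with a | outside the template: it gets replaced too.
print naive_pipes_to_carets("| Label = A | B {{x|y}}\n");
# prints: | Label = A ^ B {{x^y}}
```

So it only works when the one | outside {{ }} is the first character of the line.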

Here is a bit more robust regex:

s/(?:\G(?!^)(?:(?>[^|]*?}})(?>.*?{{))*|^(?>.*?{{))(?>[^|]*?(?=}}|\|))\K\|(?=.*?}})/^/gs;

Commented:

s/
(?:
  \G(?!^)                       # inside of a {{}} tag
  (?: (?>[^|]*?}}) (?>.*?{{) )* # read till we find a | in another tag if none in current
  |
  ^(?>.*?{{)                    # outside of tag, parse till in
)
(?> [^|]*? (?=}}|\|) )          # eat till a | or end of tag
\K                              # don't include stuff to the left of \K in the match
\|                              # the |
(?=.*?}})                       # just to make sure the tag is closed
/^/gsx;

Input:

|}}
| Re|eased    = {{start-date|June 14^, {|1972}|x}}
| Released    = {{Start date}|1973|03|01}|df=y|}}
| || {{|}} {{ |

Output:

|}}
| Re|eased    = {{start-date^June 14^, {^1972}^x}}
| Released    = {{Start date}^1973^03^01}^df=y^}}
| || {{^}} {{ |

Example: http://ideone.com/fbY2W


This may not be the most concise way to do it, but it's the first working method I came up with.

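my $new = '';
# The capturing group makes split return the {{...}} pieces as well.
for ( split /(\{\{.*?\}\})/ ) {
    s/\|/^/g if /^\{\{/;    # only touch pieces that begin with {{
    $new .= $_;
}
$_ = $new;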


s{({{.*?}})}
 {my $x = $1;
  $x =~ tr/|/^/;
  $x
 }ge;
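
As for whether recursion is needed: for the samples in the question it isn't, but if templates can nest, a non-greedy {{.*?}} stops at the first }} and leaves the tail of the outer template untouched. Below is a sketch of a recursive variant of the callback approach. It assumes Perl 5.10 or later (named groups, (?&name) recursion, possessive quantifiers). The group name tpl, the variable $TEMPLATE, and the sub name are placeholders of mine.

```perl
use strict;
use warnings;

# One balanced template. (?&tpl) recurses into the named group,
# so inner {{...}} pairs are consumed together with the outer one.
my $TEMPLATE = qr/
    (?<tpl>
        \{\{
        (?:
            [^{}]++        # plain text
          | \{ (?!\{)      # a lone {
          | \} (?!\})      # a lone }
          | (?&tpl)        # a nested template
        )*+
        \}\}
    )
/x;

# Placeholder helper: turn | into ^ inside every balanced template.
sub pipes_to_carets_nested {
    my ($text) = @_;
    # $1 is the whole template; copy it, transliterate, return the copy.
    $text =~ s/($TEMPLATE)/ (my $t = $1) =~ tr{|}{^}; $t /ge;
    return $text;
}

print pipes_to_carets_nested('| a = {{x|{{y|z}}|w}} | b'), "\n";
# prints: | a = {{x^{{y^z}}^w}} | b
```

An unclosed {{ is simply left alone, since the pattern only matches when a balancing }} exists.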