regular expression to remove substrings delimited by matching double braces
I have a string like this:
adfsdf dsf {{sadfsdfadf {{Infobox}} musical}} jljlk }}
I want to eliminate all {{..}}
substrings. I tried
\{\{.*\}\}
which eliminates {{sadfsdfadf {{Infobox}} musical}} jljlk }}
but I want to eliminate only {{sadfsdfadf {{Infobox}} musical}}
, stopping at the matching }}
nearest the start of the substring.
How can I do this?
Use a lazy quantifier:
\{\{.*?\}\}
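A quick check in Ruby of what the lazy version does with the sample string from the question. The variable names are only illustrative. Because the lazy quantifier stops at the first }} it finds, the match ends inside the nested pair, so this is enough for flat {{..}} pairs but not for nested ones:

```ruby
sample = 'adfsdf dsf {{sadfsdfadf {{Infobox}} musical}} jljlk }}'
lazy = /\{\{.*?\}\}/

# The lazy match ends at the first }} after the opening {{.
first_match = sample[lazy]
puts first_match   # {{sadfsdfadf {{Infobox}}

# Removing every lazy match leaves the tail of the outer pair behind.
lazy_result = sample.gsub(lazy, '')
puts lazy_result   # adfsdf dsf  musical}} jljlk }}
```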
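Here's a fairly non-robust expression \{\{[a-zA-Z\s]*\}\}
that will work for innermost pairs containing only letters and whitespace; applying it repeatedly strips nested pairs from the inside out.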
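Since that expression cannot match across an inner {{ or }}, it only ever hits the innermost pair. A rough sketch of one way to use it on nested input is to keep applying it until nothing changes. The constant and method names here are made up for illustration, and the character class still only allows letters and whitespace between the braces:

```ruby
# Matches a {{..}} pair whose contents are only letters and whitespace,
# so it can never span a nested pair.
INNERMOST = /\{\{[a-zA-Z\s]*\}\}/

# Hypothetical helper: strip innermost pairs until a pass changes nothing.
def strip_innermost_repeatedly(text)
  loop do
    stripped = text.gsub(INNERMOST, '')
    return stripped if stripped == text
    text = stripped
  end
end

puts strip_innermost_repeatedly('adfsdf dsf {{sadfsdfadf {{Infobox}} musical}} jljlk }}')
# adfsdf dsf  jljlk }}
```

The unmatched }} at the end is left alone because it has no opening {{.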
In the general case, this isn't possible with regular expressions in the strict, formal sense. You cannot match balanced parentheses, or anything like that, with a true regular expression; you need a context-free grammar instead.
That said, Perl has some facilities for recursive regular expressions; these would allow you to do what you want. I do not know if Ruby is capable of doing the same thing.
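If recursive patterns are not available, a small hand-written scanner that tracks nesting depth is one alternative. This is only a sketch under a few assumptions: the method name is invented, a }} with no opening {{ is kept as ordinary text, and an unclosed {{ swallows everything after it.

```ruby
# Hypothetical helper: drop everything inside balanced {{ .. }} pairs.
def strip_double_braces(text)
  out = ''.dup   # mutable output buffer
  depth = 0      # how many {{ are currently open
  i = 0
  while i < text.length
    pair = text[i, 2]
    if pair == '{{'
      depth += 1
      i += 2
    elsif pair == '}}' && depth > 0
      depth -= 1
      i += 2
    else
      # Keep the character only when we are outside every pair.
      out << text[i] if depth.zero?
      i += 1
    end
  end
  out
end

puts strip_double_braces('adfsdf dsf {{sadfsdfadf {{Infobox}} musical}} jljlk }}')
# adfsdf dsf  jljlk }}
```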
Here is a quick example using a recent 1.9.x Ruby version. If you run a 1.8.x release you'll need the oniguruma gem. This doesn't take into account escaped \{\{
but does handle single {
and }
which I assume you will want to ignore.
#!/usr/bin/env ruby
# On old 1.8.x versions of Ruby you'll need the gem.
# require 'oniguruma'
require 'pp'
squiggly = %r/
(
(?<squiggly> # squiggly named group
\{\{ # start {{
(?: # non-capturing group
[^{}] # anything not { or }
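| \{(?!\{) # any { not followed by { (lookahead, so the next character is not consumed)
| \}(?!\}) # any } not followed by } (lookahead, so the next character is not consumed)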
| \g<squiggly> # nested squiggly
)* # zero or more times
\}\} # end }}
) # end of squiggly
)/x
string = 'adfsdf dsf {{sadfsdfadf {{Infobox}} musical}} jljlk }}'
pp squiggly.match(string)[:squiggly] #=> {{sadfsdfadf {{Infobox}} musical}}
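To actually remove the substrings, as the question asks, the same kind of pattern can be passed to gsub. The following is a self-contained variant, not the exact code above: the names BALANCED and braces are illustrative, and it uses negative lookaheads for the lone-brace cases so that a single } sitting directly before a nested {{ does not swallow one of its braces. It assumes Ruby 1.9 or later.

```ruby
# Recursive pattern: a {{ .. }} pair that may contain further pairs.
BALANCED = /
  (?<braces>
    \{\{
    (?:
      [^{}]          # any character that is not a brace
      | \{(?!\{)     # a lone opening brace
      | \}(?!\})     # a lone closing brace
      | \g<braces>   # a nested pair, matched recursively
    )*
    \}\}
  )
/x

string = 'adfsdf dsf {{sadfsdfadf {{Infobox}} musical}} jljlk }}'
cleaned = string.gsub(BALANCED, '')
puts cleaned   # adfsdf dsf  jljlk }}
```

The trailing }} stays in place since it has no matching {{.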