regular expression to match nested braces
I need regular expression to match braces correct e.g for every open one close one
abc{abc{b开发者_运维百科c}xyz}
I need it get all it from {abc{bc}xyz}
not get {abc{bc}
.
I tried using (\{.*?})
This is not possible with regular expressions. A context-free grammar would be necessary for this and regular expressions only work for finite regular languages.
According to this link there is an extension available for the regular expressions in .NET that can do this, but this just means that .NET regular expressions are more than just regular expressions.
This is not a task for a regular expression. What you're looking for is parser at that point. Which means a language grammar, LL(1), LALR, recursive-descent, the dragon book, and generally a splitting migraine.
Balanced parenthesis of arbitrary nested depth is not a regular language. It's a context-free language.
That said, many "regular expression" implementations actually recognize more than regular languages, so this is possible with some implementation but not others.
Wikipedia
- Regular language
- Pumping lemma for regular languages
- Context-free language
- Regular expression
Many features found in modern regular expression libraries provide an expressive power that far exceeds the regular languages.
As Bryan said, regular expressions might not be the right tool here, but if you're using PHP, the manual gives an example of how you might be able to use regular expressions in a recursive/nested fashion:
$input = "plain [indent] deep [indent] deeper [/indent] deep [/indent] plain";
function parseTagsRecursive($input)
{
$regex = '#\[indent]((?:[^[]|\[(?!/?indent])|(?R))+)\[/indent]#';
if (is_array($input)) {
$input = '<div style="margin-left: 10px">'.$input[1].'</div>';
}
return preg_replace_callback($regex, 'parseTagsRecursive', $input);
}
$output = parseTagsRecursive($input);
echo $output;
I'm not sure if that'll be helpful to you or not.
This is not possible in the "standard" regular expression language. However, a few different implementations have extensions that allow you to implement it. For example, here's a blog post that explains how to do it with .NET's regex library.
Generally speaking though, this is a task that regular expressions are not really suited to.
Assuming what you want to do is select a maximal substring between {
and }
:
.*?
is a lazy quantifier. That is, it will match the least number of characters possible. If you change your expression to {.*}
, you should find it will work.
If what you want to do is to verify that the braces are matched correctly, then as the other answers have stated, this is not possible with a (single) regular expression. You can do it by scanning the string with a stack though. Or with some voodoo of iterating your regular expression over the previous maximal match. Yikes.
精彩评论