Regex for capturing a character between a group but ignoring the ones in a nested group
This is for building something similar to MediaWiki "templates" in PHP, so that parameters work correctly when templates are nested.
Is it possible with a regex to capture all occurrences of a character between braces, while ignoring any occurrence that sits inside a nested group of braces?
| {{ | {{ | }} | | }} |
Highlighted:
| {{ *|* {{ | }} *|* *|* }} |
No, not with a plain regular expression: nested braces call for a context-free grammar (or recursive regexps, as in Perl) to parse them. What are the ignored nested templates replaced by?
The parser will look like this in pseudocode:
input = "| {{ | {{ | }} | | }} |", pointer = 0;
char = '', results = [];
read_next_char() {
    // return the current character (empty at end of input), then advance
    c = pointer < input.length ? input[pointer] : '';
    pointer ++;
    return c;
}
go_back_one_char() {
    pointer --;
}
while (char = read_next_char()) {
    if (char == '{') {
        if (read_next_char() == '{') InsideBraces();
        else go_back_one_char();
    }
}
InsideBraces(skipping=false) {
    result = "";
    while (char = read_next_char()) {
        if (char == '{') {
            // "{{" opens a nested group: consume it without recording it
            if (read_next_char() == '{') { InsideBraces(true); continue; }
            go_back_one_char();
        } else if (char == '}') {
            // "}}" closes the current group
            if (read_next_char() == '}') break;
            go_back_one_char();
        }
        // ordinary character, including a lone '{' or '}'
        result += char;
    }
    if (!skipping) results.push(result);
}
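For comparison, the same idea can be sketched as runnable Python that tracks brace depth with a counter instead of recursing. The function name `pipes_at_depth_one` is made up for this illustration, and the sketch assumes `{{` and `}}` are the only delimiters; it returns the index of every `|` that sits directly inside an outermost `{{ ... }}`.

```python
def pipes_at_depth_one(text):
    """Return indexes of '|' directly inside an outermost {{ ... }} pair."""
    depth = 0        # how many {{ ... }} groups we are currently inside
    positions = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
        else:
            # Record the pipe only when it is not inside a nested group.
            if text[i] == "|" and depth == 1:
                positions.append(i)
            i += 1
    return positions


sample = "| {{ | {{ | }} | | }} |"
print(pipes_at_depth_one(sample))  # [5, 15, 17]
```

Those three indexes are the pipes marked in the highlighted example from the question; the pipes outside the braces and the one in the nested group are left out.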
m/.*\{\{([^{]+)\}\}/
That would capture a group between {{ and }} as long as no '{' is present inside it, so it only ever matches an innermost template. The syntax is Perl, with the literal braces escaped.
This is better done with a parser, though.
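To see what the innermost-only approach gives on the question's sample, here is a small Python sketch. It uses a slightly stricter variant of the pattern above (the body excludes `}` as well as `{`, and may be empty), and the helper name `innermost_groups` is an assumption made for this example.

```python
import re

# Matches {{ ... }} only when the body contains no braces at all,
# which means only an innermost template can match.
INNERMOST = re.compile(r"\{\{([^{}]*)\}\}")


def innermost_groups(text):
    """Return the bodies of all innermost {{ ... }} groups in text."""
    return INNERMOST.findall(text)


sample = "| {{ | {{ | }} | | }} |"
print(innermost_groups(sample))  # [' | ']
```

Only the nested group is found. The outer group, which is the one the question is about, never matches because its body contains braces.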