Regular Expression - Help needed
I have a String template from which I need to extract the list of #elseif blocks. For example, the first #elseif block would be
#elseif ( $variable2 )Some sample text after 1st ElseIf.
the second #elseif block would be
#elseif($variable2)This text can be repeated many times until do while is called. SECOND ELSEIF
and so on. I'm using the following regex for this:
String regexElseIf="\\#elseif\\s*\\((.*?)\\)(.*?)(?:#elseif|#else|#endif)";
But it returns just one match, i.e. the first #elseif block and not the second. I need to get the second #elseif block as well. Could you please help me with that? The string template is below.
String template =
"This is a sample document."
+ "#if ( $variable1 )"
+ "FIRST This text can be repeated many times until do while is called."
+ "#elseif ( $variable2 )"
+ "Some sample text after 1st ElseIf."
+ "#elseif($variable2)"
+ "This text can be repeated many times until do while is called. SECOND ELSEIF"
+ "#else "
+ "sample else condition "
+ "#endif "
+ "Some sample text."
+ "This is the second sample document."
+ "#if ( $variable1 )"
+ "SECOND FIRST This text can be repeated many times until do while is called."
+ "#elseif ( $variable2 )"
+ "SECOND Some sample text after 1st ElseIf."
+ "#elseif($variable2)"
+ "SECOND This text can be repeated many times until do while is called. SECOND ELSEIF"
+ "#else " + "SECOND sample else condition " + "#endif "
+ "SECOND Some sample text.";
This code
Pattern regexp = Pattern.compile("#elseif\\b(.*?)(?=#(elseif|else|endif))");
Matcher matcher = regexp.matcher(template);
while (matcher.find())
    System.out.println(matcher.group());
will produce
#elseif ( $variable2 )Some sample text after 1st ElseIf.
#elseif($variable2)This text can be repeated many times until do while is called. SECOND ELSEIF
#elseif ( $variable2 )SECOND Some sample text after 1st ElseIf.
#elseif($variable2)SECOND This text can be repeated many times until do while is called. SECOND ELSEIF
The secret lies in the positive lookahead (?=#(elseif|else|endif)): the terminating #elseif, #else or #endif is matched, but its characters are not consumed. This way it can be found again by the next iteration.
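To make the difference visible, here is a minimal sketch that runs a consuming terminator and a lookahead terminator against the same input. The class name LookaheadDemo, the helper findAll and the short sample string are made up for illustration; only the two patterns follow the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {
    // Terminator is part of the match, so it is consumed (the question's approach).
    static final Pattern CONSUMING =
            Pattern.compile("#elseif\\b(.*?)(?:#elseif|#else|#endif)");
    // Terminator is only asserted, so the next find() can start on it.
    static final Pattern LOOKAHEAD =
            Pattern.compile("#elseif\\b(.*?)(?=#(?:elseif|else|endif))");

    // Hypothetical helper: collects every full match of the pattern.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the template in the question.
        String sample = "#if(a)A#elseif(b)B#elseif(c)C#else D#endif";
        System.out.println(findAll(CONSUMING, sample)); // [#elseif(b)B#elseif]
        System.out.println(findAll(LOOKAHEAD, sample)); // [#elseif(b)B, #elseif(c)C]
    }
}
```

With the consuming version the second #elseif is swallowed as the end of the first match, so the matcher never gets a chance to start on it.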
#elseif\b(?:(?!#else\b|#endif\b).)*
will match everything from the first #elseif in a block up to (but not including) the nearest #else or #endif.
Pattern regex = Pattern.compile("#elseif\\b(?:(?!#else\\b|#endif\\b).)*", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    // matched text: regexMatcher.group()
    // match start: regexMatcher.start()
    // match end: regexMatcher.end()
}
If you then need to extract the individual #elseif blocks from that match, use
#elseif\b(?:(?!#elseif\b).)*
on the results from the first regex match above. In Java:
Pattern regex = Pattern.compile("#elseif\\b(?:(?!#elseif\\b).)*", Pattern.DOTALL);
etc.
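Put together, the two steps could look roughly like the following sketch. The class name TwoStepElseIf, the method elseIfBlocks and the sample string are assumptions for illustration; the two patterns are the ones given above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TwoStepElseIf {
    // Step 1: a whole run of #elseif blocks, up to the nearest #else or #endif.
    static final Pattern CHAIN =
            Pattern.compile("#elseif\\b(?:(?!#else\\b|#endif\\b).)*", Pattern.DOTALL);
    // Step 2: one #elseif block, up to the next #elseif or the end of the input.
    static final Pattern SINGLE =
            Pattern.compile("#elseif\\b(?:(?!#elseif\\b).)*", Pattern.DOTALL);

    // Hypothetical helper: returns every individual #elseif block in the template.
    static List<String> elseIfBlocks(String template) {
        List<String> blocks = new ArrayList<>();
        Matcher chain = CHAIN.matcher(template);
        while (chain.find()) {
            // Apply the second regex only to the text matched by the first.
            Matcher single = SINGLE.matcher(chain.group());
            while (single.find()) {
                blocks.add(single.group());
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        String sample = "#if(a)A#elseif(b)B#elseif(c)C#else D#endif"
                + "#if(x)X#elseif(d)E#endif";
        System.out.println(elseIfBlocks(sample));
        // [#elseif(b)B, #elseif(c)C, #elseif(d)E]
    }
}
```

Note that #else\b does not stop at #elseif, because there is no word boundary between "else" and "if".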
The big problem here is that you need #elseif both as a start and as a stop marker in your regular expression. The first match is the substring
#elseif ( $variable2 )Some sample text after 1st ElseIf.#elseif
and the search for the next match resumes after that sequence. So it misses the second #elseif of the first #if expression, because that #elseif token was already consumed as the end of the previous match.
I'd try to split the string on the pattern "\\#elseif\\s*\\((.*?)\\)":
String[] temp = template.split("\\#elseif\\s*\\((.*?)\\)");
Now every temp entry from temp[1] onwards begins with the body of an #elseif block (temp[0] is the text before the first #elseif). Another split on (?:#else|#endif) should give you strings containing nothing but the plain texts:
for (int i = 1; i < temp.length; i++) // skip temp[0], the text before the first #elseif
    System.out.println(temp[i].split("(?:#else|#endif)")[0]);
(I wasn't able to test the second split; if it doesn't work, treat it as advice on the strategy only ;))
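For anyone who wants to try the split strategy, here is a small self-contained sketch. The class name SplitElseIf, the method elseIfBodies and the sample string are invented for the example. Two details are assumptions on my part rather than part of the suggestion above: the loop starts at index 1 to skip the text before the first #elseif, and the second split uses a limit of 2 so that index 0 exists even when an entry consists only of #endif.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitElseIf {
    // Hypothetical helper: returns the plain text of every #elseif block.
    static List<String> elseIfBodies(String template) {
        // Each "#elseif ( ... )" header becomes a separator.
        String[] parts = template.split("#elseif\\s*\\(.*?\\)");
        List<String> bodies = new ArrayList<>();
        // parts[0] is the text before the first #elseif, so start at 1.
        for (int i = 1; i < parts.length; i++) {
            // Limit 2 keeps trailing empty strings, so [0] is always present.
            bodies.add(parts[i].split("#else|#endif", 2)[0]);
        }
        return bodies;
    }

    public static void main(String[] args) {
        String sample = "#if(a)A#elseif(b)B#elseif(c)C#else D#endif tail";
        System.out.println(elseIfBodies(sample)); // [B, C]
    }
}
```

Unlike the regex answers, this returns only the block texts, without the #elseif(...) condition.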
private static final Pattern REGEX = Pattern.compile(
        "#elseif\\s*\\(([^()]*)\\)(.*?)(?=#elseif|#else|#endif)");
public static void main(String[] args) {
    Matcher matcher = REGEX.matcher(template);
    while (matcher.find()) {
        // group(1) is the condition, group(2) the text of the block
        System.out.println(matcher.group(2));
    }
}