Use regular expressions to match a ? but not a \?
I have a PHP regular expression that had been working fairly well for parsing some odd legacy client templates, until we recently found an escaped question mark (\?) in a template expression. My regular expression-fu isn't strong enough to wrap my feeble noodle around a negative lookahead or similar techno-mumbo-jumbo, so tips or pointers in the right direction would be greatly appreciated.
My PHP:
preg_match_all("/\{\{IF (.*)\?(.*):(.*)\}\}/U", $template, $m, PREG_SET_ORDER);
Okay, I was a little overwhelmed when I posted this question. Allow me to put it into proper context.
Template code looks like this:
{{IF VAR?"SHOW: THIS?":"SHOW {{ELSE}}"}}
Which should be parsed as:
if ($template[$var]) {
    echo "SHOW: THIS?";
} else {
    echo "SHOW ".$template['ELSE'];
}
I am currently almost achieving this with my function, but not entirely. This is the function:
preg_match_all("/\{\{IF ((?:[^\\?]|\\.)*)\?((?:[^\\:]|\\.)*):(.*)\}\}[^<\/]/", $template, $m, PREG_SET_ORDER);
if (count($m)) {
    foreach ($m as $o) {
        if (preg_match("/(.*)\s+(==|!=)\s+(.*)/", $o[1], $x)) {
            if (preg_match("/^\"(.*)\"/", $x[1], $cx)) $e1 = $cx[1];
            else $e1 = is_numeric($x[1])?$x[1]:$data[$x[1]];
            if (preg_match("/^\"(.*)\"/", $x[3], $cx)) $e2 = $cx[1];
            else $e2 = is_numeric($x[3])?$x[3]:$data[$x[3]];
            if (preg_match("/^\"(.*)\"/", $o[2], $ox)) $er[0] = $ox[1];
            else $er[0] = addslashes(htmlspecialchars($data[$o[2]]));
            if (preg_match("/^\"(.*)\"/", $o[3], $ox)) $er[1] = $ox[1];
            else $er[1] = addslashes(htmlspecialchars($data[$o[3]]));
            $eval = "\$od = (\"$e1\" $x[2] \"$e2\")?\"$er[0]\":\"$er[1]\";";
            eval($eval);
        } else {
            $od = $data[$o[1]]?$o[2]:$o[3];
            if (preg_match("/^\"(.*)\"/", $od, $x)) $od = $x[1];
            else $od = $data[$od];
        }
        $template = str_replace($o[0], $od, $template);
    }
}
if (is_array($data))
    foreach ($data as $k => $v) $template = str_replace('{{'.$k.'}}', $v, $template);
return $template;
You need to change your (.*) groups: it's no longer true that you want to match a sequence of anything. Instead, you want to match a sequence of non-escaped characters or escape sequences: ((?:[^\\]|\\.)*)
That will match any string containing backslash escapes. I think you could also improve performance by excluding the delimiter itself (the question mark in the first group, the colon in the second) from the non-escaped characters; if you did this, you'd end up with the regex /\{\{IF ((?:[^\\?]|\\.)*)\?((?:[^\\:]|\\.)*):(.*)\}\}/. While that looks nasty, I've just replaced each of your (.*) groups with the construction from above; it's pretty straightforward.
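To see how that pattern treats an escaped question mark, here is a minimal sketch. The template string is a made-up example, not one of the real client templates, and the pattern is written as a nowdoc so that PHP passes the backslashes to PCRE untouched (inside a double-quoted string, "\\" collapses to a single backslash before the regex engine ever sees it).

```php
<?php
// Nowdoc: PHP applies no escape processing, so PCRE receives the pattern exactly as typed.
$pattern = <<<'RE'
/\{\{IF ((?:[^\\?]|\\.)*)\?((?:[^\\:]|\\.)*):(.*)\}\}/
RE;

// Hypothetical template: the condition contains an escaped question mark.
$template = '{{IF VAR\?X?"yes":"no"}}';

// $m[1] is the condition, $m[2] the "true" branch, $m[3] the "false" branch.
$matched = preg_match($pattern, $template, $m);

print_r(array_slice($m, 1));
```

The first group consumes \? as a single escape sequence, so the condition only ends at the next question mark that has no backslash in front of it.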
Why not
(.*)[^\\]\?(.*)
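One thing to watch with this form is that [^\\] has to consume a real character, so the character just before the question mark ends up outside the first group. A rough sketch of the difference follows; the negative lookbehind (?<!\\) is my own substitution for comparison, not something proposed in this thread, and the input strings are made up.

```php
<?php
// Nowdoc strings pass the patterns to PCRE exactly as typed.
$consuming = <<<'RE'
/(.*)[^\\]\?(.*)/
RE;
$lookbehind = <<<'RE'
/(.*)(?<!\\)\?(.*)/
RE;

// [^\\] must consume one real character, so the "R" is lost from group 1.
preg_match($consuming, 'VAR?x', $a);

// (?<!\\) only asserts that the previous character is not a backslash.
preg_match($lookbehind, 'VAR?x', $b);

// With an escaped question mark present, the split happens at the unescaped one.
preg_match($lookbehind, 'a\?b?c', $c);

print_r([$a[1], $b[1], $c[1]]);
```

Neither variant copes with an escaped backslash directly before the question mark (\\?), which is the case the escape-sequence alternation handles.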
Here's what worked. Thanks to @absz for a point in the right direction.
preg_match_all("/\{\{IF ([^\"\\]]*(\\.[^\"\\]]*)*)\?((?:[^\\:]|\\.)*):(.*)}\}/", $template, $m, PREG_SET_ORDER);