Use regular expressions to match a ? but not a \?

I have a PHP regular expression that has worked fairly well for parsing some odd legacy client templates, until we recently found an escaped question mark (\?) in a template expression. I'm not strong enough with my regular expression-fu to wrap my feeble noodle around a negative lookahead or some techno-mumbo-jumbo, so tips or pointers in the right direction would be greatly appreciated.

My PHP:

preg_match_all("/\{\{IF (.*)\?(.*):(.*)\}\}/U", $template, $m, PREG_SET_ORDER);

Okay, I was a little overwhelmed when I posted this question. Allow me to put it into proper context.

Template code looks like this:

{{IF VAR?"SHOW: THIS?":"SHOW {{ELSE}}"}}

Which should be parsed as:

if ($template[$var]) {
 echo "SHOW: THIS?";
} else {
 echo "SHOW ".$template['ELSE'];
}

I am currently almost achieving this with my function, but not entirely. This is the function:

preg_match_all("/\{\{IF ((?:[^\\?]|\\.)*)\?((?:[^\\:]|\\.)*):(.*)\}\}[^<\/]/", $template, $m, PREG_SET_ORDER);
if (count($m)) {
 foreach ($m as $o) {
  if (preg_match("/(.*)\s+(==|!=)\s+(.*)/", $o[1], $x)) {
   if (preg_match("/^\"(.*)\"/", $x[1], $cx)) $e1 = $cx[1];
   else $e1 = is_numeric($x[1])?$x[1]:$data[$x[1]];
   if (preg_match("/^\"(.*)\"/", $x[3], $cx)) $e2 = $cx[1];
   else $e2 = is_numeric($x[3])?$x[3]:$data[$x[3]];
   if (preg_match("/^\"(.*)\"/", $o[2], $ox)) $er[0] = $ox[1];
   else $er[0] =  addslashes(htmlspecialchars($data[$o[2]]));
   if (preg_match("/^\"(.*)\"/", $o[3], $ox)) $er[1] = $ox[1];
   else $er[1] = addslashes(htmlspecialchars($data[$o[3]]));
   $eval = "\$od = (\"$e1\" $x[2] \"$e2\")?\"$er[0]\":\"$er[1]\";";
   eval($eval);
  } else {
   $od = $data[$o[1]]?$o[2]:$o[3];
   if (preg_match("/^\"(.*)\"/", $od, $x)) $od = $x[1];
   else $od = $data[$od];
  }
  $template = str_replace($o[0], $od, $template);
 }
}

if (is_array($data))
 foreach ($data as $k => $v) $template = str_replace('{{'.$k.'}}', $v, $template);
return $template;


You need to change your (.*) regions: it's no longer true that you want to match a sequence of anything. Instead, you want to match a sequence of non-escaped characters or escape sequences: ((?:[^\\]|\\.)*). That will match any string containing backslashed escapes. It also helps to exclude question marks from the first group and colons from the second, so that each group stops at the first unescaped delimiter; doing that gives the regex /\{\{IF ((?:[^\\?]|\\.)*)\?((?:[^\\:]|\\.)*):(.*)\}\}/. While that looks nasty, I've just replaced your (.*)s with the construction from above; it's pretty straightforward.
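
A minimal sketch of the difference, assuming a hypothetical template string in which the first ? is escaped. The second pattern is the one given above; it is written here in a single-quoted PHP string with each regex backslash doubled, so that PCRE actually receives \\ wherever the regex needs a literal backslash.

```php
<?php
// Hypothetical template: the first ? is escaped, the second is the separator.
$template = '{{IF VAR\?X?"yes":"no"}}';

// Original lazy pattern: the first group stops at the escaped question mark.
preg_match('/\{\{IF (.*)\?(.*):(.*)\}\}/U', $template, $old);

// Escape-aware pattern from the answer. '\\\\' in the PHP literal is the
// two-character regex token \\ (one literal backslash).
$pattern = '/\{\{IF ((?:[^\\\\?]|\\\\.)*)\?((?:[^\\\\:]|\\\\.)*):(.*)\}\}/';
preg_match($pattern, $template, $new);

echo $old[1], "\n"; // VAR\
echo $new[1], "\n"; // VAR\?X
echo $new[2], "\n"; // "yes"
echo $new[3], "\n"; // "no"
```

The captured condition still contains the backslash (VAR\?X), so the calling code would need to unescape it before using it as a key.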


Why not

(.*)[^\\]\?(.*)
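
One caveat with this form, sketched below on a hypothetical expression: [^\\] has to consume a real character, so the character just before the ? drops out of both captures. A negative lookbehind, (?<!\\), is a zero-width alternative. It is not taken from the answers here, and it still misreads an escaped backslash followed by a question mark (\\?), so treat it as a sketch rather than a full fix.

```php
<?php
// Hypothetical expression: the first ? is escaped, the second is the separator.
$expr = 'VAR\?X?"yes":"no"';

// [^\\] eats the X in front of the separator, so it is lost from group 1.
preg_match('/(.*)[^\\\\]\?(.*)/', $expr, $m);
echo $m[1], "\n"; // VAR\?

// The lookbehind asserts "no backslash before ?" without consuming anything.
$parts = preg_split('/(?<!\\\\)\?/', $expr, 2);
echo $parts[0], "\n"; // VAR\?X
echo $parts[1], "\n"; // "yes":"no"
```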


Here's what worked. Thanks to @absz for a point in the right direction.

preg_match_all("/\{\{IF ([^\"\\]]*(\\.[^\"\\]]*)*)\?((?:[^\\:]|\\.)*):(.*)}\}/", $template, $m, PREG_SET_ORDER);
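
One thing worth checking with any of the patterns in this thread when they are written in double-quoted PHP strings: PHP collapses \\ to a single backslash before PCRE sees the pattern, so "[^\\?]" reaches the regex engine as [^\?] (which excludes only ?), and "\\." as \. (a literal dot). A small sketch of the effect, using a hypothetical input string:

```php
<?php
// Hypothetical input containing an escaped question mark.
$input = 'a\?b';

// Two backslashes in the literal: PCRE sees ^(?:[^\?]|\.)*$ and the bare ? blocks the match.
$two = preg_match("/^(?:[^\\?]|\\.)*$/", $input);

// Four backslashes in the literal: PCRE sees ^(?:[^\\?]|\\.)*$ and \? is consumed as an escape pair.
$four = preg_match("/^(?:[^\\\\?]|\\\\.)*$/", $input);

echo strlen("[^\\?]"), "\n"; // 5 characters: [ ^ \ ? ]
echo $two, "\n";             // 0
echo $four, "\n";            // 1
```

If a pattern behaves differently from what the regex on paper suggests, printing the string passed to preg_match_all is a quick way to see what PCRE actually receives.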