Using regex variables in preg_match_all

I'm building a pseudo-variable parser in PHP to allow clean and simple syntax in views, and have implemented an engine for if-statements. Now I want to support nested if-statements, and the simplest, most elegant solution I could think of is to use indentation as the marker for a block.

So this is basically the layout I'm looking for in the view:

{if x is empty}
    {if y is array}
        Hello World
    {endif}
{endif}

The script would find the first if-statement and match it with the endif at the same depth. If it evaluates to true, the inner block will be parsed as well.

Now, I'm having trouble setting up the regular expression to use depth in the following code:

preg_match_all('|([\t ]?){if (.+?) is (.+?)}(.+?){endif}|s', $template, $match);

Basically I want the first match in ([\t ]?) to be placed before {endif} using some kind of variable, and make sure that the statement won't be complete if there is no matching {endif} on the same depth as the {if ...}.

Can you help me complete it?


You cannot in general use regular expressions for this problem, because the language you've defined is not regular (since it requires counting occurrences of {if} and {endif}).

What you've got is a variant of the classic matching parentheses problem.

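You'd be better off using a small parser that tracks depth explicitly, for example a state machine with a stack (or counter) of open {if} blocks that is popped on each {endif}. A finite-state machine on its own cannot count arbitrarily deep nesting.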
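
A minimal sketch of that bookkeeping, assuming the tag syntax from the question. The function name pairIfBlocks and the shape of the returned array are hypothetical, chosen only for illustration; the regular expression is used purely as a tokenizer, and the pairing itself is done by the stack.

```php
<?php
// Hypothetical helper: pairs each {if ...} with its {endif} using a stack.
// Returns one entry per block: offsets of both tags and the nesting depth.
function pairIfBlocks(string $template): array
{
    // Tokenize: find every {if ...} and {endif} tag with its byte offset.
    preg_match_all('/\{(?:if\b[^}]*|endif)\}/', $template, $tags, PREG_OFFSET_CAPTURE);

    $stack = [];   // offsets of {if} tags that are still open
    $pairs = [];

    foreach ($tags[0] as [$tag, $offset]) {
        if ($tag !== '{endif}') {
            $stack[] = $offset;              // opening tag: remember it
            continue;
        }
        if ($stack === []) {
            throw new RuntimeException('Unmatched {endif} at offset ' . $offset);
        }
        $open = array_pop($stack);           // closes the most recent open {if}
        $pairs[] = ['if' => $open, 'endif' => $offset, 'depth' => count($stack)];
    }

    if ($stack !== []) {
        throw new RuntimeException('Unclosed {if} at offset ' . end($stack));
    }
    return $pairs;
}
```

Because inner blocks close first, the pairs come back innermost first. The offsets can then be used with substr() to cut out each block's body.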


\1 will contain your first match, so you could do this:

preg_match_all('|([\t ]?){if (.+?) is (.+?)}(.+?)\1{endif}|s', $template, $match);

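However, this won't necessarily match the correct occurrence of {endif}, only the next occurrence that is preceded by the captured whitespace. Since the pattern is not anchored to the start of a line, \1 only has to match the end of the closing tag's indentation; an unindented {if} captures an empty string and therefore pairs with the very first {endif}. Note also that [\t ]? captures at most one whitespace character; [\t ]* would capture the whole indentation.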
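
One way to tighten this, sketched under the assumption that every {if} and its {endif} sit on their own lines with identical indentation: anchor both tags with ^ in multiline mode so that \1 has to be the complete indentation. The function renderIfBlocks and the $evaluate callback are hypothetical names, not part of the question's code.

```php
<?php
// Hypothetical renderer. $evaluate is an assumed callback that decides
// whether "{if <subject> is <check>}" holds.
function renderIfBlocks(string $template, callable $evaluate): string
{
    // ^ with the m flag pins both tags to the start of a line, so \1 must be
    // the entire indentation of the closing tag, not just the end of it.
    $pattern = '|^([\t ]*)\{if ([^}]+?) is ([^}]+?)\}\n(.*?)^\1\{endif\}\n?|ms';

    return preg_replace_callback($pattern, function (array $m) use ($evaluate) {
        $subject = $m[2];
        $check   = $m[3];
        $body    = $m[4];
        // Recurse into the body only when the condition holds.
        return $evaluate($subject, $check) ? renderIfBlocks($body, $evaluate) : '';
    }, $template);
}
```

This still relies on the template being indented consistently. A misaligned {endif} is simply not matched, so the block is left in the output untouched.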


If you're using an explicit "endif" statement, your blocks are already closed and there's no need for any special indentation; just match '/{if.+?}(.+?){endif}/'. [1] If, on the other hand, you want Python-like indented blocks, you can get rid of {endif} (and the braces) and match only on indent levels:

  if condition
      this line is in the if block
      this one too
  this one not

Your expression would look something like this (allowing tabs or spaces, and an "if" with no indentation at all):

 /^([\t ]*)if(.+)\n((\1[\t ]+.+\n)+)/m

"condition" will be in group 2 and the statement body in group 3.

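A short sketch of applying such an indentation-based pattern with preg_match. The pattern used here is a variant that accepts spaces as well as tabs and anchors the "if" to the start of a line; the sample view and variable names are made up for illustration.

```php
<?php
// Sample view using Python-like indentation instead of {endif}.
$view = "if x is empty\n"
      . "    this line is in the if block\n"
      . "    this one too\n"
      . "this one not\n";

// Group 1: indentation of the "if" line
// Group 2: the condition
// Group 3: every following line indented deeper than group 1
$pattern = '/^([\t ]*)if(.+)\n((\1[\t ]+.+\n)+)/m';

$condition = null;
$body = null;
if (preg_match($pattern, $view, $m)) {
    $condition = trim($m[2]);
    $body      = $m[3];
}
```

The body ends at the first line that is not indented deeper than the "if" line, so "this one not" is left out of group 3.
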
[1] Actually, this should be something like /{if}((.(?!{if))+?){endif}/s to handle nested ifs properly.
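
A pattern of that kind only matches blocks whose body contains no further {if, i.e. the innermost ones, so it has to be applied repeatedly until nothing is left to replace. A rough sketch, with a hypothetical function name and an assumed $evaluate callback that receives the raw condition text:

```php
<?php
// Hypothetical helper: resolves {if ...}...{endif} blocks from the inside out.
function resolveInnermostFirst(string $template, callable $evaluate): string
{
    // The negative lookahead stops the body from running into another "{if ",
    // so only blocks without nested ifs can match on a given pass.
    $pattern = '/\{if ([^}]+)\}((?:(?!\{if ).)*?)\{endif\}/s';

    do {
        $template = preg_replace_callback($pattern, function (array $m) use ($evaluate) {
            // Keep the body when the condition holds, drop the block otherwise.
            return $evaluate($m[1]) ? $m[2] : '';
        }, $template, -1, $count);
    } while ($count > 0);   // each pass peels off one nesting level

    return $template;
}
```

One trade-off of this order: inner conditions are evaluated even when an enclosing block later turns out to be false.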
