PHP regex with safe delimiters

I thought that PHP's Perl-compatible regular expressions (the preg library) support curly brackets as delimiters. This should be fine:

{ello {world}i // should match on Hello {World

The main point of curly brackets is that only the leftmost and rightmost ones are taken as delimiters, so the inner ones need no escaping. As far as I can tell, though, PHP requires the escaping:

{ello \{world}i // this actually matches on Hello {World

Is this the expected behavior, or a bug in PHP's preg implementation?


When in Perl you use for the pattern delimiter any of the four paired ASCII bracket types, you only need to escape unpaired brackets within the pattern. This is indeed the entire purpose of using brackets. This is documented in the perlop manpage under “Quote and Quote-like Operators”, which reads in part:

   Non-bracketing delimiters use the same character fore and aft, 
   but the four sorts of brackets (round, angle, square, curly) 
   will all nest, which means that

      q{foo{bar}baz}

   is the same as

      'foo{bar}baz'

   Note, however, that this does not always work for quoting Perl code:

      $s = q{ if($a eq "}") ... }; # WRONG

That’s why you often see people use m{…} or qr{…} in Perl code, especially for multiline patterns used with /x ᴀᴋᴀ (?x). For example:

return qr{                  
    (?=                     # pure lookahead for conjunctive matching
        \A                  # always from start
        . *?                # going only as far as we need to to find the pattern
        (?:
            ${case_flag}
            ${left_boundary}
            ${positive_pattern}
            ${right_boundary}
        )
    )
}sxm;

Notice how those nested braces are no problem.


Expected behavior as far as I know; otherwise, how would the compiler handle quantifiers (repetition limits)? E.g.

[a-z]{1,5}
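
For what it's worth, a quick check suggests that a quantifier like this does not need escaping inside `{}` delimiters, presumably because its braces are balanced. This is only a minimal sketch, assuming a stock PHP build with the PCRE extension; the anchors, the subject string, and the variable names are made up for illustration:

```php
<?php
// The quantifier's braces pair up, so the last } is still
// recognized as the closing delimiter.
$balanced = preg_match('{^[a-z]{1,5}$}', 'hello', $m);

var_dump($balanced); // int(1)
var_dump($m[0]);     // string(5) "hello"
```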


From http://lv.php.net/manual/en/regexp.reference.delimiters.php:

If the delimiter needs to be matched inside the pattern it must be escaped using a backslash. If the delimiter appears often inside the pattern, it is a good idea to choose another delimiter in order to increase readability.

So this is expected behavior, not a bug.
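
The two patterns from the question can be compared directly. This is a rough sketch rather than an authoritative test: the variable names are hypothetical, and the warning from the failing call is suppressed with `@` so that only the return values are shown.

```php
<?php
$subject = 'Hello {World';

// Unbalanced inner brace: no closing delimiter is found, so
// preg_match() raises a warning (suppressed here) and returns false.
$unescaped = @preg_match('{ello {world}i', $subject);

// Escaped inner brace: treated as a literal character of the pattern.
$escaped = preg_match('{ello \{world}i', $subject, $m);

var_dump($unescaped); // bool(false)
var_dump($escaped);   // int(1)
var_dump($m[0]);      // string(11) "ello {World"
```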


I found that no escaping is required in these cases:

'ello {world'i
(ello {world)i

So my theory is that the problem arises only when the delimiter bracket also appears inside the pattern. The following two produce the same error:

{ello {world}i
(ello (world)i

Using an opening/closing bracket pair as delimiters may require escaping brackets of that kind inside the expression.
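
The four patterns above can be run side by side to check this. A small sketch, assuming the same subject string as in the question; `$patterns` and `$results` are illustrative names, and warnings are suppressed so the loop only records what `preg_match()` returns:

```php
<?php
$subject = 'Hello {World';

$patterns = [
    "'ello {world'i", // quote delimiters, brace is plain pattern text
    '(ello {world)i', // parenthesis delimiters, brace is plain pattern text
    '{ello {world}i', // brace delimiters with an unbalanced inner brace
    '(ello (world)i', // parenthesis delimiters with an unbalanced inner parenthesis
];

$results = [];
foreach ($patterns as $pattern) {
    // 1 on a match, 0 on no match, false if the pattern is rejected.
    $results[$pattern] = @preg_match($pattern, $subject);
}

var_dump($results);
```

In this run the first two entries should come back as 1 and the last two as false, which fits the idea that only a bracket of the same kind as the delimiter gets in the way.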
