
Some unclear PHP syntax

I am a PHP beginner running PHP 5.2.X, and I came across this expression on a forum:

$regex = <<<'END'
/
  ( [\x00-\x7F]                 # single-byte sequences   0xxxxxxx
  | [\xC0-\xDF][\x80-\xBF]      # double-byte sequences   110xxxxx 10xxxxxx
  | [\xE0-\xEF][\x80-\xBF]{2}   # triple-byte sequences   1110xxxx 10xxxxxx * 2
  | [\xF0-\xF7][\x80-\xBF]{3}   # quadruple-byte sequence 11110xxx 10xxxxxx * 3 
  )
| ( [\x80-\xBF] )               # invalid byte in range 10000000 - 10111111
| ( [\xC0-\xFF] )               # invalid byte in range 11000000 - 11111111
/x
END;

Is this code correct? What do these strange (for me) constructions like <<<, 'END', /, /x, and END; mean?

My PHP version does not support nowdoc, so how should I rewrite this expression? When I remove the quotes around 'END', $regex becomes NULL.

With the quotes in place I receive:

Parse error: syntax error, unexpected T_SL in /home/vhosts/mysite.com/public_html/mypage.php on line X

Thanks


Parse error: syntax error, unexpected T_SL in /home/vhosts/mysite.com/public_html/mypage.php on line X

This error comes from the single quotes around END. That form is called nowdoc, and it was added in PHP 5.3. Since you're using PHP 5.2 and the regex contains \x escapes, you'll need either a single-quoted string or a heredoc in which every backslash is escaped.

An example of the regex as a quoted string, used in this answer:

$regex = '/
  ( [\x00-\x7F]                 # single-byte sequences   0xxxxxxx
  | [\xC0-\xDF][\x80-\xBF]      # double-byte sequences   110xxxxx 10xxxxxx
  | [\xE0-\xEF][\x80-\xBF]{2}   # triple-byte sequences   1110xxxx 10xxxxxx * 2
  | [\xF0-\xF7][\x80-\xBF]{3}   # quadruple-byte sequence 11110xxx 10xxxxxx * 3
  )
| ( [\x80-\xBF] )               # invalid byte in range 10000000 - 10111111
| ( [\xC0-\xFF] )               # invalid byte in range 11000000 - 11111111
/x
';

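The "/" and "/x" portions are not part of the pattern itself. The two "/" characters are delimiters that mark where the pattern begins and ends, and the x after the closing delimiter is a modifier (PCRE_EXTENDED) that lets the pattern contain whitespace and # comments. All modifiers are described at: http://us.php.net/manual/en/reference.pcre.pattern.modifiers.php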
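To see the delimiters and the x modifier at work, here is a minimal sketch that runs the same pattern through preg_match_all. The helper name count_stray_bytes is made up for illustration, and the code sticks to syntax that should also parse on PHP 5.2 (no closures, no short arrays).

```php
<?php
// Same pattern as above, in a single-quoted string so the \x escapes
// reach PCRE untouched. Thanks to /x, the line breaks and spaces are ignored.
$regex = '/
  ( [\x00-\x7F]
  | [\xC0-\xDF][\x80-\xBF]
  | [\xE0-\xEF][\x80-\xBF]{2}
  | [\xF0-\xF7][\x80-\xBF]{3}
  )
| ( [\x80-\xBF] )
| ( [\xC0-\xFF] )
/x';

// Hypothetical helper: counts bytes that are not part of a
// well-formed one- to four-byte sequence.
function count_stray_bytes($regex, $input)
{
    $stray = 0;
    preg_match_all($regex, $input, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        // Group 1 holds a well-formed sequence; if it is empty,
        // the match came from group 2 or 3, i.e. a stray byte.
        if ($m[1] === '') {
            $stray++;
        }
    }
    return $stray;
}

echo count_stray_bytes($regex, "caf\xC3\xA9"), "\n"; // well-formed two-byte sequence
echo count_stray_bytes($regex, "a\x80b"), "\n";      // lone continuation byte
```

Note that the pattern only checks the lead-byte and continuation-byte structure; it does not reject overlong encodings such as a sequence starting with \xC0.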


<<< and END are heredoc syntax, a way of assigning a large block of text to a variable.

$mytext = <<<TXT

this is my text and it
can be many lines
etc
etc

TXT;

The identifier (TXT here, END in your example) can be any name you like and does not have to be three characters long. It must follow PHP's label rules: letters, digits, and underscores, and it may not start with a digit.
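One difference matters for the regex in the question: a plain heredoc processes escape sequences the way a double-quoted string does, while a single-quoted string (and nowdoc on PHP 5.3+) leaves them alone. The short sketch below uses a made-up character class to show the effect; the variable names are only for illustration.

```php
<?php
// Heredoc: \x41 and \x5A are converted to the bytes "A" and "Z"
// before the string is ever handed to anything else.
$heredoc = <<<END
[\x41-\x5A]
END;

// Single-quoted string: the backslashes stay in the string as written.
$single = '[\x41-\x5A]';

var_dump($heredoc === '[A-Z]');  // bool(true)
var_dump(strlen($single));       // int(11)
```

With the pattern from the question, a plain heredoc would likewise turn \x00 into a real NUL byte before PCRE sees it, which is a plausible reason for the NULL result after removing the quotes.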

Read more at the manual


It's heredoc syntax.

The <<<'END' marks the start of a string, and everything up to the line that begins with END is part of that string, newlines included.

The / and /x are actually part of the regex.


In addition to what other users have said about it being heredoc syntax (typically used for large strings that would otherwise require a lot of escaping), the code defines a regular expression that uses "/" as its delimiter.

The "/x" at the end closes the regular expression and then tells the regex engine to run it in "free-spacing mode", where whitespace and comments inside the pattern are ignored. Other possible options would have been /i for case-insensitive matching or /m for multi-line mode.
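A small sketch of the three modifiers mentioned above, using throwaway patterns and variable names chosen only for this example:

```php
<?php
// /x: spaces, line breaks and # comments inside the pattern are ignored,
// so this is equivalent to /^(\d{4})$/.
$year = '/ ^ (\d{4})   # four-digit year
           $ /x';
$x = preg_match($year, '2010');

// /i: letters match regardless of case.
$i = preg_match('/^end$/i', 'END');

// /m: ^ and $ also match at the start and end of each line in the subject.
$m = preg_match_all('/^etc$/m', "etc\netc", $found);

var_dump($x, $i, $m); // int(1) int(1) int(2)
```

Modifiers can be combined after the closing delimiter, for example /xi.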

You can read more about PHP's regex engine here:

Using Regular Expressions in PHP
