Some unclear PHP syntax
I am a PHP beginner and my PHP version is 5.2.X. I saw this PHP expression on a forum:
$regex = <<<'END'
/
( [\x00-\x7F] # single-byte sequences 0xxxxxxx
| [\xC0-\xDF][\x80-\xBF] # double-byte sequences 110xxxxx 10xxxxxx
| [\xE0-\xEF][\x80-\xBF]{2} # triple-byte sequences 1110xxxx 10xxxxxx * 2
| [\xF0-\xF7][\x80-\xBF]{3} # quadruple-byte sequence 11110xxx 10xxxxxx * 3
)
| ( [\x80-\xBF] ) # invalid byte in range 10000000 - 10111111
| ( [\xC0-\xFF] ) # invalid byte in range 11000000 - 11111111
/x
END;
Is this code correct? What do these constructions, which look strange to me, such as <<<, 'END', /, /x, and END; mean?
My PHP version does not support nowdoc, so how should I replace this expression? Without the quotes around 'END', $regex became NULL.
I receive:
Parse error: syntax error, unexpected T_SL in /home/vhosts/mysite.com/public_html/mypage.php on line X
Thanks
Parse error: syntax error, unexpected T_SL in /home/vhosts/mysite.com/public_html/mypage.php on line X
This comes from the single quotes around END. That form is called nowdoc, and it was added in PHP 5.3. Since you're using PHP 5.2 and this regex uses '\x' escapes, you'll need a single-quoted string; in a heredoc or double-quoted string you'd have to escape the '\'s.
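To see why the quoting style matters for '\x' escapes, here is a minimal sketch (the variable names are just for illustration). It compares how PHP itself treats the same escape in single and double quotes:

```php
<?php
// Single quotes keep the backslash, so the regex engine receives the escape itself.
$single = '\x41';
// Double quotes (and heredoc) let PHP convert the escape to the byte "A" first.
$double = "\x41";

var_dump(strlen($single)); // int(4): the characters \ x 4 1
var_dump(strlen($double)); // int(1): the single character A
```

With a single-quoted pattern, PCRE gets to interpret '\x..' on its own, which is what the regex in the question relies on.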
An example of the regex as a quoted string, used in this answer:
$regex = '/
( [\x00-\x7F] # single-byte sequences 0xxxxxxx
| [\xC0-\xDF][\x80-\xBF] # double-byte sequences 110xxxxx 10xxxxxx
| [\xE0-\xEF][\x80-\xBF]{2} # triple-byte sequences 1110xxxx 10xxxxxx * 2
| [\xF0-\xF7][\x80-\xBF]{3} # quadruple-byte sequence 11110xxx 10xxxxxx * 3
)
| ( [\x80-\xBF] ) # invalid byte in range 10000000 - 10111111
| ( [\xC0-\xFF] ) # invalid byte in range 11000000 - 11111111
/x
';
The "/" and "/x" portions belong to the regex itself, not to PHP's string syntax. The "/"s are delimiters that mark the beginning and end of the pattern, and the x after the closing delimiter is a flag (PCRE_EXTENDED) whose meaning is defined in: http://us.php.net/manual/en/reference.pcre.pattern.modifiers.php
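As a rough sketch of how the single-quoted pattern could be used on PHP 5.2, the following runs it over a short test string with preg_match_all. The sample input and the variable names $input, $count and $matches are assumptions made for illustration, not part of the original code:

```php
<?php
// Single-quoted string: PHP leaves \x7F etc. untouched, so PCRE sees the escapes.
$regex = '/
    ( [\x00-\x7F]                 # single-byte sequence
    | [\xC0-\xDF][\x80-\xBF]      # double-byte sequence
    | [\xE0-\xEF][\x80-\xBF]{2}   # triple-byte sequence
    | [\xF0-\xF7][\x80-\xBF]{3}   # quadruple-byte sequence
    )
  | ( [\x80-\xBF] )               # stray continuation byte
  | ( [\xC0-\xFF] )               # stray lead byte
/x';

// "a", then a two-byte sequence, then one stray continuation byte.
$input = "a\xC3\xA9\x80";

$count = preg_match_all($regex, $input, $matches, PREG_SET_ORDER);

var_dump($count);                      // int(3)
var_dump($matches[2][2] === "\x80");   // bool(true): group 2 caught the stray byte
```

Because of the x flag, the spaces, line breaks and # comments inside the pattern are ignored by the regex engine.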
<<< and END are heredoc syntax, a way of assigning a large block of text to a variable.
$mytext = <<<TXT
this is my text and it
can be many lines
etc
etc
TXT;
The identifier (TXT here, END in your example) can be whatever you like, as long as it contains only letters, digits and underscores and does not start with a digit.
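A small sketch of the difference between heredoc and the quoted nowdoc form from the question may help. Note that the nowdoc part needs PHP 5.3 or later, so it will not parse on 5.2; the variable names are only for illustration:

```php
<?php
$name = 'world';

// Heredoc: behaves like a double-quoted string, so $name is interpolated.
$heredoc = <<<TXT
Hello $name
TXT;

// Nowdoc (PHP 5.3+): behaves like a single-quoted string, nothing is interpolated.
$nowdoc = <<<'TXT'
Hello $name
TXT;

echo $heredoc, "\n"; // Hello world
echo $nowdoc, "\n";  // Hello $name
```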
Read more at the manual
It's heredoc syntax (with the quotes around END it is the nowdoc variant). The <<<'END' says that a string starts here and that everything up to the next line consisting of END will be part of the string (even newlines). The / and /x are actually part of the regex.
In addition to what other users have said about it being heredoc syntax (typically used for large strings that would otherwise require a lot of escaping), the code is defining a regular expression using "/" as the delimiter.
The "/x" at the end closes the regular expression and then tells the regex engine to run it in "free-spacing mode". Other possible options would have been /i for case-insensitive or /m for multi-line mode.
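For illustration, here is a minimal sketch of the three modifiers mentioned above. The patterns and subject strings are made up for this example:

```php
<?php
// /i: letters match regardless of case.
$i = preg_match('/php/i', 'I like PHP');

// /m: ^ and $ also match at line breaks inside the subject.
$m = preg_match('/^second$/m', "first\nsecond");

// /x: whitespace and # comments inside the pattern are ignored.
$x = preg_match('/ \d+   # one or more digits
                 /x', 'abc 123');

var_dump($i, $m, $x); // int(1) int(1) int(1)
```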
You can read more about PHP's regex engine here:
Using Regular Expressions in PHP