What is the preg_match_all `u` flag dependent on?
I have some code in a PHP application that is returning null when I try and use it on the production server, but it works fine on the development server. Here is the line of code:
// use the regex unicode support to separate the UTF-8 characters into an array
preg_match_all( '/./us', $str, $match );
What is the u flag dependent on? I tested with mbstring enabled and disabled, and it does not seem to affect it.
The error I'm getting is
preg_match_all: Compilation failed: unknown option bit(s) set at offset -1
more info
This is one of the configure options on the production server:
'--with-pcre-regex=/opt/pcre'
and here are the pcre sections
I believe this is the note @Wesley was referring to:
In order process UTF-8 strings, you must build PCRE to include UTF-8
support in the code, and, in addition, you must call pcre_compile()
with the PCRE_UTF8 option flag, or the pattern must start with the
sequence (*UTF8). When either of these is the case, both the pattern
and any subject strings that are matched against it are treated as
UTF-8 strings instead of strings of 1-byte characters.
Any links or tips on how to "build PCRE to include UTF-8" ?
results of pcretest -C
PCRE version 6.6 06-Feb-2006
Compiled with
UTF-8 support
Unicode properties support
Newline character is LF
Internal link size = 2
POSIX malloc threshold = 10
Default match limit = 10000000
Default recursion depth limit = 10000000
Match recursion uses stack
This flag depends on PCRE being built with unicode support enabled.
PHP bundles this library, and the bundled copy is built with Unicode support enabled: the u modifier has been available since PHP 4.1.0 and always works when PHP is built with the bundled PCRE library.
However, some Linux distributions build PHP against their own build of PCRE, which does not have Unicode support enabled, and as a result the u modifier doesn't work on those builds.
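To find out whether a given PHP build is affected, a quick runtime check along these lines may help. This is only a sketch: the function name pcre_u_modifier_works is made up for illustration, and it assumes that a failed pattern compilation makes preg_match() return false, which is what the error in the question suggests.

```php
<?php
// Hypothetical helper (not part of PHP): reports whether the "u" modifier
// compiles and behaves as expected on this build.
function pcre_u_modifier_works()
{
    // "\xC3\xA9" is the two-byte UTF-8 encoding of "é".
    // With UTF-8 support, /^.$/u treats it as one character and returns 1.
    // Without it, compilation fails and preg_match() returns false;
    // the @ only hides the warning.
    return @preg_match('/^.$/u', "\xC3\xA9") === 1;
}

// PCRE_VERSION is the version of the PCRE library PHP was built against.
echo 'PCRE version: ', PCRE_VERSION, "\n";
echo 'u modifier: ', pcre_u_modifier_works() ? 'works' : 'unavailable', "\n";
```

Running this on both the development and production servers should show whether the two builds differ in PCRE version or in UTF-8 support.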
The solution is to use an alternative PHP package.
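If switching packages is not an option, one possible workaround is to split the string with a byte-level pattern that does not need the u modifier. The sketch below is an assumption-laden stand-in for '/./us', not a drop-in replacement: the helper name utf8_split_without_u is hypothetical, the pattern only checks the general shape of UTF-8 sequences, and bytes that do not fit are silently skipped.

```php
<?php
// Hypothetical helper: splits a UTF-8 string into characters without
// relying on the "u" modifier, by matching byte sequences directly.
function utf8_split_without_u($str)
{
    $pattern = '/[\x00-\x7F]'                 // 1-byte sequence (ASCII)
             . '|[\xC2-\xDF][\x80-\xBF]'      // 2-byte sequence
             . '|[\xE0-\xEF][\x80-\xBF]{2}'   // 3-byte sequence
             . '|[\xF0-\xF4][\x80-\xBF]{3}/'; // 4-byte sequence
    preg_match_all($pattern, $str, $match);
    return $match[0];
}

// "h\xC3\xA9llo" is "héllo" encoded as UTF-8.
print_r(utf8_split_without_u("h\xC3\xA9llo"));
```

Because the pattern has no u modifier, it compiles even on a PCRE build without UTF-8 support. The trade-off is that it does no real validation of the input.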
It depends on PCRE being compiled with --enable-utf8.