Strip all characters except alphanumerics, spaces and punctuation from a string
How can I use PHP to strip out all characters that are NOT letters, numbers, spaces, or punctuation marks?
I've tried the following, but it strips punctuation.
preg_replace("/[^a-zA-Z0-9\s]/", "", $str);
preg_replace("/[^a-zA-Z0-9\s\p{P}]/", "", $str);
Example:
php > echo preg_replace("/[^a-zA-Z0-9\s\p{P}]/", "", "⟺f✆oo☃. ba⟗r!");
foo. bar!
\p{P} matches all Unicode punctuation characters (see Unicode character properties). If you only want to allow specific punctuation, simply add those characters to the negated character class. E.g.:
preg_replace("/[^a-zA-Z0-9\s.?!]/", "", $str);
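To see how the two patterns differ in practice, here is a minimal sketch assuming plain ASCII input. The helper names keepAllPunctuation and keepSentencePunctuation are made up for illustration. Note that characters such as $, + and = are classified by Unicode as symbols rather than punctuation, so \p{P} does not keep them.

```php
<?php
// Hypothetical helper: keep ASCII letters, digits, whitespace and any punctuation (\p{P}).
function keepAllPunctuation(string $str): string
{
    return preg_replace('/[^a-zA-Z0-9\s\p{P}]/', '', $str);
}

// Hypothetical helper: keep only ".", "?" and "!" as punctuation.
function keepSentencePunctuation(string $str): string
{
    return preg_replace('/[^a-zA-Z0-9\s.?!]/', '', $str);
}

$input = 'Cost: $5+3%=ok?';

// ":" "%" and "?" are punctuation and survive; "$" "+" and "=" are symbols and are stripped.
echo keepAllPunctuation($input), "\n";      // Cost: 53%ok?

// Only "?" is on the explicit list here.
echo keepSentencePunctuation($input), "\n"; // Cost 53ok?
```

If the symbol characters matter to you, the explicit list is the more predictable choice.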
You're going to have to list the punctuation explicitly, as there is no single-letter shorthand for it (the way \s is shorthand for whitespace characters).
preg_replace('/[^a-zA-Z0-9\s\-=+\|!@#$%^&*()`~\[\]{};:\'",<.>\/?]/', '', $str);
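Escaping a long list by hand is easy to get wrong. As a rough sketch, you could let preg_quote() do the escaping and build the character class from a plain string. The variable names and the helper stripUnlisted are assumptions for illustration, and the list of allowed characters is just the one from the pattern above.

```php
<?php
// Assumption: these are the punctuation characters to keep; adjust the list as needed.
$allowedPunctuation = '-=+|!@#$%^&*()`~[]{};:\'",<.>/?';

// preg_quote() escapes regex metacharacters and, via the second argument, the "/" delimiter.
$pattern = '/[^a-zA-Z0-9\s' . preg_quote($allowedPunctuation, '/') . ']/';

// Hypothetical helper wrapping the generated pattern.
function stripUnlisted(string $str, string $pattern): string
{
    return preg_replace($pattern, '', $str);
}

// Without the u modifier the two bytes of each "é" are removed individually.
echo stripUnlisted("R\u{E9}sum\u{E9} #1: 50% off!", $pattern), "\n"; // Rsum #1: 50% off!
```

Because this pattern works on bytes, it only ever keeps ASCII characters; accented letters are dropped along with everything else that is not listed.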
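// Trim surrounding whitespace, then any leading/trailing ASCII control characters.
$str = trim($str);
$str = trim($str, "\x00..\x1F");
// Replace quotes, ampersands and angle brackets with spaces.
$str = str_replace(array('"', "'", '&', '<', '>'), ' ', $str);
// Replace everything except ASCII letters, digits and hyphens with a space.
$str = preg_replace('/[^0-9a-zA-Z-]/', ' ', $str);
// Collapse runs of whitespace into a single space.
$str = preg_replace('/\s\s+/', ' ', $str);
$str = trim($str);
// Turn the remaining spaces into hyphens.
$str = preg_replace('/[ ]/', '-', $str);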
Hope this helps.
Let's build a multibyte-safe/unicode-safe pattern for this task.
From https://www.regular-expressions.info/unicode.html:
\p{L} or \p{Letter}: any kind of letter from any language.
\p{Z} or \p{Separator}: any kind of whitespace or invisible separator.
\p{N} or \p{Number}: any kind of numeric character in any script.
\p{P} or \p{Punctuation}: any kind of punctuation character.
[^ ... ] is a negated character class that matches any character not in the list.
+ is a "one or more" quantifier.
u is the UTF-8 pattern modifier. From the PHP manual: "This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern and subject strings are treated as UTF-8. An invalid subject will cause the preg_* function to match nothing; an invalid pattern will trigger an error of level E_WARNING. Five and six octet UTF-8 sequences are regarded as invalid."
Code:
echo preg_replace('/[^\p{L}\p{Z}\p{N}\p{P}]+/u', '', $string);
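A slightly fuller sketch of the same pattern, wrapped in a helper that also handles input that is not valid UTF-8. The function name keepTextCharacters and the choice to throw an exception are assumptions for illustration, and the example assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical wrapper around the pattern above.
function keepTextCharacters(string $string): string
{
    $result = preg_replace('/[^\p{L}\p{Z}\p{N}\p{P}]+/u', '', $string);

    if ($result === null) {
        // With the u modifier, a subject that is not valid UTF-8 makes preg_replace() return null.
        throw new InvalidArgumentException(
            'Input could not be processed (preg_last_error: ' . preg_last_error() . ')'
        );
    }

    return $result;
}

// The symbols are stripped; letters, the space and the punctuation stay.
echo keepTextCharacters("⟺f✆oo☃. ba⟗r!"), "\n";          // foo. bar!

// Accented letters, "½" (a number) and "¿" (punctuation) stay; "€" and "+" are symbols.
echo keepTextCharacters("Café: 3€+½ litre, ¿sí?"), "\n";  // Café: 3½ litre, ¿sí?
```

One thing to watch for: tab and newline are control characters, not members of \p{Z}, so this pattern removes them. If they need to survive, adding \s to the character class should cover it.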