preg_replace: wildcards do not match umlaut-characters
I want to filter a string using the \w character class, but unfortunately it does not cover umlauts.
$i = "Die Höhe";
$x = preg_replace("/[^\w\s]/","",$i);
echo $x; // "Die Hhe";
I could add all the characters to the pattern by hand, but this is not very elegant, since the list will become very long. At the moment I am preparing this only for German, but there are more languages to come.
$i = "Die Höhe";
$x = preg_replace("/[^\w\säöüÄÖÜß]/","",$i);
echo $x; // "Die Höhe";
Is there a way to match all of them at once?
Your strings are obviously UTF-8, so you want the 'u' flag and Unicode properties instead of \w:
$x = preg_replace('/[^\p{L}\p{N} ]/u',"",$i);
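A minimal sketch of the same pattern wrapped in a helper, assuming the input is valid UTF-8 and uses precomposed characters. The function name keep_letters_and_digits and the sample strings are made up for illustration:

```php
<?php
// Hypothetical helper: keeps Unicode letters (\p{L}), digits (\p{N}) and spaces.
// Assumes $text is valid UTF-8; with the u flag, preg_replace returns null
// on malformed input, which is mapped to an empty string here.
// Note: decomposed accents (base letter + combining mark) would lose the mark;
// adding \p{M} to the class would keep it.
function keep_letters_and_digits(string $text): string
{
    $result = preg_replace('/[^\p{L}\p{N} ]/u', '', $text);
    return $result ?? '';
}

echo keep_letters_and_digits("Die Höhe: 3 m!"), "\n";   // Die Höhe 3 m
echo keep_letters_and_digits("Çà et là, señor?"), "\n"; // Çà et là señor
```

Because the class is built from Unicode properties, it should not need to be extended when further languages are added.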
This should remove all characters that are, in my opinion, not meaningful:
$val = "Die Höhe";
$val = preg_replace('/[^\x20-\x7e\xa1-\xff]+/u', '', $val);
echo $val; // "Die Höhe"
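A quick check of what this range keeps may be useful before using it for other languages. With the 'u' flag, \xa1-\xff refers to the code points U+00A1 to U+00FF, so printable ASCII (including punctuation) and Latin-1 letters survive, while letters outside Latin-1 are removed. The sample strings below are made up, and the file is assumed to be saved as UTF-8:

```php
<?php
// Same pattern as above: keep U+0020..U+007E and U+00A1..U+00FF, drop the rest.
$pattern = '/[^\x20-\x7e\xa1-\xff]+/u';

// German umlauts are inside Latin-1, and ASCII punctuation is kept too.
echo preg_replace($pattern, '', "Die Höhe!"), "\n"; // Die Höhe!

// Polish Ł (U+0141) and ź (U+017A) are outside the range and get stripped.
echo preg_replace($pattern, '', "Łódź"), "\n";      // ód
```

So this variant fits German, but it would likely need a wider range once languages beyond Latin-1 are added.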