preg_replace: wildcards do not match umlaut-characters

I want to filter a string using the \w character class, but unfortunately it does not cover umlauts.

$i = "Die Höhe";    
$x = preg_replace("/[^\w\s]/","",$i);
echo $x; // "Die Hhe";

I could add all of those characters to the preg_replace pattern, but that is not very elegant, since the list will become very long. At the moment I am preparing this only for German, but there are more languages to come.

$i = "Die Höhe";    
$x = preg_replace("/[^\w\säöüÄÖÜß]/","",$i);
echo $x; // "Die Höhe";

Is there a way to match all of them at once?


Your strings are obviously UTF-8, so you want the 'u' flag and Unicode properties instead of \w:

$x = preg_replace('/[^\p{L}\p{N} ]/u',"",$i);
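To illustrate why this covers "more languages to come", here is a minimal sketch that wraps the pattern above in a function and runs it on German, Polish and Russian input. The function name keep_letters_digits_spaces is made up for this example, and the sketch assumes the script file itself is saved as UTF-8.

```php
<?php
// Hypothetical helper name; the pattern is the one from the answer above.
function keep_letters_digits_spaces(string $text): string
{
    // \p{L} matches any Unicode letter, \p{N} any Unicode number.
    // The 'u' modifier makes PCRE treat pattern and subject as UTF-8.
    $result = preg_replace('/[^\p{L}\p{N} ]/u', '', $text);

    // preg_replace() returns null on error (for example invalid UTF-8 input).
    return $result ?? '';
}

echo keep_letters_digits_spaces("Die Höhe!"), "\n";    // Die Höhe
echo keep_letters_digits_spaces("Łódź, 2024!"), "\n";  // Łódź 2024
echo keep_letters_digits_spaces("Привет, мир!"), "\n"; // Привет мир
```

Note that this pattern keeps only the plain space character, not tabs or newlines as \s would.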


This should remove everything that is, in my opinion, not a meaningful character (it keeps only printable ASCII and the Latin-1 range U+00A1 to U+00FF):

$val = "Die Höhe";
$val = preg_replace('/[^\x20-\x7e\xa1-\xff]+/u', '', $val);
echo $val; // "Die Höhe"
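As a rough sketch of what the range-based pattern does with input beyond German, the example below applies it to a Polish word. The function name latin1_printable_only is invented for this illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper name; the pattern is copied from the answer above.
function latin1_printable_only(string $text): string
{
    // With the 'u' modifier, \xa1-\xff refers to code points U+00A1..U+00FF,
    // so only printable ASCII and Latin-1 characters survive.
    $result = preg_replace('/[^\x20-\x7e\xa1-\xff]+/u', '', $text);

    // preg_replace() returns null on error (for example invalid UTF-8 input).
    return $result ?? '';
}

echo latin1_printable_only("Die Höhe!"), "\n"; // Die Höhe!  (punctuation is kept)
echo latin1_printable_only("Łódź"), "\n";      // ód  (Ł and ź lie outside Latin-1)
```

So this variant works for German, but letters from scripts outside Latin-1 are stripped, and ASCII punctuation is kept rather than removed.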