PHP preg_replace oddity with £ pound sign and ã

I am applying the following function:

<?php

function replaceChar($string){
    $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâäåìíîïùúûüýÿ]/", "", $string);
    return $new_string;
}

$string = "This is some text and numbers 12345 and symbols !£%^#&$ and foreign letters éèêëñòóôõöàáâäåìíîïùúûüýÿ";

echo replaceChar($string);
?>

This works fine, but if I add ã to the character class in preg_replace, like this:

$new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâãäåìíîïùúûüýÿ]/", "", $string);

$string = "This is some text and numbers 12345 and symbols !£%^#&$ and foreign letters éèêëñòóôõöàáâäåìíîïùúûüýÿã";

it conflicts with the pound sign £: instead of being removed, the pound sign is replaced with a question mark in a black square (the glyph shown for an undecodable character).

This is not critical, but does anyone know why this happens?

Thank you,

Barry

UPDATE: Thank you all. I changed the function to add the u modifier (pt2.php.net/manual/en/…), as suggested by Artefacto, and it works a treat:

function replaceChar($string){
    $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõøöàáâãäåìíîïùúûüýÿ]/u", "", $string);
    return $new_string;
}


If your string is in UTF-8, you must add the u modifier to the regex. Like this:

function replaceChar($string){
    $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâäåìíîïùúûüýÿ]/u", "", $string);
    return $new_string;
}

$string = "This is some text and numbers 12345 and symbols !£%^#&$ and foreign letters éèêëñòóôõöàáâäåìíîïùúûüýÿ";

echo replaceChar($string);
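
One caveat with the u modifier, sketched here under assumptions: PCRE then also requires the subject to be valid UTF-8, and preg_replace() returns null when it is not. The wrapper name replaceCharChecked is hypothetical, and the file is assumed to be saved as UTF-8 so that the accented literals in the pattern are UTF-8 as well.

```php
<?php
// Hypothetical wrapper around replaceChar(); assumes this file is saved as UTF-8.
function replaceCharChecked($string) {
    $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâäåìíîïùúûüýÿ]/u", "", $string);
    if ($new_string === null) {
        // With /u, a subject that is not valid UTF-8 makes preg_replace() return null.
        throw new InvalidArgumentException("preg_replace() failed, error code " . preg_last_error());
    }
    return $new_string;
}

// "\xC2\xA3" is £ and "\xC3\xA9" is é, written as escapes to make the bytes explicit.
echo replaceCharChecked("Price: \xC2\xA35 caf\xC3\xA9"), "\n"; // Price 5 café
```

If the input may arrive in another encoding (for example ISO-8859-1), it would need converting to UTF-8 before this call rather than after.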


Chances are that your string is UTF-8, but without the u modifier preg_replace() works on individual bytes, not characters.
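
The byte-level behaviour can be made visible with a small sketch. It assumes the strings are UTF-8 and uses a shortened character class ([^a-zã]) instead of the full one from the question; the variable names are only illustrative.

```php
<?php
// Written with \x escapes so the example does not depend on this file's encoding.
$pound   = "\xC2\xA3"; // £ in UTF-8
$a_tilde = "\xC3\xA3"; // ã in UTF-8: same trailing byte A3 as £

// Without /u the class [^a-zã] is really [^a-z\xC3\xA3]: a set of single bytes.
// For the subject £, byte C2 is not in the set and is removed; byte A3 is in the set and survives.
$bytewise = preg_replace("/[^a-z" . $a_tilde . "]/", "", $pound);
echo bin2hex($bytewise), "\n"; // a3 (a lone continuation byte, not valid UTF-8)

// With /u the class holds the character ã, so the whole £ is removed.
$unicode = preg_replace("/[^a-z" . $a_tilde . "]/u", "", $pound);
echo bin2hex($unicode), "\n"; // prints an empty line
```

This also explains why the original pattern appeared to work: of the accented letters listed in the question, only ã ends in byte A3, so before it was added both bytes of £ were outside the class and were removed together.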


That code is valid.

Maybe you should try a Central European character encoding:

<?php
header ('Content-type: text/html; charset=ISO-8859-2');
?>


You might want to take a look at mb_ereg_replace(). As Mark mentioned, preg_replace works at the byte level by default and does not handle multibyte character encodings well unless the u modifier is used.

Cheers,
Fabian
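
The mb_ereg_replace() suggestion above might look roughly like this. It is only a sketch: it assumes the mbstring extension is loaded, that the input is UTF-8, and that this file is saved as UTF-8; replaceCharMb is a hypothetical name.

```php
<?php
// Tell the mbstring regex functions to interpret patterns and subjects as UTF-8.
mb_regex_encoding("UTF-8");

// Hypothetical variant of replaceChar(). Unlike preg_replace(), mb_ereg_replace()
// takes the pattern without delimiters or modifiers.
function replaceCharMb($string) {
    return mb_ereg_replace("[^a-zA-Z0-9\\sçéèêëñòóôõöàáâãäåìíîïùúûüýÿ]", "", $string);
}

// "\xC2\xA3" is £ and "\xC3\xA3" is ã, written as escapes to make the bytes explicit.
echo replaceCharMb("Price: \xC2\xA35 \xC3\xA3"), "\n";
```

Since the u modifier already solved the problem with preg_replace(), this is mainly of interest when mbstring functions are used elsewhere in the same code.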
