
Charset problems with PHP

I have a problem with some PHP code that converts accented characters to unaccented ones. This code worked a year ago, but now I can't get it to work: the translation is not done correctly.

Here is the code:

<?php

echo accentdestroyer('azeméis');

    /**
     * 
     * Transforms accented characters into their unaccented equivalents.
     * @param string $string
     */
    function accentdestroyer($string) {
        $string=strtr($string,
        "()!$?: ,&+-/.ŠŒŽšœžŸ¥µÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýÿ"
        ,
        "-------------SOZsozYYuAAAAAAACEEEEIIIIDNOOOOOOUUUUYsaaaaaaaceeeeiiiionoooooouuuuyy");

        return $string;
    }

?>

I have tried saving the document as UTF-8, but that gives me something like this: "azemy�is"

Any clues on what I can do to get this working correctly?

Best Regards,


A better solution may be to transliterate those characters automatically using iconv().

As for why your function doesn't work: in a UTF-8 file, echo strlen('Š'); outputs 2, because the character is encoded as two bytes. The three-argument form of strtr() translates byte by byte, and its documentation explicitly refers to single-byte characters.

Also,

$a = 'Š';

var_dump(strtr($a, 'Š', '!')); // string(2) "!�"

So the first byte was matched and replaced, but the leftover second byte is not a valid UTF-8 sequence on its own, which is why it is displayed as �.
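If you want to keep the lookup-table approach, one option is the two-argument (array) form of strtr(), which replaces whole substrings rather than single bytes, so a multi-byte UTF-8 character can be used as a key. The following is a minimal sketch, assuming the input string is UTF-8; the function name accentdestroyer_map() and the shortened map are hypothetical and only for illustration.

```php
<?php
// Hypothetical helper: replaces accented characters using the array form of
// strtr(), which matches whole substrings (longest first) instead of bytes.
function accentdestroyer_map(string $string): string
{
    // Shortened, illustrative map; extend it with the characters you need.
    // "\u{...}" escapes (PHP 7+) keep this independent of the file encoding.
    $map = [
        "\u{0160}" => 'S',  // Š
        "\u{0161}" => 's',  // š
        "\u{00C9}" => 'E',  // É
        "\u{00E9}" => 'e',  // é
        "\u{00E7}" => 'c',  // ç
        "\u{00F1}" => 'n',  // ñ
        "\u{00DF}" => 'ss', // ß: a replacement may be longer than one character
    ];

    return strtr($string, $map);
}

echo accentdestroyer_map("azem\u{00E9}is"), "\n"; // azemeis
```

Because the keys are matched as byte sequences, this only works when the map and the input use the same encoding.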

Update

Here is a working example using iconv().

$str = 'ŒŽšœžŸ¥µÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚ';
 
$str = iconv("utf-8", "us-ascii//TRANSLIT", $str); 
 
var_dump($str); // string(37) "OEZsoezY?uAAAAAAAECEEEEIIII?NOOOOO?UU"

Some characters didn't transliterate, such as ¥, Ð and Ø (they appear as ? in the output), but most did. You can append //IGNORE to the output character set to silently discard the ones that don't transliterate.

You could also drop all non-word characters using a Unicode-aware regex with \pL.
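The regex idea could look roughly like this. It is a minimal sketch, assuming UTF-8 input and a PCRE build with Unicode support; the function name keep_letters_and_digits() is hypothetical, and keeping numbers (\pN) in addition to letters is a choice made for this example.

```php
<?php
// Hypothetical helper: removes every character that is not a Unicode letter
// (\pL) or number (\pN). The /u modifier makes PCRE treat the subject as UTF-8.
function keep_letters_and_digits(string $string): string
{
    $result = preg_replace('/[^\pL\pN]+/u', '', $string);

    // preg_replace() returns null on failure, e.g. for invalid UTF-8 input.
    return $result ?? '';
}

echo keep_letters_and_digits("azem\u{00E9}is, 2 (ol\u{00E1})!"), "\n";
```

Note that this keeps accented letters as they are and only strips punctuation, whitespace and symbols, so it complements transliteration rather than replacing it.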
