PHP encoding from ISO-8859-1 to UTF-8

<?php
mb_internal_encoding('UTF-8');
mb_language('uni');
$a=file_get_contents("http://www.ciao.de/Erfahrungsberichte/8x4_Wild_Flower_Deo_Spray__8937431");
preg_match('/dass auf dem Versch(.*)ziehen mich/Us',$a,$b);
$b=$b[1];
echo $b."\n";
echo utf8_encode($b)."\n";
echo mb_convert_encoding($b,'UTF-8','iso-8859-1')."\n";

results in

lussdeckel riesengro▒ und un▒bersehbar glitzernd ein ▒New▒ prangt. Neue Produkte
lussdeckel riesengroß und unübersehbar glitzernd ein �New� prangt. Neue Produkte
lussdeckel riesengroß und unübersehbar glitzernd ein �New� prangt. Neue Produkte

The page's HTML source declares "iso-8859-1" in a meta tag. German umlauts come out fine, but why are the quotes around "New" not converted correctly? In the PHP manual there is a function fix_latin. When using this function the quotes are also converted correctly!?

PS: the same happens with the European currency symbol € (EUR). It is also converted wrongly (except with the fix_latin function), but why?


Euro sign is not in ISO-8859-1. (ISO-8859-15 was created for that purpose.)
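
A likely explanation, offered as an assumption rather than something verified against that page: content labelled "iso-8859-1" is often really Windows-1252, which puts the curly quotes and the euro sign in the byte range 0x80-0x9F, where ISO-8859-1 only has control characters. The sketch below uses made-up sample bytes ($raw is hypothetical, not taken from the page) to show the same input decoded both ways with mb_convert_encoding().

```php
<?php
// Hypothetical sample: bytes as a page labelled ISO-8859-1, but actually
// written in Windows-1252, might deliver them.
// 0xDF = ß, 0xFC = ü (identical in both encodings)
// 0x93 / 0x94 = curly quotes and 0x80 = euro sign (Windows-1252 only)
$raw = "gro\xDF ein \x93New\x94 f\xFCr 5 \x80";

// Treating the bytes as ISO-8859-1 maps 0x80-0x9F to C1 control characters.
$asLatin1 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');

// Treating the same bytes as Windows-1252 yields the intended characters.
$asCp1252 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

echo bin2hex($asLatin1), "\n"; // hex dump, since control characters do not print
echo $asCp1252, "\n";
```

If the Windows-1252 result looks right for the real page, passing 'Windows-1252' as the source encoding in the original script would be the thing to try. Umlauts are unaffected either way because both encodings agree on 0xA0-0xFF.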

Best I recollect, mb_convert_encoding() will not transliterate characters. Consider using iconv() instead. And/or be sure to set the content-type header as needed.

In the next PHP version there will also be the Transliterator class, which wraps ICU.
