
Character encoding issues - UTF-8 / Issue while transmitting data on the internet?

I have a client that builds the data and sends it like this:

// $booktitle = "Comí habitación bailé"

$xml_obj = new DOMDocument('1.0', 'utf-8');

// node created with booktitle and added to xml_obj 
// NO htmlentities / other transformations done

$returnHeader = drupal_http_request($url, $headers = array("Content-Type:  text/xml; charset=utf-8"), $method = 'POST', $data = $xml_data, $retry = 3);

When I receive it on my end (via that drupal_http_request) and run htmlentities on it, I get the following:

 Com&Atilde;&shy; habitaci&Atilde;&sup3;n bail&Atilde;&copy;

Which when displayed looks like gibberish:

 Comí Habitación Bailé

What is going wrong?


Edit 1)

<?php
$title = "Comí habitación bailé";
echo "title=$title\n";
echo 'encoding is '.mb_detect_encoding($title);
$heutf8 = htmlentities($title, ENT_COMPAT, "UTF-8");
echo "heutf8=$heutf8\n";
?>

Running this test script on a Windows machine and redirecting to a file shows:

title=Comí habitación bailé
encoding is UTF-8heutf8=

Running this on a Linux system:

title=Comí habitación bailé
encoding is UTF-8PHP Warning:  htmlentities(): Invalid multibyte sequence in argument in /home/testaccount/public_html/test2.php on line 5
heutf8=


I don't think you should encode the entities with htmlentities just to output the text correctly (as stated in the comments, you should use htmlspecialchars to avoid cross-site scripting). Just set the correct header and meta tag, and echo the values normally:

<?php
 header ('Content-type: text/html; charset=utf-8');
 ?>
 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
 </head>
 <body>

 </body>
 </html>
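
To make the suggestion above concrete, here is a minimal sketch of a complete page that escapes the value with htmlspecialchars and declares UTF-8 in both the HTTP header and the meta tag. The helper name render_title_page and the sample string are assumptions for illustration, not part of the original answer. The accented letters are written as "\u{...}" escapes (PHP 7+) so the example does not depend on how the source file itself is saved.

```php
<?php
// Hypothetical helper: builds the page; $title stands in for the received book title.
function render_title_page(string $title): string
{
    // Escape only HTML-significant characters; UTF-8 letters pass through untouched.
    $safe = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

    return "<html>\n<head>\n"
        . "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">\n"
        . "</head>\n<body>\n<p>$safe</p>\n</body>\n</html>\n";
}

// Declare the charset in the HTTP header before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Sample value: "Comí habitación bailé" as UTF-8.
echo render_title_page("Com\u{ED} habitaci\u{F3}n bail\u{E9}");
```

With this approach the browser is told how to decode the bytes, so no entity conversion of the accented letters is needed.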


In PHP versions before 5.4, htmlentities interprets its input as ISO-8859-1 by default; are you passing UTF-8 for the charset parameter?
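
The following sketch shows what happens when the charset argument and the actual bytes disagree. It writes the bytes explicitly so the result does not depend on the editor's file encoding. One possible explanation for the empty output in Edit 1 of the question is that the test file was not saved as UTF-8, which corresponds to the last case below; that is an assumption, since the file's real bytes are not shown. The helper is_valid_utf8 is a hypothetical name introduced here.

```php
<?php
// "Comí" as UTF-8 bytes (í = C3 AD) and as ISO-8859-1 bytes (í = ED).
$utf8   = "Com\xC3\xAD";
$latin1 = "Com\xED";

// preg_match with the /u modifier fails on byte strings that are not valid UTF-8.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

var_dump(is_valid_utf8($utf8));    // bool(true)
var_dump(is_valid_utf8($latin1));  // bool(false)

// Match: UTF-8 bytes, UTF-8 charset argument.
echo htmlentities($utf8, ENT_COMPAT, 'UTF-8'), "\n";       // Com&iacute;

// Mismatch: UTF-8 bytes read as ISO-8859-1 yield two entities per character.
echo htmlentities($utf8, ENT_COMPAT, 'ISO-8859-1'), "\n";  // Com&Atilde;&shy;

// Mismatch the other way: ISO-8859-1 bytes are invalid UTF-8, so the result is empty.
var_dump(htmlentities($latin1, ENT_COMPAT, 'UTF-8'));      // string(0) ""
```

Checking the raw bytes with bin2hex($title) is a quick way to see which of these cases applies to the data actually received.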


Try passing the header information as a key/value array.

Something like:

$headers = array("Content-Type" => "text/xml; charset=utf-8");
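
To illustrate why the array shape can matter, here is a minimal sketch of a client that builds each header line from a name => value pair. The function build_header_lines is a hypothetical stand-in written for this example, not Drupal's actual code; it only assumes that the client joins each array key and value with ": ".

```php
<?php
// Hypothetical stand-in for an HTTP client that expects headers as name => value pairs.
function build_header_lines(array $headers): array
{
    $lines = array();
    foreach ($headers as $name => $value) {
        // The array key is used as the header name.
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Numerically indexed array: the key 0 ends up where the header name should be.
print_r(build_header_lines(array("Content-Type: text/xml; charset=utf-8")));

// Associative array: the header line comes out as intended.
print_r(build_header_lines(array("Content-Type" => "text/xml; charset=utf-8")));
```

Under that assumption, the first call produces a malformed line beginning with "0: ", so the receiving server may never see the declared charset.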
