
How can I encode and decode IDN URLs in PHP?

I'm building a site to check, register, etc. domains, and I have to make it IDN compliant. Right now I have something like this:

echo $domain;
$domain = idn_to_ascii($domain);
echo $domain;
$domain = idn_to_utf8($domain);
echo $domain;

and I'm getting this:

testing123ásd123 xn--testing123sd123-wjb testing123ĂĄsd123

As you can see, the decoded string isn't the same as the original. I also tried a class from http://phlymail.com/en/downloads/idna/download/ and I'm getting the same results.

I have also tried:

$charset="UTF-8";
echo $domain;       
$domain = idn_to_ascii($domain, $charset);
echo $domain;
$domain = idn_to_utf8($domain);
echo $domain;

and I got exactly the same result (except that the encoded string is slightly different).

Any ideas?

EDIT: Problem solved, thanks to this question: "Problem in converting string to puny code (in PHP, using phlyLabs's punycode string converter)". The original string was in ISO-8859-2 and the decoded one was in UTF-8. Now I need to find out how to convert it back to ISO-8859-2, but Google can help me with that. Any mods around? What should I do with the question: close it, delete it, or leave it as it is?


As you already point out, ĂĄ appears to be the UTF-8 representation of the á character as displayed in a document that is not UTF-8 (here, ISO-8859-2).

You can use iconv() to convert between charsets. However, be aware that non-Unicode charsets cannot represent the full set of international characters, so any character missing from the target charset has to be converted to an HTML entity. E.g.:

<?php

$domain = idn_to_utf8($domain);
echo htmlentities($domain, ENT_COMPAT, 'UTF-8');

?>
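
For the charset conversion itself, here is a minimal sketch of what the iconv() calls might look like, assuming the iconv extension is enabled and the page is served as ISO-8859-2. The variable names are illustrative only; the last call reproduces the garbled pair from the question by reading UTF-8 bytes as ISO-8859-2.

```php
<?php

// "á" encoded as UTF-8 is the two bytes 0xC3 0xA1.
$utf8 = "testing123\xC3\xA1sd123";

// Convert to ISO-8859-2, where "á" is the single byte 0xE1.
$latin2 = iconv('UTF-8', 'ISO-8859-2', $utf8);

// Convert back; the round trip is lossless for characters
// that exist in ISO-8859-2.
$roundTrip = iconv('ISO-8859-2', 'UTF-8', $latin2);

// Reading the UTF-8 bytes of "á" as if they were ISO-8859-2 gives
// the garbled pair "ĂĄ": 0xC3 becomes "Ă" and 0xA1 becomes "Ą".
$garbled = iconv('ISO-8859-2', 'UTF-8', "\xC3\xA1");

var_dump($latin2 === "testing123\xE1sd123");
var_dump($roundTrip === $utf8);
var_dump($garbled === "\xC4\x82\xC4\x84");
```

As far as I know, iconv() fails on characters the target charset cannot represent unless //TRANSLIT or //IGNORE is appended to the target charset name, so check the return value before using it.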

In any case, it'd probably be easier to just use UTF-8 for the whole project.
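
If you go the all-UTF-8 route, a round trip could look roughly like the sketch below. It assumes PHP 7.1 or later with the intl extension, that the input string is already UTF-8, and that the page is served with a UTF-8 charset header. roundTripIdn() is a hypothetical helper written for this example, not part of PHP or of the original code.

```php
<?php

// Hypothetical helper: converts a UTF-8 domain to Punycode and back.
// Returns null if the intl extension is missing or a conversion fails.
function roundTripIdn(string $utf8Domain): ?array
{
    if (!function_exists('idn_to_ascii')) {
        return null; // intl extension is not available
    }

    $ascii = idn_to_ascii($utf8Domain, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    if ($ascii === false) {
        return null;
    }

    $unicode = idn_to_utf8($ascii, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    if ($unicode === false) {
        return null;
    }

    return ['ascii' => $ascii, 'unicode' => $unicode];
}

// The source file and the input are assumed to be UTF-8 ("á" = 0xC3 0xA1).
$result = roundTripIdn("testing123\xC3\xA1sd123");

if ($result !== null) {
    // Tell the browser the output is UTF-8 so "á" is not shown as "ĂĄ".
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=UTF-8');
    }
    echo htmlspecialchars($result['ascii'], ENT_QUOTES, 'UTF-8'), "\n";
    echo htmlspecialchars($result['unicode'], ENT_QUOTES, 'UTF-8'), "\n";
}
```

With every layer (source file, form input, database, output header) on UTF-8, no iconv() step is needed between the IDN functions and the page.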
