php urldecode utf8 encoding question

2023-02-16 04:02

When I try to request a URL whose query string contains a urlencoded value (a Cyrillic word):

http://example.com/?action=search&q=%E0%E2%F2%EE%EC%EE%E1%E8%EB%FC

after decoding:

echo urldecode($_GET['q']); // it prints: ���������

So I need to convert it to UTF-8 (because my whole application works with UTF-8) via:

mb_convert_encoding($_GET['q'], "UTF-8", "windows-1251");

It helps, but my question is:

Who or what says it should be EXACTLY "windows-1251"? Where does it come from? If I use other languages, how can I determine the appropriate encoding? Where is the magic?

(update): the page encoding is UTF-8.
(update): actually, urldecode($_GET['q']) is not even needed; it looks like Apache plus the PHP module do the decoding themselves. I still can't understand where this is configured.

The answer is that you can't know for sure, as it may change from request to request, especially if the value is not always submitted from a form but is sometimes sent via Ajax or typed directly into the address bar by the user.

I work with an application in Polish. The application uses the ISO-8859-2 code page, and all HTML output is served in this encoding.

The application receives requests in two different encodings, depending on the context of the request:

If the request is the result of a form submit, the encoding is the same as that of the HTML page containing the submitted form. I think this could be changed with the accept-charset attribute of the form element, but I have not tried it.
If the request is made with Ajax, it is always UTF-8 (at least in Chrome and Firefox, as our client uses only those browsers).
If the request is entered manually as a URL, it is usually UTF-8, but if it came from a bookmark or something similar, it might be in another encoding (depending on how the bookmark was created).
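
The accept-charset idea from the first case could look roughly like this. It is only a sketch, not code from the application described above: the function name render_search_form is hypothetical, the field names are borrowed from the URL in the question, and it assumes the browser honours accept-charset when the page itself is served in another encoding.

```php
<?php
// Hypothetical helper: builds a search form whose fields the browser is
// asked to submit as UTF-8, whatever encoding the page itself uses.
function render_search_form(): string
{
    return '<form method="get" action="/" accept-charset="UTF-8">'
         . '<input type="hidden" name="action" value="search">'
         . '<input type="text" name="q">'
         . '<input type="submit" value="Search">'
         . '</form>';
}

echo render_search_form();
```

This only covers form submits; Ajax requests and hand-typed URLs are unaffected by the attribute.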

So there is really no way to know for sure. If you can, always use UTF-8. Otherwise use charset detection: check whether the input is UTF-8 and, if not, fall back to the most probable encoding for the language your application uses.

I use the following code:

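<?php
// Sample Polish text; this source file is assumed to be saved as UTF-8.
$t = 'zażółć gęślą jaźń';
// Candidates are tried in order; strict mode (third argument) rejects invalid byte sequences.
echo mb_detect_encoding($t, 'UTF-8,ISO-8859-2', true);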
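
The "check for UTF-8, otherwise fall back" rule can also be written as a small conversion helper. This is a minimal sketch: to_utf8 is a hypothetical name, and the fallback encoding is an assumption you have to choose per application (ISO-8859-2 for the Polish case above, windows-1251 for the question).

```php
<?php
// Hypothetical helper: returns $raw as UTF-8. $fallback is an assumption
// about the most likely legacy encoding for the application's language.
function to_utf8(string $raw, string $fallback = 'ISO-8859-2'): string
{
    // Input that is already valid UTF-8 is passed through untouched.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }
    // Otherwise treat the bytes as the fallback encoding and convert them.
    return mb_convert_encoding($raw, 'UTF-8', $fallback);
}

// "q" as in the question; PHP has already percent-decoded $_GET values.
$q = to_utf8($_GET['q'] ?? '', 'Windows-1251');
```

Keep in mind this is a heuristic: a short legacy-encoded string can occasionally happen to be valid UTF-8, in which case it is passed through unchanged.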

Best regards, SWilk

This is neither an Apache nor a mod_php issue. PHP decodes the URL encoding automatically, but it does not convert the text between character encodings, so there is nothing to worry about on that side.

Judging from this:

when typing example.com/?action=search&q=автомобиль in Firefox 3, it is automatically converted to example.com/?action=search&q=%E0%E2%F2%EE%EC%EE%E1%E8%EB%FC

it is more of a browser or operating system issue.

It seems that your OS encoding is single-byte and the browser urlencodes your single-byte string.
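
To see where "windows-1251" comes from, you can percent-encode the same word in both encodings and compare the result with the URL from the question. A minimal sketch, assuming the mbstring extension is available; the variable names are only for illustration:

```php
<?php
// The word from the question, built from its UTF-8 percent-encoding so the
// example does not depend on how this source file is saved.
$utf8 = rawurldecode('%D0%B0%D0%B2%D1%82%D0%BE%D0%BC%D0%BE%D0%B1%D0%B8%D0%BB%D1%8C');

// The same word as windows-1251 bytes (one byte per letter).
$cp1251 = mb_convert_encoding($utf8, 'Windows-1251', 'UTF-8');

echo rawurlencode($utf8), "\n";   // two bytes per Cyrillic letter
echo rawurlencode($cp1251), "\n"; // %E0%E2%F2%EE%EC%EE%E1%E8%EB%FC
```

The second line matches the query string from the question, so the browser sent the bytes as windows-1251, not UTF-8.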

You should keep UTF-8 and set your page's charset to UTF-8 using the appropriate Content-Type header:

header('Content-type: text/html; charset=utf-8');

When you type non-ASCII characters directly into the URL bar, the browser seems to convert them to UTF-8 automatically and then percent-encode them. I have no hard data on this, but the behaviour makes sense. Related question here: Unicode characters in URLs

Your page is using windows-1252 or some other single-byte character set as its output encoding, which is why you need to convert the character data first.

You could change your page's output encoding to UTF-8 to save yourself that step, but that may have other consequences (like the need to use multi-byte string functions and/or a different encoding for database output, etc.)
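
As a small illustration of why multi-byte string functions matter once the data is UTF-8, here is a sketch comparing byte count and character count for the word from the question (mbstring assumed to be available):

```php
<?php
// "автомобиль" in UTF-8, built from its percent-encoding so the example
// does not depend on how this source file is saved.
$word = rawurldecode('%D0%B0%D0%B2%D1%82%D0%BE%D0%BC%D0%BE%D0%B1%D0%B8%D0%BB%D1%8C');

$bytes = strlen($word);             // counts bytes: 20
$chars = mb_strlen($word, 'UTF-8'); // counts characters: 10

echo $bytes, ' bytes, ', $chars, " characters\n";
```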

windows-1251 is an 8-bit character encoding designed to cover languages that use Cyrillic alphabets (source: Wikipedia).

You might have set the charset to windows-1251 in your web page.

I also ran into this problem. I use Adobe Dreamweaver CS4 (a non-English version).

I solved it as follows:

Add header('Content-type: text/html; charset=utf-8'); at the top of the PHP page file.
IMPORTANT: in Adobe Dreamweaver, open Page Properties from the top menu, Modify (M) -> Page Properties (P), choose Title/Encoding and manually change the encoding to Unicode (UTF-8).

(Sorry, these menu names are translated into English and may not match the real ones.)
