URL encoding in the browser address bar
When I put non-alphanumeric symbols in the browser address bar, they get URL-encoded. For example, http://ru2.php.net/manual-lookup.php?pattern=привет turns into http://ru2.php.net/manual-lookup.php?pattern=%EF%F0%E8%E2%E5%F2.
The question is: what do those percent-prefixed pairs of hex digits mean?
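They are the bytes of the Windows-1251 (CP1251) encoding of the Cyrillic text; each percent sign is followed by one byte written as two hex digits. Since there are only six of them, they can't be UTF-8: every Cyrillic letter takes two bytes in UTF-8, so six letters would need 12 bytes.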
The code chart for CP1251 can be found here: http://en.wikipedia.org/wiki/Windows-1251.
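To check the byte counts yourself, here is a minimal Python sketch. It assumes Python 3 and its built-in "cp1251" codec; the variable names are only for illustration.

```python
text = "привет"

# Encode the same six letters in both encodings.
cp1251_bytes = text.encode("cp1251")
utf8_bytes = text.encode("utf-8")

# Render each byte as two uppercase hex digits, as in the URL.
cp1251_hex = " ".join(f"{b:02X}" for b in cp1251_bytes)
utf8_hex = " ".join(f"{b:02X}" for b in utf8_bytes)

print(len(cp1251_bytes), cp1251_hex)  # one byte per letter
print(len(utf8_bytes), utf8_hex)      # two bytes per letter
```

The CP1251 line shows the same six bytes as in the encoded URL, while the UTF-8 line is twice as long.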
Just as %20 stands for a space (20 is the hex code for a space), each of the Cyrillic characters has its own numeric value in CP1251, expressible as two hex digits.
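The percent-encoding itself can be reproduced with Python's standard urllib.parse module. This is only a sketch of what the browser appears to be doing in this case, assuming it picked Windows-1251 for the query string; the variable names are illustrative.

```python
from urllib.parse import quote, unquote

# Percent-encode the word using CP1251 bytes instead of the UTF-8 default.
encoded = quote("привет", encoding="cp1251")

# Decode the query value from the URL back into text.
decoded = unquote("%EF%F0%E8%E2%E5%F2", encoding="cp1251")

# A space is byte 0x20, so it becomes %20.
space = quote(" ")

print(encoded)
print(decoded)
print(space)
```

Decoding the same string as UTF-8 would not give readable text, because these six bytes are not a valid UTF-8 sequence.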