
How to encode a URL with accented characters in JavaScript?

I am having a strange issue with IE: if I pass "scénario" in the URL, it does not work, although it works perfectly in other browsers.

In IE the URL comes up as:

.../search.aspx?keyword=sc%c3%83%c2%a9nario

In Firefox the URL comes up as:

.../search.aspx?keyword=sc%C3%A9nario

The URL breaks in IE but works fine in Firefox. Do I have to do URL decoding to take care of this in IE?


Although you did not say how you "passed" the string, I can tell you what happened.

The character é has code point U+00E9. In UTF-8 it is encoded as two bytes, C3 A9, so the correct way to write "scénario" in a URL is:

sc%C3%A9nario
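
As a quick check, the standard `encodeURIComponent` function percent-encodes a string as UTF-8. This is only a minimal sketch; the variable names are illustrative, and the question does not say how the keyword was actually built.

```javascript
// "\u00E9" is é (U+00E9); written as an escape so the source file's own
// encoding cannot affect the result.
const keyword = "sc\u00E9nario";

// encodeURIComponent converts the string to UTF-8 bytes and percent-encodes
// every byte that is not an unreserved URL character.
const encoded = encodeURIComponent(keyword);

console.log(encoded); // sc%C3%A9nario
```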

Now, what would happen if you took this string, treated each individual byte as if it were a character, and applied UTF-8 encoding a second time? You would get:

  • s -> s
  • c -> c
  • %C3 would be interpreted as the char with codepoint C3, namely Ã, which in UTF-8 is C3 83.
  • %A9 would be interpreted as the char with codepoint A9, namely ©, which in UTF-8 is C2 A9.
  • n -> n
  • etc.

Put together, that gives sc%C3%83%C2%A9nario, which corresponds exactly (apart from letter case) to what you saw in IE.
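
The steps above can be reproduced in a few lines. This is a sketch of the suspected mechanism, not necessarily what IE or the page actually did; `bytesAsLatin1` is a hypothetical helper written for this illustration.

```javascript
// Hypothetical helper: turn every %XX escape into the character whose code
// point equals that byte, i.e. misread the UTF-8 bytes as Latin-1 text.
function bytesAsLatin1(percentEncoded) {
  return percentEncoded.replace(/%([0-9A-Fa-f]{2})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// Step 1: the correct, single UTF-8 encoding.
const once = encodeURIComponent("sc\u00E9nario"); // "sc%C3%A9nario"

// Step 2: the bytes C3 and A9 are mistaken for the characters U+00C3 and U+00A9.
const misread = bytesAsLatin1(once);

// Step 3: the misread string is encoded as UTF-8 a second time.
const twice = encodeURIComponent(misread);

console.log(twice); // sc%C3%83%C2%A9nario
```

Apart from letter case, the final value is the same as the keyword in the IE URL.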

I can't say for sure how this happened, because the question does not give enough background. What is clear is that the string "scénario" was encoded into a UTF-8 byte string, and those bytes were then encoded again on the mistaken assumption that they were characters in the Windows-1252 or Latin-1 encoding.

You need to look into how your string got "encoded twice."
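
While tracking that down, one way to confirm that a value really was encoded twice is to reverse the two steps and see whether the original text comes back. The sketch below assumes an environment that provides `TextDecoder` (current browsers and Node.js); `undoDoubleEncoding` is a hypothetical name, and this is a diagnostic aid rather than a fix for the underlying cause.

```javascript
// Hypothetical diagnostic: reverse a UTF-8 -> Latin-1 -> UTF-8 double encoding.
function undoDoubleEncoding(value) {
  // Undo the outer encoding; for the IE keyword this yields the misread text.
  const misread = decodeURIComponent(value);

  // Each character must stand for a single byte (code point 0..255).
  const bytes = Uint8Array.from(misread, (ch) => {
    const code = ch.charCodeAt(0);
    if (code > 0xff) {
      throw new RangeError("not a byte-valued character: " + ch);
    }
    return code;
  });

  // Decode those bytes as UTF-8; fatal: true makes invalid input throw
  // instead of silently producing replacement characters.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

const repaired = undoDoubleEncoding("sc%c3%83%c2%a9nario");
console.log(repaired === "sc\u00E9nario"); // true
```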
