Encoding problem between jQuery and Java

My encoding is set to ISO-8859-1.

I'm making an AJAX call using jQuery.ajax to a servlet. The URL (after it has been serialized by jQuery) ends up looking like this:

https://myurl.com/countryAndProvinceCodeServlet?action=getProvinces&label=%C3%85land+Islands

The actual label value is Åland Islands. When this comes to the servlet, the value that I receive is:

Ã\u0085land Islands

But this is not what I want. I'd like it to get decoded to Åland Islands. I've tried many things (setting scriptCharset, converting the string using getBytes()), but nothing seems to work.


It is an unfortunate part of the Servlet specification that the encoding used to decode query parameters is not settable by servlets themselves. Instead it is left as a configuration matter for the server.

This makes deployment of internationalised web sites an enormous pain, especially because the default encoding chosen by the Servlet spec is not the most-likely-to-be-useful UTF-8, but ISO-8859-1. (Actual ISO-8859-1, not even Windows code page 1252, which is the encoding browsers will really submit when told to use ISO-8859-1!)
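To make the effect concrete, here is a minimal sketch that decodes the label value from the question's URL twice, once as ISO-8859-1 and once as UTF-8. It uses URLDecoder only to imitate what a container does with the query string; the class name DecodeComparison is made up for illustration and is not part of any server.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: imitates the container's query decoding with URLDecoder.
final class DecodeComparison {

    // Decodes a percent-encoded query value using the named charset.
    static String decode(String raw, String charset) throws UnsupportedEncodingException {
        return URLDecoder.decode(raw, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String raw = "%C3%85land+Islands"; // the label value from the question's URL

        // Bytes C3 85 read as ISO-8859-1 become two characters: U+00C3 and U+0085.
        String asLatin1 = decode(raw, "ISO-8859-1");
        // The same bytes read as UTF-8 become the single character U+00C5 (Å).
        String asUtf8 = decode(raw, "UTF-8");

        System.out.println(asLatin1.equals("\u00C3\u0085land Islands")); // true
        System.out.println(asUtf8.equals("\u00C5land Islands"));         // true
    }
}
```

The first result is exactly the mangled value reported in the question, which suggests the container there is decoding the query string as ISO-8859-1.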

So how to reconfigure this is a server problem. For Tomcat, it means editing server.xml and setting the URIEncoding="UTF-8" attribute on the relevant <Connector> element.

The alternative approach, if you don't have access to the server config, is to take each submitted parameter name/value and re-encode them. Luckily ISO-8859-1 preserves every byte submitted as a Unicode code point of the same number, so to convert the string as if it had been interpreted properly as UTF-8 in the first place, you can simply encode each String to a byte array using ISO-8859-1, and then decode the bytes back to a String using UTF-8. Of course if someone then re-configures the server to use UTF-8 you've got a problem...
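The round trip described above could look roughly like the following. This is a sketch, not a drop-in fix: the class and method names (ParameterRepair, reinterpretAsUtf8) are invented for illustration, and it assumes the container really did decode the parameter as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: undoes an ISO-8859-1 decode of bytes that were really UTF-8.
final class ParameterRepair {

    static String reinterpretAsUtf8(String misdecoded) {
        // ISO-8859-1 maps each char U+0000..U+00FF back to the byte of the same value,
        // so this recovers the bytes the browser originally sent.
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        // Decode those bytes the way they were meant to be read.
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String received = "\u00C3\u0085land Islands"; // the value the servlet saw in the question
        String repaired = reinterpretAsUtf8(received);
        System.out.println(repaired.equals("\u00C5land Islands")); // true
    }
}
```

In a servlet this would be applied to each value, for example reinterpretAsUtf8(request.getParameter("label")). As noted above, if the server is later switched to UTF-8, the same call will corrupt values that were already correct.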


Bobince already went into detail, so I'll skip that part. If you really have no control over the container-managed URI encoding, your best bet is to take the URI decoding into your own hands. You can obtain the raw GET query string in a servlet with HttpServletRequest#getQueryString(). Then it's a matter of splitting it and URL-decoding the parts as UTF-8 yourself, using the usual String methods and URLDecoder#decode().

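// Requires: import java.net.URLDecoder;
String query = request.getQueryString(); // null when the URL has no query string
for (String parameter : (query == null ? "" : query).split("&")) {
    if (parameter.isEmpty()) continue; // skip empty segments such as "a=1&&b=2"
    String[] pair = parameter.split("=", 2); // limit 2 keeps any '=' inside the value
    String name = URLDecoder.decode(pair[0], "UTF-8");
    String value = pair.length > 1 ? URLDecoder.decode(pair[1], "UTF-8") : ""; // "name" without "=value"
    // ...
}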

Needless to say, keep in mind that this isn't a solution, but a workaround.
