Help displaying characters in different languages in Java

I'm sending a request to a web service, and the response that I receive could be in any language: English, French, Arabic, Japanese, etc.

I'm having a problem displaying the different languages correctly, however. For example, I am receiving:

translation: ä½ å¥½

Instead of:

translation: 你好

I'm guessing that I'm not encoding correctly in my HTTP request/response. Can someone tell me what I may be doing wrong? Here is the code where I receive the HTTP response:

        baos = new ByteArrayOutputStream();

        InputStream responseData = connection.openInputStream();
        byte[] buffer = new byte[20000];
        int bytesRead = 0;
        while ((bytesRead = responseData.read(buffer)) > 0) {
            baos.write(buffer, 0, bytesRead);
        }
        System.out.println(new String(baos.toByteArray()));

Thanks!


When you print the result at the end, pass the charset explicitly:

System.out.println(new String(baos.toByteArray(), Charset.forName("UTF-8")));


new String(baos.toByteArray());

is going to interpret the byte[] with your platform's default character set. From the documentation:

Constructs a new String by decoding the specified array of bytes using the platform's default charset.

Those bytes need to be interpreted by a decoder that matches the character set the server is sending you. Often this is specified in the charset parameter of the HTTP Content-Type header.
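To make the effect concrete, here is a minimal sketch that decodes the same UTF-8 bytes twice: once with ISO-8859-1 and once with UTF-8. ISO-8859-1 is only an assumed stand-in for whatever the platform default happens to be on the asker's machine, and the class and method names are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Decodes the bytes with a single-byte charset (assumed here to play the
    // role of the platform default): every byte becomes one character.
    public static String decodeAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Decodes the bytes with the charset they were actually encoded in.
    public static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u4f60\u597d" is 你好; UTF-8 encodes each character as three bytes.
        byte[] utf8 = "\u4f60\u597d".getBytes(StandardCharsets.UTF_8);

        String wrong = decodeAsLatin1(utf8);
        String right = decodeAsUtf8(utf8);

        System.out.println(utf8.length);    // 6 bytes on the wire
        System.out.println(wrong.length()); // 6 garbage characters
        System.out.println(right.length()); // 2 characters
        System.out.println(right.equals("\u4f60\u597d")); // true
    }
}
```

The six characters produced by the wrong decoder are the same ones shown in the question, which suggests the response itself is valid UTF-8 and only the decoding step is off.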


I think you should use the toString(String charsetName) method of ByteArrayOutputStream.

Something like this:

System.out.println(baos.toString("UTF-8"));

Of course you have to use the same encoding on both the server and the client.

I hope it helps.


I think you need to use InputStreamReader and OutputStreamWriter. Those classes let you specify the encoding. For example:

Writer out = new BufferedWriter(new OutputStreamWriter(System.out, "UTF-8"));
out.write(something);
out.flush(); // BufferedWriter holds the text until it is flushed
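The reading side can be handled the same way. The following is a minimal sketch, assuming the response body is UTF-8; the class name and the readAll helper are hypothetical, and a ByteArrayInputStream stands in for the connection's input stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ReadAsUtf8 {
    // Reads the whole stream as text, decoding the bytes as UTF-8.
    public static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            // read() returns -1 at end of stream
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for connection.openInputStream(): UTF-8 bytes of 你好
        byte[] body = "\u4f60\u597d".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(body));
        System.out.println(text.equals("\u4f60\u597d")); // true
    }
}
```

Because the reader decodes as it goes, multi-byte characters that straddle a buffer boundary are still assembled correctly.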


In

System.out.println(new String(baos.toByteArray()));

you need to supply the correct Charset, using new String(byte[] bytes, Charset charset).

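You need to know which Charset to use; ideally the server sends it with the response. If you omit it, Java falls back to the platform default charset, which is not necessarily UTF-8. A legacy single-byte default such as ISO-8859-1 cannot represent Chinese, Japanese, Arabic, Hebrew, etc., whereas UTF-8 can encode all of them.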
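If the server does send the charset, it usually arrives as a parameter of the Content-Type header. Below is a rough sketch of pulling it out; charsetOf is a hypothetical helper, the UTF-8 fallback is an assumption rather than anything the HTTP specification mandates, and the header value in main is made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ContentTypeCharset {
    // Hypothetical helper: returns the charset named in a Content-Type value
    // such as "text/plain; charset=UTF-8". Falls back to UTF-8 (an assumption)
    // when the parameter is missing or names an unknown charset.
    public static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // case-insensitive check for the "charset=" prefix
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // illegal or unsupported charset name
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        byte[] body = "\u4f60\u597d".getBytes(StandardCharsets.UTF_8);
        Charset cs = charsetOf("text/plain; charset=UTF-8");
        String text = new String(body, cs);
        System.out.println(text.equals("\u4f60\u597d")); // true
    }
}
```

The resulting Charset can then be passed to new String(baos.toByteArray(), cs) in place of a hard-coded name.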
