Help displaying characters from different languages in Java
I'm sending a request to a web service, and the response that I receive could be in any language: English, French, Arabic, Japanese, etc.
I'm having a problem displaying the different languages correctly, however. For example, I am receiving:
translation: ä½ å¥½
Instead of:
translation: 你好
I'm guessing that I'm not encoding correctly in my HTTP request/response. Can someone tell me what I may be doing wrong? Here is the code where I receive the HTTP response:
baos = new ByteArrayOutputStream();
InputStream responseData = connection.openInputStream();
byte[] buffer = new byte[20000];
int bytesRead = 0;
while ((bytesRead = responseData.read(buffer)) > 0) {
baos.write(buffer, 0, bytesRead);
}
System.out.println(new String(baos.toByteArray()));
Thanks!
When you print at the end, try specifying the charset explicitly:
System.out.println(new String(baos.toByteArray(), Charset.forName("UTF-8")));
new String(baos.toByteArray());
is going to interpret the byte[] with your platform's default character set. From the documentation:
Constructs a new String by decoding the specified array of bytes using the platform's default charset.
Those bytes need to be interpreted by a decoder that is compatible with the character set the server is sending you. Often this is specified in the HTTP Content-Type header.
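As a rough illustration of reading the charset from that header, here is a minimal sketch. The class name `ContentTypeCharset` and the method `fromContentType` are hypothetical helpers, not part of any library, and the fallback charset is an assumption you would choose for your own service. It assumes Java SE 7 or later; how you obtain the header value depends on your connection API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ContentTypeCharset {
    // Hypothetical helper: returns the charset named in a Content-Type value,
    // or the fallback when the header is missing, carries no charset
    // parameter, or names a charset this JVM does not support.
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        // Parameters follow the media type, separated by semicolons.
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive match on the "charset=" prefix.
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim();
                // The value may be wrapped in double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.ISO_8859_1; // assumed fallback
        System.out.println(fromContentType("text/html; charset=UTF-8", fallback)); // prints UTF-8
        System.out.println(fromContentType("text/plain", fallback));               // prints ISO-8859-1
    }
}
```

The resulting `Charset` can then be passed to `new String(baos.toByteArray(), charset)` instead of relying on the platform default.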
I think you should use the toString(String charsetName) method of ByteArrayOutputStream.
Something like this:
System.out.println(baos.toString("UTF-8"));
Of course you have to use the same encoding on both the server and the client.
I hope it helps.
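To see why the encodings have to match, the following sketch reproduces the garbled output from the question. It assumes the server sent UTF-8 and the client decoded with a single-byte charset (ISO-8859-1 here; the asker's actual platform default is not known). The class name `MojibakeDemo` is made up for the example, and the Chinese text is written with Unicode escapes so the source file's own encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // "\u4F60\u597D" is the two-character greeting from the question.
    static final String ORIGINAL = "\u4F60\u597D";

    // Encodes the text as UTF-8 (what the server is assumed to send)
    // and decodes the bytes with the given charset (what the client does).
    static String roundTrip(Charset clientCharset) {
        byte[] wire = ORIGINAL.getBytes(StandardCharsets.UTF_8);
        return new String(wire, clientCharset);
    }

    public static void main(String[] args) {
        String wrong = roundTrip(StandardCharsets.ISO_8859_1);
        String right = roundTrip(StandardCharsets.UTF_8);

        // Each UTF-8 byte became its own character in the wrong decoding.
        System.out.println(wrong.length());         // prints 6
        System.out.println(right.length());         // prints 2
        System.out.println(right.equals(ORIGINAL)); // prints true
    }
}
```

The six characters in the wrong decoding are ä, ½, a no-break space, å, ¥ and ½, which matches the output shown in the question.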
I think you need to use InputStreamReader and OutputStreamWriter. Those classes let you specify the encoding. For example:
Writer out = new BufferedWriter(new OutputStreamWriter(System.out, "UTF-8"));
out.write(something);
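The reading side could look something like the sketch below, which decodes the response while reading it instead of collecting raw bytes first. The class `ReadWithCharset` and the method `readAll` are hypothetical names, a `ByteArrayInputStream` stands in for the real response stream, and the code assumes Java SE 7 or later (try-with-resources, `StandardCharsets`), which may not hold on every platform.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReadWithCharset {
    // Hypothetical helper: reads the whole stream, decoding bytes to
    // characters with the given charset, and closes the stream when done.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[4096];
            int n;
            // read() returns -1 at end of stream.
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the response body: the question's text encoded as UTF-8.
        byte[] wire = "\u4F60\u597D".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(wire), StandardCharsets.UTF_8);
        System.out.println(text.equals("\u4F60\u597D")); // prints true
    }
}
```

Decoding through a Reader also handles multi-byte characters that straddle a buffer boundary, which decoding each chunk separately would not.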
In
System.out.println(new String(baos.toByteArray()))
you need to supply the correct Charset to the new String(byte[] bytes, Charset charset) constructor.
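You need to know which Charset to use (i.e. it should be sent with the response). The platform default is not necessarily UTF-8; if it is a single-byte charset such as ISO-8859-1 or windows-1252, it cannot represent Chinese, Japanese, Arabic, Hebrew, etc., whereas UTF-8 can encode all of them.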