Java character encoding issue on Linux

I'm facing an issue with character encoding on Linux. I'm retrieving content from Amazon S3 that was saved using UTF-8 encoding. The content is in Chinese, and I can see it correctly in the browser.

I'm using the Amazon SDK to retrieve the content and update it. Here's the code I'm using:


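StringBuilder builder = new StringBuilder();
S3Object object = client.getObject(new GetObjectRequest(bucketName, key));
// Decode the object content explicitly as UTF-8; try-with-resources closes the reader.
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(object.getObjectContent(), StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        // Note: readLine() strips line terminators, so they are not appended here.
        builder.append(line);
    }
}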

This code works fine on Windows: I was able to update the content and save it back without corrupting any of the Chinese characters.

But it behaves differently on Linux. The code does not translate the characters properly, and the Chinese characters are rendered as ???

I'm not sure what's going wrong here. Any pointers will be appreciated.

-Thanks


The default charset is different on the two operating systems you're using.

To start off, you can confirm the difference by printing out the default charset.

Charset.defaultCharset().name()
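If it helps, here is a small standalone check you could run on both machines. It is only a sketch: the class name DefaultCharsetCheck and the helper canEncode are made up for illustration, and the sample text is written with Unicode escapes so the source file's own encoding does not matter. It prints the default charset and reports whether that charset can represent Chinese characters at all. Note that newer JDKs (18 and later) default to UTF-8 regardless of locale, so the result also depends on the Java version.

```java
import java.nio.charset.Charset;

final class DefaultCharsetCheck {

    // Returns true if the given charset can represent every character in the text.
    // (Hypothetical helper, not part of the original code.)
    static boolean canEncode(Charset charset, String text) {
        // Some charsets are decode-only, so check canEncode() before asking for an encoder.
        return charset.canEncode() && charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String sample = "\u4e2d\u6587"; // two Chinese characters, written as escapes
        Charset defaultCharset = Charset.defaultCharset();
        System.out.println("Default charset:   " + defaultCharset.name());
        System.out.println("file.encoding:     " + System.getProperty("file.encoding"));
        System.out.println("Can encode sample: " + canEncode(defaultCharset, sample));
    }
}
```

If the last line reports false on the Linux box, any conversion that falls back to the default charset will lose the Chinese characters.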

Somewhere in your code, I think this default charset is being used for some String conversion, for instance a call that turns a String into bytes (or back) without naming a charset. The correct procedure is to track that down and specify UTF-8 explicitly.
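To illustrate the kind of conversion I mean, here is a minimal sketch. I haven't seen your save code, so the names ExplicitCharsetDemo and roundTrip are hypothetical. US-ASCII stands in for a Linux default charset that cannot represent Chinese; that is an assumption about your environment, not something I can confirm.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ExplicitCharsetDemo {

    // Encodes the text to bytes and decodes it again with the same charset,
    // mimicking a save-and-reload cycle. (Hypothetical helper.)
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters, written as escapes

        // A charset that cannot represent the characters replaces each one with '?'.
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII)); // prints ??

        // Naming UTF-8 explicitly preserves the text on every platform.
        System.out.println(text.equals(roundTrip(text, StandardCharsets.UTF_8))); // prints true
    }
}
```

Calls such as text.getBytes() or new String(bytes) with no charset argument behave like the first case whenever the default charset is not able to represent the text, so those are the places to look for.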

Without seeing that code, I can only suggest the 'cheating' way to do it: set the default charset explicitly, near the beginning of your code or at Java startup (commonly with the -Dfile.encoding=UTF-8 JVM option). For changing the default charset, see the question "Setting the default Java character encoding?"

HTH
