
Problem converting a string encoded in ISO format to a string encoded in UTF-8

First of all, I want to say that I have tried googling this problem and searching for the answer on Stack Overflow, and I know that Java stores a String as UTF-16. I have a problem converting a String that was encoded in ISO format to UTF-8. The website I'm downloading serves its characters in ISO, and the rest of my program, which also transforms strings into streams, uses UTF-8 encoding.

How can I change the encoding of my inputHTML string to UTF-8? I tried to manipulate it using a Writer:

    OutputStream os = new ByteArrayOutputStream();
    Writer wr = new OutputStreamWriter(os, "UTF-8");
    Writer writer = new BufferedWriter(wr);
    writer.write(inputHTML);
    writer.close();

but I don't know how to turn the OutputStream into my new, converted String. This is my code:

    URL url = new URL("http://www.onet.pl");
    InputStream is = url.openStream();

    // Decode the response bytes as ISO-8859-2 while reading
    Reader reader = new InputStreamReader(is, "ISO-8859-2");

    StringWriter writer = new StringWriter();
    char[] buf = new char[4096];
    int len;
    while ((len = reader.read(buf)) >= 0) {
        writer.write(buf, 0, len);
    }
    reader.close();

    // The decoded text; at this point it is ordinary Java chars
    String inputHTML = writer.toString();


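You don't. A String is not "in" ISO or UTF-8; an encoding applies only when characters are converted to or from bytes. Write the string to a Writer initialized with the appropriate encoding, and the Writer will convert it as it writes it out.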
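
As a rough sketch of what that means in practice, the snippet below writes a string through an OutputStreamWriter set to UTF-8 and then takes the encoded bytes from the underlying ByteArrayOutputStream with toByteArray(). The class name EncodingSketch, the helper toUtf8ViaWriter, and the sample text are made up for illustration and are not from the question. It assumes inputHTML was already decoded correctly, as the ISO-8859-2 reader in the question does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingSketch {

    // Hypothetical helper: encodes the text as UTF-8 by writing it
    // through a Writer and returns the resulting bytes.
    static byte[] toUtf8ViaWriter(String text) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        // try-with-resources closes (and therefore flushes) the writer
        try (Writer writer = new OutputStreamWriter(os, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        return os.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Sample text with Polish letters, standing in for inputHTML
        String inputHTML = "za\u017C\u00F3\u0142\u0107";

        byte[] viaWriter = toUtf8ViaWriter(inputHTML);

        // String.getBytes with a charset is a shorter route to the same bytes
        byte[] direct = inputHTML.getBytes(StandardCharsets.UTF_8);

        System.out.println("Same bytes: " + Arrays.equals(viaWriter, direct));
        System.out.println("UTF-8 length in bytes: " + viaWriter.length);
    }
}
```

The result is a byte array, not a new String: if the rest of the program needs a stream of UTF-8 bytes, wrapping those bytes (or writing straight to the target OutputStream through the Writer) is likely all that is needed.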
