Problem converting a string encoded in ISO-8859-2 to a string encoded in UTF-8
First of all, I want to say that I tried googling this problem and searching for an answer on Stack Overflow, and I know that Java stores a String as UTF-16. I have a problem converting a String that was encoded in ISO format to UTF-8. The website I'm downloading serves its characters in ISO, while the rest of my program, which also transforms strings into streams, uses UTF-8 encoding.
How can I change the encoding of my inputHTML string to UTF-8? I tried to manipulate it using a Writer:
OutputStream os = new ByteArrayOutputStream();
Writer wr = new OutputStreamWriter(os, "UTF-8");
Writer writer = new BufferedWriter(wr);
writer.write(inputHTML);
writer.close();
but I don't know how to turn the OutputStream back into a new, converted String. This is my code:
URL url = new URL("http://www.onet.pl");
InputStream is = url.openStream();
Reader reader = new InputStreamReader(is, "ISO-8859-2");
StringWriter writer = new StringWriter();
char[] buf = new char[4096];
int len;
while ((len = reader.read(buf)) >= 0) {
    writer.write(buf, 0, len);
}
String inputHTML = writer.toString();
You don't. A String has no encoding of its own to change. You write it to a Writer initialized with the appropriate encoding, and the Writer converts it to bytes in that encoding as it writes it out.
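As a rough sketch of what that means in practice: the encoding only matters at the two points where bytes become characters (reading) and characters become bytes (writing). The class name EncodingDemo, the helper methods decodeLatin2 and encodeUtf8, and the sample bytes are made up for illustration and are not part of the original code; the example assumes the JVM ships the ISO-8859-2 charset, as the question's code already does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: decode raw ISO-8859-2 bytes into a Java String.
    // This is the same step InputStreamReader(is, "ISO-8859-2") performs.
    public static String decodeLatin2(byte[] raw) {
        return new String(raw, Charset.forName("ISO-8859-2"));
    }

    // Hypothetical helper: encode a String as UTF-8 bytes through a Writer.
    // Declaring the variable as ByteArrayOutputStream (not OutputStream)
    // keeps toByteArray() accessible afterwards.
    public static byte[] encodeUtf8(String text) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(os, StandardCharsets.UTF_8)) {
            writer.write(text);
        } // closing the writer flushes the encoded bytes into os
        return os.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // The four ISO-8859-2 bytes for the Polish word "lodz" with diacritics
        // (sample data chosen for this sketch).
        byte[] latin2 = {(byte) 0xB3, (byte) 0xF3, (byte) 0x64, (byte) 0xBC};

        String text = decodeLatin2(latin2); // bytes -> characters
        byte[] utf8 = encodeUtf8(text);     // characters -> bytes

        System.out.println("ISO-8859-2 bytes: " + latin2.length);
        System.out.println("UTF-8 bytes: " + utf8.length);

        // Decoding the UTF-8 bytes yields the very same String again.
        String roundTrip = new String(utf8, StandardCharsets.UTF_8);
        System.out.println("Same string: " + text.equals(roundTrip));
    }
}
```

Decoding the UTF-8 bytes gives back an identical String, which is why there is nothing to "convert" on the String itself. If UTF-8 output is needed, the bytes from toByteArray() (or a Writer wrapped around the real destination stream) are the converted form.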