Google Translate v2 API returning non-UTF-8 characters
I am trying to use the Google Translate v2 API in my App Engine project. However, the encoding of accented characters gets messed up. Case in point: the word "students", which should come back as "étudiants" in French, comes back with the "é" garbled. Here is my code:
URL url = new URL(
"https://www.googleapis.com/language/translate/v2?key=" + KEY
+ "&q=" + urlEncodedText + "&source=en&target="
+ urlEncodedLang);
try {
InputStream googleStream = url.openStream();
// make a new buffered reader by reading the page at the URL given
// above
BufferedReader reader = new BufferedReader(new InputStreamReader(
googleStream));
// temp string that holds text line by line
String line;
// read the contents of the reader/the page by line, until there are
// no lines left
while ((line = reader.readLine()) != null) {
// keep adding each line to totalText
totalText = totalText + line + "\n";
}
// remember to always close the reader
reader.close();
} catch (Exception ex) {
ex.printStackTrace();
}
Typing the same URL in a browser (Chrome on Ubuntu) works fine and returns a JSON response containing the properly accented characters.
What am I missing here? Thanks
To make sure the response is decoded as UTF-8, you have to pass the charset to the InputStreamReader:
BufferedReader reader = new BufferedReader(new InputStreamReader(googleStream, "UTF-8"));
Otherwise the reader falls back to the platform default encoding, which is probably ISO-8859-1 in your case.
You can also try the Google Translate API v2 for Java library, which handles this for you.
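The effect of the missing charset can be reproduced without calling the API at all. The sketch below is only an illustration: the class name CharsetDemo, the helper readAll, and the sample JSON body are made up for this example, and it assumes Java 7 or later (StandardCharsets, try-with-resources). It decodes the same UTF-8 bytes once as UTF-8 and once as ISO-8859-1, which is presumably what the default-charset reader did in the question.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Hypothetical helper: reads the whole stream as text, decoding the
    // bytes with the given charset instead of the platform default.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the response body (not real API output), encoded as
        // UTF-8. The escape \u00e9 is the letter e with an acute accent.
        byte[] body = "{\"translatedText\": \"\u00e9tudiants\"}".getBytes(StandardCharsets.UTF_8);

        String right = readAll(new ByteArrayInputStream(body), StandardCharsets.UTF_8);
        String wrong = readAll(new ByteArrayInputStream(body), StandardCharsets.ISO_8859_1);

        // Decoded as UTF-8, the accented letter survives.
        System.out.println(right.contains("\u00e9tudiants"));      // true
        // Decoded as ISO-8859-1, its two UTF-8 bytes (C3 A9) become two
        // separate characters, U+00C3 and U+00A9.
        System.out.println(wrong.contains("\u00c3\u00a9tudiants")); // true
    }
}
```

In the question's code, the same change amounts to passing the charset as the second argument of InputStreamReader; the StringBuilder is optional but avoids repeated string concatenation in the loop.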