Java URL - Google Translate request returns 403 error? [duplicate]
I'm making a Java console application that needs to send an HTTP request to Google Translate to get a translation from the aforementioned site.
My problem is that I receive a 403 error when I try to read from a valid URL using openStream().
Creating an instance of this Translator class with Translator t = new Translator(); and calling t.translate("en", "ja", "cheese"); should, for example, return the translation the program finds on the page http://translate.google.com/#en|ja|cheese. Instead, it catches an IOException and returns this:
http://translate.google.com/#en|ja|cheese
Server returned HTTP response code: 403 for URL: http://translate.google.com/#en|ja|cheese
A similar error occurs with any other arguments that create a valid Google Translate URL.
A 403 error apparently means I am denied permission. This is what I want to know about. Why can't I access this page, and what must I do in order to access it?
I've visited the site in my web browser and manually entered the address that my program tries to access, and it worked, so I'm not sure why my program cannot access the page. Typing or copy/pasting the address into my Firefox navigation bar works. If that is right, might the site want me to reach the page via links on another page? How would I go about that, if that's what I must do?
Here's the code, as I think it may help. The exception seems to be thrown when I try to create a BufferedReader from an InputStreamReader from the InputStream returned by translationURL.openStream():
import java.io.*;
import java.net.*;

public class Translator {
    private final String googleTranslate = "http://translate.google.com/#";

    public String translate( String from, String to, String item ) {
        String translation = googleTranslate + from + '|' + to + '|' + item;
        URL translationURL;
        try { translationURL = new URL(translation); }
        catch(MalformedURLException e) { return e.getMessage(); }
        BufferedReader httpin;
        String fullPage = "";
        System.out.println(translation);
        try {
            httpin = new BufferedReader(
                new InputStreamReader(translationURL.openStream()));
            String line;
            while((line=httpin.readLine()) != null) { fullPage += line + '\n'; }
            httpin.close();
        } catch(IOException e) { return e.getMessage(); }
        int begin = fullPage.indexOf("<span class=\"\">");
        int end = fullPage.indexOf("</span>");
        return fullPage.substring(begin + 15, end);
    }

    public Translator() {}
}
I am testing this code in Eclipse (GALILEO) on Ubuntu Linux 11.04, installed with Wubi, with a working and reliable wireless Internet connection. I've also tried running it from the command line, but the behavior was the same. java -version got me this:
java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.2) (6b22-1.10.2-0ubuntu1~11.04.1)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
They are looking at the User-Agent string, and presumably they don't want people doing this programmatically.
I did get your code working, but since Google charges for API access and actively blocks clients that are not browsers (based on the User-Agent string), I won't tell you how I did it.
A Google search for setting the User-Agent string in Java will get you what you want (as a matter of fact, I found the answer here on Stack Overflow).
I got past this by making translate.google believe that the request comes from a browser rather than from a running program:
URLConnection conn = url.openConnection();
// fake request coming from browser
conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)");
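For completeness, here is a minimal self-contained sketch of the same idea. The class name UserAgentSketch, the helpers openAsBrowser and readAll, the shortened User-Agent value, and the UTF-8 charset are all assumptions made for illustration; only URL.openConnection() and URLConnection.setRequestProperty() come from the snippet above. It needs Java 8 or later. Whether the server then accepts the request depends on its own checks, so treat this as something to try rather than a guaranteed fix.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;

final class UserAgentSketch {
    // Illustrative browser-like value (an assumption); any real browser's string could be used.
    static final String BROWSER_AGENT =
            "Mozilla/5.0 (X11; Linux x86_64; rv:5.0) Gecko/20100101 Firefox/5.0";

    // Creates the connection object and sets the header. Nothing is sent yet:
    // the request only goes out once the input stream is requested.
    static URLConnection openAsBrowser(String address) {
        try {
            URLConnection conn = new URL(address).openConnection();
            conn.setRequestProperty("User-Agent", BROWSER_AGENT);
            return conn;
        } catch (IOException e) {
            // Also covers MalformedURLException, which is an IOException.
            throw new UncheckedIOException(e);
        }
    }

    // Reads the whole response body as text; this is the step that touches the network.
    // UTF-8 is assumed here rather than taken from the response headers.
    static String readAll(URLConnection conn) throws IOException {
        StringBuilder page = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    public static void main(String[] args) {
        // Only prepares the request and shows the header that would be sent.
        URLConnection conn = openAsBrowser("http://translate.google.com/");
        System.out.println(conn.getRequestProperty("User-Agent"));
    }
}
```

In the question's Translator class, the equivalent change would be to replace translationURL.openStream() with a connection prepared this way and then read from conn.getInputStream().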
Add a Referer header to the request, like this:
URL translateURL = new URL(url);
HttpURLConnection connection = (HttpURLConnection) translateURL
.openConnection();
connection.setDoOutput(true);
connection.setRequestProperty("X-HTTP-Method-Override", "GET");
connection.setRequestProperty("referer", "accounterlive.com");
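One detail about the URL in the question may also be worth checking: everything after the # is a URL fragment, and fragments are handled by the client rather than sent to the server. The sketch below uses only java.net.URL accessors to show how the question's address is split. FragmentSketch and its helper names are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class FragmentSketch {
    // Parses the address; wraps the checked exception so callers stay short.
    static URL parse(String address) {
        try {
            return new URL(address);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // The part after '#', which stays on the client side.
    static String fragmentOf(String address) {
        return parse(address).getRef();
    }

    // Path plus query string, as reported by URL.getFile().
    static String fileOf(String address) {
        return parse(address).getFile();
    }

    public static void main(String[] args) {
        String address = "http://translate.google.com/#en|ja|cheese";
        System.out.println("fragment: " + fragmentOf(address));
        System.out.println("file:     " + fileOf(address));
    }
}
```

For the question's address this reports en|ja|cheese as the fragment and / as the file. If that reading is right, the page being requested is the plain front page, and the translation the browser shows is presumably filled in by script afterwards, so the <span> search in the question's code may not find it even once the 403 is out of the way.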