Get the contents of a page after logging in (Java)
I want to log in to a site (Yahoo Mail - https://login.yahoo.com/config/login?.src=fpctx&.intl=us&.done=http%3A%2F%2Fwww.yahoo.com%2F)
using HttpClient and, after logging in, retrieve the contents of the page (in Java). What's wrong with my code?
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.cookie.Cookie;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.protocol.HTTP;

public class TestHttpClient {
    public static void main(String[] args) throws Exception {
        DefaultHttpClient httpclient = new DefaultHttpClient();

        // Initial GET, to pick up any cookies set before logging in
        HttpGet httpget = new HttpGet("http://www.yahoo.com/");
        HttpResponse response = httpclient.execute(httpget);
        HttpEntity entity = response.getEntity();
        System.out.println("Login form get: " + response.getStatusLine());
        if (entity != null) {
            // Release the connection before sending the next request
            entity.consumeContent();
        }

        System.out.println("Initial set of cookies:");
        List<Cookie> cookies = httpclient.getCookieStore().getCookies();
        if (cookies.isEmpty()) {
            System.out.println("None");
        } else {
            for (int i = 0; i < cookies.size(); i++) {
                System.out.println("- " + cookies.get(i).toString());
            }
        }

        // POST the login form with only the user ID and password
        HttpPost httpost = new HttpPost("https://login.yahoo.com/config/login_verify2?.intl=us&.src=ym");
        List<NameValuePair> nvps = new ArrayList<NameValuePair>();
        nvps.add(new BasicNameValuePair("IDToken1", "Yahoo! ID"));
        nvps.add(new BasicNameValuePair("IDToken2", "Password"));
        httpost.setEntity(new UrlEncodedFormEntity(nvps, HTTP.UTF_8));
        response = httpclient.execute(httpost);
        System.out.println("Response " + response.toString());
        entity = response.getEntity();
        System.out.println("Login form get: " + response.getStatusLine());
        if (entity != null) {
            // Print the body of the response to the login POST
            InputStream is = entity.getContent();
            BufferedReader br = new BufferedReader(new InputStreamReader(is));
            String str = "";
            while ((str = br.readLine()) != null) {
                System.out.println("" + str);
            }
        }

        System.out.println("Post logon cookies:");
        cookies = httpclient.getCookieStore().getCookies();
        if (cookies.isEmpty()) {
            System.out.println("None");
        } else {
            for (int i = 0; i < cookies.size(); i++) {
                System.out.println("- " + cookies.get(i).toString());
            }
        }

        httpclient.getConnectionManager().shutdown();
    }
}
When I print the output from the HttpEntity, it prints the contents of the login page. How do I get the contents of the page after I log in using HttpClient?
If you look at the source of the Yahoo login page, you'll see that it contains many other parameters that you are not sending in your request:
<input type="hidden" name=".tries" value="1">
<input type="hidden" name=".src" value="fpctx">
<input type="hidden" name=".md5" value="">
<input type="hidden" name=".hash" value="">
<input type="hidden" name=".js" value="">
<input type="hidden" name=".last" value="">
<input type="hidden" name="promo" value="">
<input type="hidden" name=".intl" value="us">
<input type="hidden" name=".bypass" value="">
<input type="hidden" name=".partner" value="">
<input type="hidden" name=".u" value="a0bljsd77uima">
<input type="hidden" name=".v" value="0">
<input type="hidden" name=".challenge" value="sCm6Z8Bv1vy78LBlEd8dnFsmbit1">
<input type="hidden" name=".yplus" value="">
...
I suppose that is why Yahoo treats the login as failed and sends you back to the login page. That login page is what you see as the response.
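One way to pick up those hidden parameters is to fetch the login page first, read the hidden inputs out of its HTML, and send them along with the user ID and password. The following is only a minimal sketch of the extraction step, assuming the inputs use double-quoted attributes like the ones listed above. The class name HiddenFieldExtractor and the method extractHiddenFields are made up for illustration, and a real HTML parser would be more robust than these regular expressions.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of HttpClient): collects hidden form fields from HTML.
class HiddenFieldExtractor {

    // Matches one <input ...> tag. Attribute values containing '>' are not handled.
    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Matches a double-quoted attribute such as name=".tries".
    private static final Pattern ATTRIBUTE =
            Pattern.compile("([a-zA-Z][\\w-]*)\\s*=\\s*\"([^\"]*)\"");

    // Returns name -> value for every <input type="hidden"> found, in document order.
    static Map<String, String> extractHiddenFields(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher tag = INPUT_TAG.matcher(html);
        while (tag.find()) {
            // Gather the attributes of this one tag.
            Map<String, String> attrs = new LinkedHashMap<>();
            Matcher attr = ATTRIBUTE.matcher(tag.group());
            while (attr.find()) {
                attrs.put(attr.group(1).toLowerCase(Locale.ROOT), attr.group(2));
            }
            // Keep the tag only if it is a named hidden input.
            if ("hidden".equalsIgnoreCase(attrs.get("type")) && attrs.containsKey("name")) {
                fields.put(attrs.get("name"), attrs.getOrDefault("value", ""));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample markup in the same shape as the hidden inputs listed above.
        String html = "<input type=\"hidden\" name=\".tries\" value=\"1\">\n"
                + "<input type=\"hidden\" name=\".src\" value=\"fpctx\">\n"
                + "<input type=\"text\" name=\"login\" value=\"\">";
        extractHiddenFields(html).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In the code from the question, each entry of the returned map could then be added to nvps as a BasicNameValuePair before the POST is executed. Values such as .challenge probably change on every page load, so the page would need to be fetched and parsed again for each login attempt, and there is no guarantee that sending these fields alone is enough for the login to succeed.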
Many sites try to prevent programmatic logins (to keep out bots or avoid other security issues), so what you are trying to do could be hard. You could:
- Use the official Yahoo public APIs, when possible.
- Try other Java libraries that simulate a user browsing (such as HttpUnit or HtmlUnit; there are many others) and "fake" a user navigating the Yahoo pages.