Connect to website and keep session?
I'm trying to connect to a website (source code below) that requires login and then browse it to download some files. I've managed to do this for another website using this code:
public void initConnection(String _path, Map<String,String> _parameters) throws IOException {
    String data = convertMapToParams(_parameters);
    // Send data
    URL url = new URL(host + "/" + _path);
    URLConnection conn = url.openConnection();
    conn.setDoOutput(true);
    OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
    wr.write(data);
    wr.flush();
    wr.close();
    sessionCookie = conn.getHeaderField("Set-Cookie");
    sessionCookie = sessionCookie.substring(0,sessionCookie.indexOf(";"));
}

public List<String> getHtml(String _path, Map<String, String> _parameters) throws IOException {
    String data = convertMapToParams(_parameters);
    URL url = new URL(host + "/" + _path);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setDoOutput(true);
    conn.setRequestProperty("Cookie", sessionCookie);
    OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
    wr.write(data);
    wr.flush();
    wr.close();
    List<String> list = new LinkedList<String>();
    BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    String line;
    while ((line = rd.readLine()) != null) {
        list.add(line);
    }
    rd.close();
    return list;
}
The problem is that on this website, when I do this:
sessionCookie = conn.getHeaderField("Set-Cookie");
I get sessionCookie == null, so I can't obtain any cookie to keep the session open. If I inspect the headers on the conn variable to check whether there is any cookie field, I get this (from the IntelliJ IDEA debugger):
[0] = {java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet$UnmodifiableEntry@2085}"null=[HTTP/1.1 200 OK]"
[1] = {java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet$UnmodifiableEntry@2093}"X-AspNet-Version=[2.0.50727]"
[2] = {java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet$UnmodifiableEntry@2102}"Date=[Wed, 18 Aug 2010 07:32:37 GMT]"
[3] = {java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet$UnmodifiableEntry@2111}"Content-Length=[3686]"
[4] = {java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet$UnmodifiableEntry@2120}"Content-Type=[text/html; charset=utf-8]"
[5] = {java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet$UnmodifiableEntry@2129}"Server=[Microsoft-IIS/6.0]"
[6] = {java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet$UnmodifiableEntry@2138}"X-Powered-By=[ASP.NET]"
[7] = {java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet$UnmodifiableEntry@2147}"Cache-Control=[private]"
But when I used the Firefox add-on "HttpFox" to check whether there are cookies, I found that there are:
(Request-Line) POST /companias/entrada.aspx HTTP/1.1
User-Agent Mozilla/5.0 (Windows; U; Windows NT 6.1; es-ES; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language es-es,es;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding gzip,deflate
Accept-Charset ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive 115
Connection keep-alive
Cookie __utma=235757843.1141928071.1280949246.1282083861.1282114987.11; __utmz=235757843.1280949246.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmc=235757843
Content-Type application/x-www-form-urlencoded
Content-Length 381
Another thing that confused me was the set of hidden fields in the source code: "__VIEWSTATE", "__EVENTVALIDATION", "__EVENTTARGET", "__LASTFOCUS" and "__EVENTARGUMENT". I've been searching for information about them and, if I understood it right, VIEWSTATE can be used to control the user's session, but I don't know how it works.
So, in short: on the other website I used a simple getHeaderField("Set-Cookie") to get the cookie and keep the session alive, but here I don't know whether the website uses cookies at all, and I don't know whether cookies are the way to go or whether I have to use this VIEWSTATE field instead.
I am not very experienced with Java yet, and even less with networking. I was recommended here to use Apache HttpClient for these things and I'm reading about it, but I have so many things mixed up right now that I first need to know which way to go with this site.
And finally, this is part of the source code from this website:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" >
<head><title>
Steps Peritaciones S.L.
</title><link href="../Styles/general.css" rel="stylesheet" type="text/css" />
<style type="text/css">
</style>
</head>
<body>
<form name="form1" method="post" action="entrada.aspx" onsubmit="javascript:return WebForm_OnSubmit();" id="form1">
<div>
<input type="hidden" name="__LASTFOCUS" id="__LASTFOCUS" value="" />
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUJODcxMzI1MDYzZBgBBR5fX0NvbnRyb2xzUmVxdWlyZVBvc3RCYWNrS2V5X18WAQUbTG9naW5TdGVwcyRMb2dpbkltYWdlQnV0dG9udl7bDlN22j9J5Z5UXZi+FLbU6hk=" />
</div>
<script type="text/javascript">
//<![CDATA[
var theForm = document.forms['form1'];
if (!theForm) {
theForm = document.form1;
}
function __doPostBack(eventTarget, eventArgument) {
if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
theForm.__EVENTTARGET.value = eventTarget;
theForm.__EVENTARGUMENT.value = eventArgument;
theForm.submit();
}
}
//]]>
</script>
<div>
<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEWBAKZ/NOFAgK6jd26DgKovcvMBwKV8YLlBGCk0AytR6jZVZxOJwJ59H/uIN21" />
</div>
<div class="logoEntrada">
<img src="../images/logo_steps_p.gif" alt="Steps Peritaciones S.L." />
</div>
<div class="LoginForm" >
<br />
<br />
<span id="Label1" class="TitolEntrada">Acceso Compañias</span>
<br />
</div>
<div class="LoginForm">
<center>
<table class="LoginBox" cellspacing="0" cellpadding="4" border="0" id="LoginSteps" style="background-color:#E3EAEB;border-color:#E6E2D8;border-width:1px;border-style:Solid;border-collapse:collapse;">
<tr>
<td><table cellpadding="0" border="0" style="color:#333333;font-family:Verdana;font-size:1em;width:234px;">
<tr>
<td align="center" style="color:White;background-color:#1C5E55;font-size:1em;font-weight:bold;">Entrada</td>
</tr><tr>
<td><label for="LoginSteps_UserName">Usuario:</label></td>
</tr><tr>
<td><input name="LoginSteps$UserName" type="text" id="LoginSteps_UserName" style="font-size:1em;width:171px;" /><span id="LoginSteps_UserNameRequired" title="El nombre de usuario es obligatorio." style="color:Red;visibility:hidden;">*</span></td>
</tr><tr>
<td><label for="LoginSteps_Password">Contraseña:</label></td>
</tr><tr>
<td><input name="LoginSteps$Password" type="password" id="LoginSteps_Password" style="font-size:1em;width:171px;" /><span id="LoginSteps_PasswordRequired" title="La contraseña es obligatoria." style="color:Red;visibility:hidden;">*</span></td>
</tr><tr>
<td align="right"><input type="submit" name="LoginSteps$LoginButton" value="Entrar" onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;LoginSteps$LoginButton&quot;, &quot;&quot;, true, &quot;LoginSteps&quot;, &quot;&quot;, false, false))" id="LoginSteps_LoginButton" style="color:#1C5E55;background-color:White;border-color:#C5BBAF;border-width:1px;border-style:Solid;font-family:Verdana;font-size:1em;" /></td>
</tr>
</table></td>
</tr>
</table>
</center>
</div>
<script type="text/javascript">
//<![CDATA[
var LoginSteps_UserNameRequired = document.all ? document.all["LoginSteps_UserNameRequired"] : document.getElementById("LoginSteps_UserNameRequired");
LoginSteps_UserNameRequired.controltovalidate = "LoginSteps_UserName";
LoginSteps_UserNameRequired.errormessage = "El nombre de usuario es obligatorio.";
LoginSteps_UserNameRequired.validationGroup = "LoginSteps";
LoginSteps_UserNameRequired.evaluationfunction = "RequiredFieldValidatorEvaluateIsValid";
LoginSteps_UserNameRequired.initialvalue = "";
var LoginSteps_PasswordRequired = document.all ? document.all["LoginSteps_PasswordRequired"] : document.getElementById("LoginSteps_PasswordRequired");
LoginSteps_PasswordRequired.controltovalidate = "LoginSteps_Password";
LoginSteps_PasswordRequired.errormessage = "La contraseña es obligatoria.";
LoginSteps_PasswordRequired.validationGroup = "LoginSteps";
LoginSteps_PasswordRequired.evaluationfunction = "RequiredFieldValidatorEvaluateIsValid";
LoginSteps_PasswordRequired.initialvalue = "";
//]]>
</script>
<script type="text/javascript">
//<![CDATA[
var Page_ValidationActive = false;
if (typeof(ValidatorOnLoad) == "function") {
ValidatorOnLoad();
}
function ValidatorOnSubmit() {
if (Page_ValidationActive) {
return ValidatorCommonOnSubmit();
}
else {
return true;
}
}
WebForm_AutoFocus('LoginSteps');Sys.Application.initialize();
document.getElementById('LoginSteps_UserNameRequired').dispose = function() {
Array.remove(Page_Validators, document.getElementById('LoginSteps_UserNameRequired'));
}
document.getElementById('LoginSteps_PasswordRequired').dispose = function() {
Array.remove(Page_Validators, document.getElementById('LoginSteps_PasswordRequired'));
}
//]]>
Thanks and I hope it's not too much code in one post :S
P.S.: These websites belong to my employer and I have authorized access to them, so this is not any kind of hacking. I just want to automate the process and learn while I'm at it.
OMG, you do it all by hand? I would really suggest using HtmlUnit instead: it gives you a virtual web client with all its capabilities and a higher-level API, so you can focus on interacting with the website instead of opening streams by hand.
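To give a feel for what a higher-level client takes care of on a page like this: ASP.NET WebForms pages normally expect the hidden fields of the form (__VIEWSTATE, __EVENTVALIDATION and so on) to be sent back together with the user name and password. Below is a minimal sketch of that one step using only the JDK, not HtmlUnit. The class name HiddenFields and its two methods are made up for illustration, and the regular expression assumes the attribute order seen in the page source above (type, then name, then value); a real HTML parser would be more robust.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects hidden form fields and builds a POST body.
public class HiddenFields {

    // Assumes attributes appear as type="hidden" ... name="..." ... value="...".
    private static final Pattern HIDDEN = Pattern.compile(
            "<input[^>]*type=\"hidden\"[^>]*name=\"([^\"]*)\"[^>]*value=\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    // Returns the hidden fields in document order (name -> value).
    public static Map<String, String> extract(String html) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        Matcher m = HIDDEN.matcher(html);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    // Encodes the fields as application/x-www-form-urlencoded.
    public static String toFormBody(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }
}
```

With this approach you would probably GET entrada.aspx first, call extract on the returned HTML, add LoginSteps$UserName, LoginSteps$Password and LoginSteps$LoginButton (value "Entrar") to the map, and write toFormBody(...) to the POST connection. Whether that is enough for this particular site is something you would have to verify.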
Alternatively, you can use HttpClient.
Here is a tutorial for it:
http://hc.apache.org/httpcomponents-client-4.0.1/tutorial/html/
See the following section on cookies (state management):
http://hc.apache.org/httpcomponents-client-4.0.1/tutorial/html/statemgmt.html
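If you stay on plain java.net for the moment, the JDK's own java.net.CookieManager does a simpler version of the same state management. The sketch below uses only the standard library, not HttpClient's API. The wrapper class SessionCookies is hypothetical, and nothing in the question shows which cookie this site actually sets, so any cookie name you try it with (for example ASP.NET's default ASP.NET_SessionId) is an assumption.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper around the JDK cookie store.
public class SessionCookies {

    // In-memory store; ACCEPT_ALL keeps every cookie the server sends.
    private final CookieManager manager =
            new CookieManager(null, CookiePolicy.ACCEPT_ALL);

    // Stores the Set-Cookie headers of a response received for the given URI.
    // The map returned by URLConnection.getHeaderFields() can be passed as-is.
    public void remember(URI uri, Map<String, List<String>> responseHeaders) {
        try {
            manager.put(uri, responseHeaders);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds the value for the "Cookie" request header, or null if no
    // stored cookie applies to the given URI.
    public String cookieHeaderFor(URI uri) {
        Map<String, List<String>> out;
        try {
            out = manager.get(uri, new HashMap<String, List<String>>());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        List<String> cookies = out.get("Cookie");
        if (cookies == null || cookies.isEmpty()) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (String c : cookies) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

In the code from the question you would call remember(url.toURI(), conn.getHeaderFields()) after the login POST and pass cookieHeaderFor(...) to conn.setRequestProperty("Cookie", ...) on later requests, skipping the header when the result is null. Calling CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL)) once at startup should make HttpURLConnection do the same thing automatically.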