Get Mechanize to handle cookies from an arbitrary POST (to log into a website programmatically)
I want to log into https://www.t-mobile.com/ programmatically. My first idea was to use Mechanize to submit the login form:
(Screenshot of the login form: http://dl.dropbox.com/u/2792776/screenshots/2010-04-08_1440.png)
However, it turns out that this isn't even a real form. Instead, when you click "Log in", some JavaScript grabs the values of the fields, creates a new form dynamically, and submits it.
"Log in" button HTML:
<button onclick="handleLogin(); return false;" class="btnBlue" id="myTMobile-login"><span>Log in</span></button>
The handleLogin() function:
function handleLogin() {
    if (ValidateMsisdnPassword()) { // client-side form validation logic
        var a = document.createElement("FORM");
        a.name = "form1";
        a.method = "POST";
        a.action = mytmoUrl; // defined elsewhere as https://my.t-mobile.com/Login/LoginController.aspx
        var c = document.createElement("INPUT");
        c.type = "HIDDEN";
        c.value = document.getElementById("myTMobile-phone").value; // the value of the phone number input field
        c.name = "txtMSISDN";
        a.appendChild(c);
        var b = document.createElement("INPUT");
        b.type = "HIDDEN";
        b.value = document.getElementById("myTMobile-password").value; // the value of the password input field
        b.name = "txtPassword";
        a.appendChild(b);
        document.body.appendChild(a);
        a.submit();
        return true;
    } else {
        return false;
    }
}
I could simulate this form submission by POSTing the form data to https://my.t-mobile.com/Login/LoginController.aspx with Net::HTTP.post_form, but I don't know how to get the resultant cookie into Mechanize so I can continue to scrape the UI available when I'm logged in.
Any ideas?
You can use something like this to log in and save the cookies so you don't have to log in again on the next run. You will need to write your own logic to post the credentials directly, but this is how I use Mechanize's built-in cookie_jar to save cookies.
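require 'mechanize'

agent = Mechanize.new
if File.exist?('cookies.yml')
  # Reuse the cookies saved by a previous run.
  agent.cookie_jar.load('cookies.yml')
else
  page = agent.get('http://site.com')
  form = page.forms.last # assumes the login form is the last form on the page
  form.email = 'email'
  form.password = 'password'
  page = agent.submit(form)
  agent.cookie_jar.save_as('cookies.yml')
end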
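I would avoid Net::HTTP and call Mechanize's own post method on your agent instead:
agent.post(url, query = {}, headers = {})
Because the request goes through the agent, any cookies the server sets in the response end up in the agent's cookie jar, and later requests from the same agent send them back.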
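Here is a rough sketch of what that could look like for the login in the question. The field names and URL are taken from handleLogin() above. The helper names login_params and log_in are made up for illustration, and I am assuming those two fields are all the server needs (it may also expect cookies or hidden fields from the landing page, which I have not checked).

```ruby
# URL and field names come from handleLogin() in the question.
LOGIN_URL = 'https://my.t-mobile.com/Login/LoginController.aspx'

# Hypothetical helper: builds the same two fields the JavaScript form submits.
def login_params(phone, password)
  { 'txtMSISDN' => phone, 'txtPassword' => password }
end

# Hypothetical helper: POSTs the credentials through the given agent.
# With a Mechanize agent, cookies from the response are kept in
# agent.cookie_jar, so follow-up requests on the same agent reuse them.
def log_in(agent, phone, password)
  agent.post(LOGIN_URL, login_params(phone, password))
end

# Usage (needs the mechanize gem and network access; credentials are placeholders):
#   require 'mechanize'
#   agent = Mechanize.new
#   page  = log_in(agent, '5551234567', 'secret')
#   # keep using `agent` to fetch the logged-in pages
```

Passing the agent in as a parameter keeps the cookie jar in one place: whatever you scrape afterwards should go through the same agent object.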
I often use the Firefox HttpFox extension to figure out exactly what is going on with this kind of problem.