Python: web login script, what's the problem?
This is the script:
import ClientForm
import urllib2
request = urllib2.Request("http://ritaj.birzeit.edu")
response = urllib2.urlopen(request)
forms = ClientForm.ParseResponse(response, backwards_compat=False)
response.close()
form = forms[0]
print form
sooform = str(raw_input("Form Name: "))
username = str(raw_input("Username: "))
password = str(raw_input("Password: "))
form[sooform] = [username, password]
request2 = form.click()
try:
    response2 = urllib2.urlopen(request2)
except urllib2.HTTPError, response2:
    pass
print response2.geturl()
print response2.info() # headers
print response2.read() # body
response2.close()
When I start the script, I get this:
Traceback (most recent call last):
  File "C:/Python26/ritaj2.py", line 9, in <module>
    form = forms[0]
IndexError: list index out of range
What is the problem? I am running Python 2.6.4 on Windows.
Update:
I want a script that logs in to this site and prints the response :)
The only <form> tag in the HTML served at that URL (save it to a file and look for yourself!) is:
<form method="GET" action="http://www.google.com/u/ritaj">
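which does a customized Google search and has nothing to do with logging in (plus, for some reason, ClientForm has some problem identifying that specific form -- but that form is no use to you anyway, so I didn't explore that issue further). Because ParseResponse does not pick up even that form, forms comes back as an empty list, and forms[0] raises the IndexError you see.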
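If you want to check for yourself which <form> tags a page contains, here is a minimal sketch using only the standard library's HTMLParser rather than ClientForm. The names FormLister and list_forms and the sample HTML string are made up for illustration; the import fallback is only there so it runs on both Python 2 and Python 3.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2, as used in the question


class FormLister(HTMLParser):
    """Collect the method and action of every <form> start tag (hypothetical helper)."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.forms = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "form":
            attrs = dict(attrs)
            method = (attrs.get("method") or "GET").upper()
            self.forms.append((method, attrs.get("action")))


def list_forms(html):
    """Return a list of (method, action) pairs, one per <form> tag in html."""
    parser = FormLister()
    parser.feed(html)
    parser.close()
    return parser.forms


# Made-up sample page: one search form, plus a control outside any form.
sample = ('<html><body>'
          '<form method="GET" action="http://www.google.com/u/ritaj">'
          '<input name="q"></form>'
          '<input name="username">'
          '</body></html>')
print(list_forms(sample))
```

To run it against the real page, save the HTML to a file and pass the file's contents to list_forms.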
You can still get at the controls in the page by using
forms = ClientForm.ParseResponseEx(response)
which makes forms[0] an artificial one containing all controls that aren't within a form.
Specifically, this approach identifies controls with the following names, in order (again there's a bit of parsing confusion here, but hopefully not a killer for you...):
>>> f = forms[0]
>>> [c.name for c in f.controls]
['q', 'sitesearch', 'sa', 'domains', 'form:mode', 'form:id', '__confirmed_p', '__refreshing_p', 'return_url', 'time', 'token_id', 'hash', 'username', 'password', 'persistent_p', 'formbutton:ok']
so you should be able to set the username and password controls of the "non-form form" f, and proceed from there.
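To illustrate the idea of filling in username and password among a set of controls that are not wrapped in a <form>, here is a rough standard-library approximation. It is not what ClientForm does internally. ControlCollector, build_login_fields, the skip list default and the sample HTML are all assumptions made for the example; only the control names come from the list above. The page does not say where the fields should be posted, so the sketch stops at encoding them.

```python
try:
    from html.parser import HTMLParser  # Python 3
    from urllib.parse import urlencode
except ImportError:
    from HTMLParser import HTMLParser  # Python 2
    from urllib import urlencode


class ControlCollector(HTMLParser):
    """Collect (name, value) for every named <input>, inside a form or not."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.controls = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            attrs = dict(attrs)
            if attrs.get("name"):
                self.controls.append((attrs["name"], attrs.get("value") or ""))


def build_login_fields(html, username, password,
                       skip=("q", "sitesearch", "sa", "domains")):
    """Return the page's input fields with username/password filled in.

    Controls named in skip (by default the Google search ones) are left out.
    """
    collector = ControlCollector()
    collector.feed(html)
    collector.close()
    fields = []
    for name, value in collector.controls:
        if name in skip:
            continue
        if name == "username":
            value = username
        elif name == "password":
            value = password
        fields.append((name, value))
    return fields


# Made-up sample page: a search form plus login controls outside any form.
sample = ('<form action="http://www.google.com/u/ritaj"><input name="q"></form>'
          '<input type="hidden" name="token_id" value="abc">'
          '<input name="username">'
          '<input type="password" name="password">')
fields = build_login_fields(sample, "alice", "secret")
print(urlencode(fields))
```

The hidden fields (token_id, hash and so on) are carried along unchanged because a login handler will usually expect them back, though that is a guess about this particular site.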
(A side bit: raw_input already returns a string, lose those redundant str() calls around it).
The actual address seems to use https instead of http. Check the urllib2 documentation to see whether it handles HTTPS (I believe you need SSL support).
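A quick way to find out whether your interpreter can open https:// URLs is to check that the ssl module imports and that urllib2 exposes HTTPSHandler, which it only defines when Python was built with SSL support. This is a small sketch; the function name https_supported is made up, and the import fallback just lets it run on Python 3 as well.

```python
try:
    import urllib.request as urllib_request  # Python 3
except ImportError:
    import urllib2 as urllib_request  # Python 2, as used in the question


def https_supported():
    """Return True if this interpreter can open https:// URLs via urllib2/urllib.request."""
    try:
        import ssl  # noqa: F401 (only checking that it can be imported)
    except ImportError:
        return False
    # HTTPSHandler is only defined when the underlying HTTP library has SSL support.
    return hasattr(urllib_request, "HTTPSHandler")


print("HTTPS support: %s" % https_supported())
```

If it reports True, passing the https:// address to urllib2.urlopen should work the same way as the http:// one.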