
Python: web login script, what's the problem?

This is the script:

import ClientForm
import urllib2
request = urllib2.Request("http://ritaj.birzeit.edu")

response = urllib2.urlopen(request)
forms = ClientForm.ParseResponse(response, backwards_compat=False)
response.close()

form = forms[0]
print form
sooform = str(raw_input("Form Name: "))
username = str(raw_input("Username: "))
password = str(raw_input("Password: "))

form[sooform] = [username, password]

request2 = form.click()
try:
    response2 = urllib2.urlopen(request2)
except urllib2.HTTPError, response2:
    pass

print response2.geturl()
print response2.info()  # headers
print response2.read()  # body
response2.close()

When I start the script, I get this:

Traceback (most recent call last):
  File "C:/Python26/ritaj2.py", line 9, in <module>
    form = forms[0]
IndexError: list index out of range

What is the problem? I'm running Python 2.6.4 on Windows.

Update:

I want a script that logs in to this site and prints the response :)


The only <form> tag in the HTML served at that URL (save it to a file and look for yourself!) is:

<form method="GET" action="http://www.google.com/u/ritaj">

which does a customized Google search and has nothing to do with logging in (plus, for some reason, ClientForm has some problem identifying that specific form -- but that form is no use to you anyway, so I didn't explore that issue further).
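If you'd rather check this programmatically than by eye, here is a rough sketch. It assumes Python 3's standard library (html.parser) rather than the Python 2 ClientForm module from the question, and the names FormTagLister, list_forms and sample_html are made up for illustration. The sample markup is a stand-in, not the real page.

```python
from html.parser import HTMLParser


class FormTagLister(HTMLParser):
    """Collect the attributes of every <form> start tag in a page."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "form":
            self.forms.append(dict(attrs))


def list_forms(html):
    """Return one attribute dict per <form> tag found in the HTML text."""
    parser = FormTagLister()
    parser.feed(html)
    parser.close()
    return parser.forms


# Hypothetical stand-in for the saved page (assumption, not the real markup).
sample_html = """
<html><body>
<form method="GET" action="http://www.google.com/u/ritaj">
  <input type="text" name="q">
</form>
<input type="text" name="username">
<input type="password" name="password">
</body></html>
"""

forms = list_forms(sample_html)
for attrs in forms:
    print(attrs.get("method"), attrs.get("action"))
```

With the real page you would read the saved file and pass its text to list_forms instead of sample_html.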

You can still get at the controls in the page by using

forms = ClientForm.ParseResponseEx(response)

which makes forms[0] an artificial one containing all controls that aren't within a form.

Specifically, this approach identifies controls with the following names, in order (again there's a bit of parsing confusion here, but hopefully not a killer for you...):

>>> f = forms[0]
>>> [c.name for c in f.controls]
['q', 'sitesearch', 'sa', 'domains', 'form:mode', 'form:id', '__confirmed_p', '__refreshing_p', 'return_url', 'time', 'token_id', 'hash', 'username', 'password', 'persistent_p', 'formbutton:ok']

so you should be able to set the username and password controls of the "non-form form" f, and proceed from there.
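To illustrate the idea of a "non-form form", here is a minimal sketch. It assumes Python 3's standard library and is not how ClientForm itself works: it gathers the input controls that sit outside any form tag, fills in username and password, and encodes the result as a request body. ControlCollector, build_login_body, the sample markup and the token value are all hypothetical. Whether the site expects exactly these fields is not something the sketch can confirm.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class ControlCollector(HTMLParser):
    """Collect (name, value) pairs of <input> controls outside any <form>."""

    def __init__(self):
        super().__init__()
        self.form_depth = 0
        self.loose_controls = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.form_depth += 1
        elif tag == "input" and self.form_depth == 0:
            attrs = dict(attrs)
            if attrs.get("name"):
                # A missing value attribute is treated as an empty string.
                self.loose_controls.append((attrs["name"], attrs.get("value") or ""))

    def handle_endtag(self, tag):
        if tag == "form" and self.form_depth > 0:
            self.form_depth -= 1


def build_login_body(html, username, password):
    """Return a URL-encoded body from the loose controls plus credentials."""
    collector = ControlCollector()
    collector.feed(html)
    collector.close()
    fields = dict(collector.loose_controls)
    fields["username"] = username
    fields["password"] = password
    return urlencode(fields)


# Hypothetical markup: one search form plus login controls outside any form.
sample_html = """
<form method="GET" action="http://www.google.com/u/ritaj">
  <input type="text" name="q">
</form>
<input type="hidden" name="token_id" value="abc123">
<input type="text" name="username">
<input type="password" name="password">
"""

body = build_login_body(sample_html, "alice", "s3cret")
print(body)
```

The search box q is skipped because it lives inside the Google search form, while the hidden field is carried along unchanged.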

(A side note: raw_input already returns a string, so drop those redundant str() calls around it.)


The actual address seems to use https instead of http. Check the urllib2 docs to see whether it handles HTTPS (I believe you need ssl).
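A quick way to test that belief is sketched below, assuming Python 3 (where urllib2's pieces live under urllib) and a made-up helper name can_open. It only checks whether the ssl module can be imported and what scheme a URL uses. It does not contact the site.

```python
from urllib.parse import urlparse

try:
    import ssl  # available when Python was built with SSL support
    HAVE_SSL = True
except ImportError:
    HAVE_SSL = False


def can_open(url):
    """Return True if this interpreter should be able to open the URL's scheme."""
    scheme = urlparse(url).scheme
    if scheme == "https":
        return HAVE_SSL
    return scheme == "http"


for url in ("http://ritaj.birzeit.edu", "https://ritaj.birzeit.edu"):
    print(url, can_open(url))
```

If can_open reports False for the https address, the interpreter lacks SSL support and the login request would fail before any form handling happens.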
