开发者

html/javascript automaticly getting link from submit button (maybe automating with python?)

I have a website where I have to click a submit button on a form. This gives me a link. I know the link is made up with the paramter that is passed through a hidden value. I was wondering if I could make a python script or something els开发者_JS百科e that would go to the website and click some buttons returning the link that the submit button generates, if so how could I pass the extra parameter that influences the creation of the link?

thanks in advance.


If it is python you're looking for then give the Mechanize library a shot. If you are just extracting small but unique elements of the HTML document then you may as well use regex's with python. To work with the HTML document more pragmatically then BeautifulSoup may be more beneficial, which you can combine with mechanize/python.

It's more straightforward than it may initially seem.


Take a look at this documentation of the urlib2 package. Below is the code you would use, but the documentation explains (very well) what is happening.

Excerpt:

import urllib
import urllib2

url = 'http://www.someserver.com/cgi-bin/register.cgi'
values = {'name' : 'Michael Foord',
          'location' : 'Northampton',
          'language' : 'Python' }

data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()

You would need to use an HTML parser like BeautifulSoup to obtain the parameter name and value that is being posted when you click the button.

Edit:

Yes, you could use mechanize for this as well. You would do something like this (untested):

from mechanize import Browser

br = Browser()
br.open("http://www.example.com/")  # this would be your website
br.select_form(name="order")        # change this to the name of your form
response = br.submit()              # submits the form, just like if you clicked the submit button
print response.geturl()             # prints the URL you are looking for

You'll need to make this specific to your website/form, but something along these lines should do the trick.

Check out the examples/documentation for the ClientForm object if you find you need more control.


Once you have downloaded your html data with mechanize as other users said, then you could use Beautifulsoup like this:

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(html_data)
hidden_tag = soup.find('input',name='hiddenId',type='hidden') 
hidden_value = hiddenId['value']

then you could forge a POST with urllib2 like this:

import urllib
import urllib2
url = 'http://yoursite.com'
values = {'yourhiddenname' : hidden_value}
request = urllib2.Request(url, urllib.urlencode(values))
response = urllib2.urlopen(request)
result = response.read()
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜