Select radio buttons with scrapy
How would I go about selecting radio buttons with Scrapy?
I am trying to select the first of the following radio buttons, but formdata={'rd1':'E'} does not work:
<input type="radio" name="rd1" value="E" checked="checked" />Employee
<input type="radio" name="rd2" value="o" />Other
You could use CSSSelector from lxml.cssselect to select the radio buttons.
>>> import lxml.html
>>> from lxml.cssselect import CSSSelector
>>> html = """
... <input type="radio" name="rd1" value="E" checked="checked" />Employee
... <input type="radio" name="rd2" value="o" />Other
... """
>>> input_sel = CSSSelector('input[name="rd1"]')
>>> lx = lxml.html.fromstring(html)
>>> input_sel(lx)
[<InputElement b7e7665c name='rd1' type='radio'>]
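Building on that, here is a minimal sketch of how the selected elements could be turned into a dict suitable for formdata. It uses lxml's XPath support instead of CSSSelector (so the separate cssselect package is not needed); the helper name checked_radio_values is made up for illustration.

```python
import lxml.html

# The two radio buttons from the question.
HTML = """
<input type="radio" name="rd1" value="E" checked="checked" />Employee
<input type="radio" name="rd2" value="o" />Other
"""


def checked_radio_values(html):
    """Return {name: value} for every radio button that is checked."""
    root = lxml.html.fromstring(html)
    # [@checked] keeps only the inputs that carry a "checked" attribute.
    radios = root.xpath('.//input[@type="radio"][@checked]')
    return {radio.get("name"): radio.get("value") for radio in radios}


formdata = checked_radio_values(HTML)
print(formdata)
```

The resulting dict holds only rd1, since rd2 is not checked, and could be passed as formdata to a form request.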
I've just bumped into a similar problem (that's why I'm here, of course). This wonderful site of the City of Chicago (https://webapps1.chicago.gov/buildingrecords/home) requires your bot to 'Agree' to their 'liability disclaimer' (this is very funny indeed!) by selecting a radio button and clicking a button. I solved the problem with the help of scrapy.FormRequest.from_response:
import scrapy


def agreement_failed(response):
    # check the result of your first POST here
    return  # something truthy if it's a failure or nothing if it's not


class InspectionsListSpider(scrapy.Spider):
    name = 'inspections_list'
    start_urls = ['https://webapps1.chicago.gov/buildingrecords/home']

    def parse(self, response):
        # pick the form by its id and override the radio button's value
        return scrapy.FormRequest.from_response(
            response,
            formid='agreement',
            formdata={"agreement": "Y",
                      "submit": "submit"},
            callback=self.after_agreement,
        )

    def after_agreement(self, response):
        if agreement_failed(response):
            self.logger.error("agreement failed!")
            return
        else:
            ...  # whatever you are going to do after
Together with the source of the page, it's pretty self-explanatory. You may also need the other from_response parameters for your form, described here: https://docs.scrapy.org/en/latest/topics/request-response.html?highlight=FormRequest()#scrapy.http.FormRequest.from_response
P.S. The next page's riddle can be solved in the same way. :)