The following code: class SiteSpider(BaseSpider): name = "some_site.com" allowed_domains = ["some_site.com"]
This question already has answers here: What are the pros and cons of the leading Java HTML parsers? [closed]
I'd like to create a mobile site for a website where my high school checks our grades. Effectively, I want to log in to the portal remotely, and then go to the subpage where the students' grades are
I am using HtmlUnit to try to open a site, but I keep getting 404 errors. The site works in my Python scripts and in my browser, but not in HtmlUnit for some reason. I think my URL itself is fine but it
I am trying to log in to www.diary.com using an HttpWebRequest object. However, it always fails to log in and keeps giving me back the login page. Can anyone enlighten me on what is wrong?
What's a good way to scrape website content using Node.js? I'd like to build something very, very fast that can execute searches in the style of kayak.com, where one query is dispatched
I am navigating a site using Python's mechanize module and having trouble clicking on a JavaScript link for the next page. I did a bit of reading and people suggested I need python-spidermonkey and DOMfor
I want to scrape my associate data but I can't log into: https://affiliate-program.amazon.com/gp/associates/login/login.html
I'm trying to run a simple screen-scraping app in Node.js. The code is posted here: https://github.com/anismiles/jsdom-based-screen-scraper
I have a website where I need to log in with a username, password, and captcha. Once in, I have a control panel that has bookings. For each booking there is a link for a details page that