How to scrape a website that requires login - example.com [closed]
I'm trying to screen-scrape a website using .NET (WebClient, WebRequest/WebResponse, etc.). I've tried many methods, but nothing seems to work.
I always get a "Please login to see this content!" page instead of the full auction info: http://www.example.com/en/auctions/auto-details/107891/
I am sending the login data with the POST method.
Please help.
It's because when you view the page through your browser, the authentication cookie is sent to squiddlydoo.com, so the site knows you're logged in and shows you the content.
The WebClient isn't doing this, so you're not logged in.
You'll have to capture the cookie from the login response (if you're allowed to do this, you will be able to) and send it in the headers when making your request.
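Here is a rough sketch of one way to do that with WebClient, assuming the site uses an ordinary cookie-based session set by a form POST. The class name `CookieAwareWebClient`, the helper `FetchAfterLogin`, the login URL, and the form field names are all made up for illustration; you would need to read the real ones from the site's login form, and some sites also expect hidden form fields or tokens that this sketch does not handle.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;

// Hypothetical helper: a WebClient that keeps cookies between requests,
// roughly the way a browser does.
public class CookieAwareWebClient : WebClient
{
    // Shared cookie jar; cookies set by responses are stored here.
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        // Attach the cookie jar to every HTTP(S) request so stored cookies
        // are sent and new ones are captured.
        if (request is HttpWebRequest http)
        {
            http.CookieContainer = Cookies;
        }
        return request;
    }
}

public static class ScrapeExample
{
    // Posts the login form, then fetches the protected page with the same
    // client so the session cookie from the login is sent along.
    public static string FetchAfterLogin(
        string loginUrl, NameValueCollection loginForm, string pageUrl)
    {
        using (var client = new CookieAwareWebClient())
        {
            client.UploadValues(loginUrl, "POST", loginForm);
            return client.DownloadString(pageUrl);
        }
    }
}

// Example call (the login URL and field names are assumptions, not the site's real ones):
//   var form = new NameValueCollection { { "username", "me" }, { "password", "secret" } };
//   string html = ScrapeExample.FetchAfterLogin(
//       "http://www.example.com/en/login/", form,
//       "http://www.example.com/en/auctions/auto-details/107891/");
```

If the returned HTML still contains "Please login to see this content!", the login POST probably did not match what the browser sends, so compare the two requests in your browser's network tools.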
There are also legitimate reasons for scraping. For instance, we run a third-party web app on our intranet, and I need to make a quick API for some simple tasks. It does require a login. Nothing fishy there. I think the term "scraping" puts a negative spin on what is really just legitimate HTTP interaction between two computers. Hackers code, so coding is hacking? I've worked for large Fortune 500 corporations and seen them running a macro recorder program to batch-access information from an old legacy DOS app. Sometimes you are asked to create fast APIs, and in some instances scraping is the only API possible.