How to scrape a website that requires login - example.com [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center. Closed 10 years ago.

I'm trying to screen-scrape a website using .NET (WebClient, WebRequest, WebResponse, etc.). I've tried many methods, but nothing seems to work.

I always get the "Please login to see this content!" page instead of the full auction info: http://www.example.com/en/auctions/auto-details/107891/

I am sending the login data with the POST method.

Please help.


It's because when you view the page in your browser, the authentication cookie is sent to squiddlydoo.com, so the site knows you're logged in (or whatever) and shows you the content.

The WebClient isn't doing this, so you're not logged in.

You'll have to capture the cookie somehow (if you're allowed to do this, you will be able to) and send it in the headers when making your request.
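
Roughly, the capture-and-resend flow looks like the sketch below. It's in Python (standard library only) purely to illustrate the idea; in .NET the analogous piece is a `CookieContainer` attached to the request. The helper names (`make_session`, `login`, `fetch`), the login URL, and the form field names in the usage comment are all assumptions, since the real ones depend on the site's login form.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener that remembers cookies between requests, plus its jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login(opener, login_url, fields):
    """POST the login form; any Set-Cookie in the response is stored in the jar."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    with opener.open(login_url, data=body) as resp:
        return resp.status


def fetch(opener, url):
    """GET a page; the opener attaches the stored cookies automatically."""
    with opener.open(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


# Usage (hypothetical login URL and field names; check the real login form):
#   opener, jar = make_session()
#   login(opener, "http://www.example.com/en/login/",
#         {"username": "me", "password": "secret"})
#   html = fetch(opener, "http://www.example.com/en/auctions/auto-details/107891/")
```

The important bit is that the same opener (and therefore the same cookie jar) is used for both the login POST and the later GET. If the login request and the page request go through separate clients, the cookie never gets sent and you're back to the "Please login" page.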


There are also legitimate reasons for scraping. For instance, we run a third-party web app on our intranet, and I need to build a quick API for some simple tasks. It does require a login. Nothing fishy there. I think the term "scraping" puts a negative spin on what is really just legitimate HTTP interaction between two computers. Hackers code, so is coding hacking? I've worked for large Fortune 500 corporations and seen them run a macro recorder program to batch-access information from an old legacy DOS app. Sometimes you are asked to create a fast API, and in some instances scraping is the only API possible.
