
Logging into a website through a proxy

I need to develop a system that logs into a website at regular intervals through a remote server (I believe "proxy" is the term) and collects data from that website.

What would be the basic requirements for a system like that, in terms of servers and software? Would I need more than a typical shared hosting plan?

I'm looking for a software solution that is based on PHP.

Edit: The collected data will be used for statistical purposes only - nothing illegal.


You can use PHP's cURL functions to request a page, and cURL lets you route the request through a proxy like so:

$ch = curl_init("http://www.example.com");          // placeholder target URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);        // return the response instead of printing it
curl_setopt($ch, CURLOPT_PROXY, "http://proxyaddress");
curl_setopt($ch, CURLOPT_PROXYPORT, 8080);
curl_setopt($ch, CURLOPT_PROXYUSERPWD, "xxx:xxx");  // proxy credentials
$page = curl_exec($ch);
curl_close($ch);
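
The snippet above only fetches a page. To actually log in, you would normally POST the login form's fields and keep the session cookie for the follow-up request. Here is a minimal sketch of that flow, assuming the site uses a plain HTML form login; the URLs, form field names, and cookie-jar path are placeholders:

// Minimal sketch of a form login through the proxy, keeping the session cookie.
$ch = curl_init("https://example.com/login");              // placeholder login URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_PROXY, "http://proxyaddress");
curl_setopt($ch, CURLOPT_PROXYPORT, 8080);
curl_setopt($ch, CURLOPT_PROXYUSERPWD, "xxx:xxx");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
    "username" => "myuser",                                // placeholder form field names
    "password" => "mypass",
)));
curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookies.txt");   // save the session cookie
curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookies.txt");  // send it back on later requests
curl_exec($ch);                                            // perform the login

// Reuse the same handle (and session) to fetch the page with the data.
curl_setopt($ch, CURLOPT_URL, "https://example.com/data"); // placeholder data URL
curl_setopt($ch, CURLOPT_HTTPGET, 1);                      // switch back to GET
$data = curl_exec($ch);
curl_close($ch);

For the "regular intervals" part, a cron job (which most shared hosts offer) can invoke the script, e.g. 0 * * * * php /path/to/scraper.php to run it hourly, so a typical shared hosting plan with cURL enabled is usually enough.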

I'd guess the reason for the downvotes is that it looks like you are trying to steal a design, but I'll assume you have a completely legitimate reason for doing what you want to do!


What you are trying to build is a web crawler, which is the same kind of program search engines use to index web pages. These crawling scripts are often called spiders or robots, and one can be written easily in Perl; see http://www.linuxjournal.com/article/2200 for a simple tutorial.
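
If you would rather stay with PHP, as the question asks, the crawl-and-extract step might look something like this rough sketch; the URL and the XPath expression are placeholders for whatever data you actually need:

// Rough sketch: fetch a page and extract values with DOMDocument/DOMXPath.
$html = file_get_contents("https://example.com/stats");  // placeholder URL
$doc = new DOMDocument();
@$doc->loadHTML($html);             // @ silences warnings from messy real-world HTML
$xpath = new DOMXPath($doc);
foreach ($xpath->query("//table//td") as $cell) {        // placeholder XPath query
    echo trim($cell->textContent), "\n";
}

For a login-protected site, you would fetch the HTML with the cURL session from the first answer instead of file_get_contents.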
