
Faster way to download pages using curl?

Hi,

I download a large number of files for data mining. I used to use PHP for this, but I am finding it too slow. Also, I only need a small part of each web page. I want to achieve two things:

  1. curl should be able to use all of my download bandwidth.
  2. Is there any way to download only the part of the web page where my data resides?

I am not confined to PHP. If curl works better in the terminal, I would use that.


Yes, you can download only part of a page by setting the CURLOPT_RANGE option, provided the server honors range requests. You can also supply a write callback function that simply returns an error once you've received "enough" data and want to stop and move on.
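Since the question says PHP is optional, here is a rough sketch of the same idea in Python's standard library instead of libcurl. It sends the `Range` header that CURLOPT_RANGE would send for an HTTP request, and it stops reading after `max_bytes` in case the server ignores the header, which plays the role of the aborting write callback. The function name `fetch_prefix` and its parameters are made up for illustration.

```python
import urllib.request


def fetch_prefix(url, max_bytes, chunk_size=4096, timeout=10):
    """Return at most max_bytes from the start of the response body."""
    # Ask for only the first max_bytes bytes. A server without range
    # support ignores this header and starts sending the whole body.
    request = urllib.request.Request(
        url, headers={"Range": f"bytes=0-{max_bytes - 1}"}
    )
    received = bytearray()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Read in chunks and stop once we have enough. Leaving the
        # with-block closes the connection without reading the rest.
        while len(received) < max_bytes:
            chunk = response.read(min(chunk_size, max_bytes - len(received)))
            if not chunk:
                break  # the body was shorter than max_bytes
            received.extend(chunk)
    return bytes(received)
```

Calling `fetch_prefix(url, 2048)` would return no more than the first 2 KB of the page. This only helps if the data sits at a predictable byte offset, which is often not the case for HTML.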


Are you downloading HTML? Your comment leads me to believe that you are. If that's the case, simply load the HTML with PHP Simple HTML DOM Parser and extract only the part that you want. That said, I find it hard to believe that grabbing just the HTML is what's slowing you down. Are you downloading any files or media as well?

Link: http://simplehtmldom.sourceforge.net/
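If you do move away from PHP, the same "fetch everything, keep one element" approach can be sketched with Python's built-in `html.parser` rather than Simple HTML DOM. The class name, the `extract_text` helper, and the idea of selecting by `id` are all assumptions for illustration, and the tag counting assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElementTextExtractor(HTMLParser):
    """Collect the text inside the first element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # set once the target element has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS or self.done:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_text(html, target_id):
    """Return the stripped text content of the element with id=target_id."""
    parser = ElementTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

For a page containing `<div id="data">...</div>`, `extract_text(page, "data")` returns only the text inside that element. Parsing like this is usually cheap compared with the network transfer.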


There is no way to download only part of a page. When you request a URL, the server response is what it is.

Use more of your bandwidth by having cURL make multiple connections at once, for example through PHP's curl_multi_* functions.
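As a rough illustration of downloading over several connections at once, here is a sketch that uses a thread pool from Python's standard library rather than cURL's multi interface. The names `fetch` and `fetch_all` and the default of 8 workers are arbitrary choices for the example, not recommendations.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=10):
    """Download one URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, max_workers=8):
    """Download several URLs concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs fetch in worker threads. If any download fails,
        # its exception is re-raised here when the results are collected.
        return list(pool.map(fetch, urls))
```

With a list of page URLs, `fetch_all(urls)` returns the bodies in the same order. Raising `max_workers` uses more bandwidth up to a point, but be careful not to hammer a single server with too many simultaneous requests.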
