Downloading multiple images using PHP cURL [duplicate]
I want to download images from a web page, for example www.yahoo.com, and store them in a folder using PHP.
I am getting the page source using file_get_contents() and extracting the src attribute of each img tag, then passing each src to the cURL code. The code does not give any error, but the images are not downloaded. Please check the code below; I can't see where I am going wrong.
<?php
$html = file_get_contents('www.yahoo.com');
$ptn = '/< *img[^>]*src *= *["\']?([^"\']*)/i';
preg_match_all($ptn, $html, $matches, PREG_PATTERN_ORDER);
$seq = 1;
foreach($matches as $img)
{
    $fp = fopen("root/Images/image_$seq.jpg", 'wb');
    $ch = curl_init ($img);
    curl_setopt($ch,CURLOPT_FILE, $fp);
    curl_setopt($ch,CURLOPT_URL, $img);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
    $image = curl_exec($ch);
    curl_close($ch);
    fwrite($fp, $image);
    fclose($fp);
    $seq++;
}
echo "IMAGES DOWNLOADED";
?>
foreach($matches as $img)
should be changed to
foreach($matches[1] as $img)
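With PREG_PATTERN_ORDER, $matches[0] holds the full matched text and $matches[1] holds the captured src values, so looping over $matches hands cURL an array instead of a URL.
BTW: you should replace file_get_contents with cURL; it's about 3x as fast ;)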
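To see why $matches[1] is the array to loop over, here is a small sketch that runs the question's pattern against a made-up HTML snippet. The markup and the example.com URL are assumptions for illustration, not Yahoo's real page.

```php
<?php
// Sample markup standing in for the downloaded page (invented for this example).
$html = '<p><img src="http://example.com/a.jpg"> <IMG alt="b" src=\'/img/b.png\'></p>';

// The pattern from the question, unchanged.
$ptn = '/< *img[^>]*src *= *["\']?([^"\']*)/i';
preg_match_all($ptn, $html, $matches, PREG_PATTERN_ORDER);

// $matches[0] is the list of full matches (the tag text up to the end of src),
// $matches[1] is the list of values captured by the first (...) group.
foreach ($matches[1] as $img) {
    echo $img, PHP_EOL; // each $img is a string, which is what curl_init() expects
}
```

Note that the second src here is a relative path, so it would still need the site's domain prepended before cURL could fetch it.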
- Is $img the full URL of the image?
- Is the image protected (use a referer)?
$image = false;
$ch = curl_init();
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 7);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip');
curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
$image = curl_exec($ch);
Try debugging first.
First try it with a single image from Yahoo, for example:
http://www.depers.nl/beeld/w100/2011/201105/20110510/anp/sport/img-100511-349.onlinebild.jpg
Also, why use both file_get_contents and cURL? Use cURL for both.
- Make a function for cURL:
function simple_curl($url, $binary = false) { /* set your cURL options, then return the result of curl_exec() */ }
- Get yahoo.com:
$result = simple_curl($url);
- Get the links with the pattern (check whether the matches contain the full URL: domain + directory + file).
- Loop over the pattern matches (don't forget: it's a multidimensional array, so loop over $matches[1]).
- Fetch each binary file with cURL and save it:
$image = simple_curl($match, true);
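The steps above might be put together roughly like this. It is only a sketch: to_absolute_url() and download_images() are hypothetical helpers that are not part of the answer, the behaviour of $binary is a guess (the answer does not say what it should do), and every image is saved with a .jpg name as in the question, regardless of its real type.

```php
<?php
// Hypothetical helper: turn a src value into a full URL relative to $base.
// Covers absolute, protocol-relative, root-relative and plain relative paths;
// it ignores ports and does not resolve "../" segments.
function to_absolute_url($src, $base)
{
    if (preg_match('#^https?://#i', $src)) {
        return $src;                              // already a full URL
    }
    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host'];
    if (strpos($src, '//') === 0) {
        return $parts['scheme'] . ':' . $src;     // protocol-relative
    }
    if (strpos($src, '/') === 0) {
        return $origin . $src;                    // root-relative
    }
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;                 // relative to the page's directory
}

// Fetch $url and return the body as a string, or false on failure.
// Assumption: $binary only skips asking the server for a compressed response.
function simple_curl($url, $binary = false)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 7);
    curl_setopt($ch, CURLOPT_HEADER, false);
    if (!$binary) {
        curl_setopt($ch, CURLOPT_ENCODING, '');   // accept any encoding cURL supports
    }
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}

// Hypothetical driver: download every <img> on $pageUrl into $dir.
// Returns the number of files written.
function download_images($pageUrl, $dir)
{
    $html = simple_curl($pageUrl);
    if ($html === false) {
        return 0;
    }
    preg_match_all('/< *img[^>]*src *= *["\']?([^"\']*)/i', $html, $matches);
    $saved = 0;
    foreach ($matches[1] as $src) {
        if ($src === '') {
            continue;                             // nothing to fetch
        }
        $image = simple_curl(to_absolute_url($src, $pageUrl), true);
        if ($image !== false && $image !== '') {
            $saved++;
            file_put_contents("$dir/image_$saved.jpg", $image);
        }
    }
    return $saved;
}
```

A call such as download_images('http://www.yahoo.com/', 'root/Images') would then do the whole job, provided the target folder already exists and is writable.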
- www.yahoo.com is not a URL; http://www.yahoo.com/ is.
- $img is an array; you need to iterate over $matches[1].
- You tell cURL both to write to a file and to return the result. Use one.
I don't know how you don't see errors. I would look into that. Copying and pasting and then running it gave me plenty of errors.
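As a rough sketch of the "use one" advice, the following lets cURL write straight to the file through CURLOPT_FILE and never sets CURLOPT_RETURNTRANSFER, with error reporting switched on so that failures become visible. download_to_file() is a hypothetical name, and CURLOPT_FOLLOWLOCATION and CURLOPT_FAILONERROR are extra choices that are not taken from the question.

```php
<?php
// Show every notice and warning while debugging.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Hypothetical helper: stream $url into $path using CURLOPT_FILE only.
// Returns true on success, false on any failure.
function download_to_file($url, $path)
{
    if (!is_dir(dirname($path))) {
        return false;                             // target folder is missing
    }
    $fp = fopen($path, 'wb');
    if ($fp === false) {
        return false;                             // file could not be opened
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);          // cURL writes the body to $fp
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);  // treat HTTP codes >= 400 as failure
    $ok = curl_exec($ch);                         // true or false here, not the body
    if ($ok === false) {
        echo 'cURL error: ', curl_error($ch), PHP_EOL;
    }
    curl_close($ch);
    fclose($fp);
    if ($ok === false) {
        unlink($path);                            // drop the empty or partial file
    }
    return $ok;
}
```

Because cURL does the writing itself, there is no separate fwrite() call, and the return value of curl_exec() only tells you whether the transfer succeeded.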