
Choosing a thumbnail from an external link

I am trying to build a script that retrieves a list of thumbnail images from an external link, much like Facebook does when you share a link and can choose the thumbnail image that is associated with that post.

My script currently works like this:

  • file_get_contents on the URL
  • preg_match_all to match any <img src="" in the contents
  • Works out the full URL to each image and stores it in an array
  • If there are < 10 images it loops through and uses getimagesize to find width and height
  • If there are 10 or more images it loops through and uses fread and imagecreatefromstring to find width and height (for speed)
  • Once all width and heights are worked out it loops through and only adds the images to a new array that have a minimum width and height (so only larger images are shown, smaller images are less likely to be descriptive of the URL)
  • The new dimensions of each image are worked out (scaled down proportionally) and the images are returned like this:

<img src="'.$image[0].'" width="'.$image[1].'" height="'.$image[2].'"><br><br>

At the moment this works fine, but there are a number of problems I can potentially have:

  1. SPEED! If the URL has many images on the page it will take considerably longer to process
  2. MEMORY! Using getimagesize or fread & imagecreatefromstring will store the whole image in memory. Any large image on the page could eat up the server's memory and kill my script (and server)

One solution I have found is to read the image width and height from the header of the image without downloading the whole file, though I have only found code that does this for JPGs (it would also need to support GIF and PNG).
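
PNG and GIF both keep their dimensions at fixed byte offsets near the start of the file, so the header-only idea can in principle be extended to them. The following is a minimal sketch, assuming the first 24 bytes of the file have already been fetched. `header_dimensions` is a hypothetical name, not an existing PHP function, and the offsets come from the PNG and GIF file layouts rather than from any code in this post:

```php
<?php
/**
 * Hypothetical helper: reads width and height from the first bytes of a
 * PNG or GIF file. Returns array(width, height), or null if the bytes are
 * not a recognised PNG or GIF header.
 */
function header_dimensions($bytes) {
    // PNG: 8-byte signature, then the IHDR chunk (4-byte length, "IHDR"),
    // then width and height as 32-bit big-endian integers at offsets 16 and 20.
    if (strlen($bytes) >= 24
        && substr($bytes, 0, 8) === "\x89PNG\r\n\x1a\n"
        && substr($bytes, 12, 4) === 'IHDR') {
        $size = unpack('Nwidth/Nheight', substr($bytes, 16, 8));
        return array($size['width'], $size['height']);
    }

    // GIF: "GIF87a" or "GIF89a", then width and height as 16-bit
    // little-endian integers at offsets 6 and 8.
    if (strlen($bytes) >= 10
        && (substr($bytes, 0, 6) === 'GIF87a' || substr($bytes, 0, 6) === 'GIF89a')) {
        $size = unpack('vwidth/vheight', substr($bytes, 6, 4));
        return array($size['width'], $size['height']);
    }

    return null;
}
```

The bytes could come from something like stream_get_contents($handle, 24) on a handle opened with fopen($url, 'rb'), so that only the start of each image is read. JPEG is not covered here because its dimensions sit at a variable offset.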

Can anyone suggest a fix for either of the problems above, or a different way of doing this altogether? I am open to ideas. Thanks!

** Edit: Code below:

// Example images array
$images = array('http://blah.com/1.jpg', 'http://blah.com/2.jpg');

// Find the image sizes
$image_sizes = $this->image_sizes($images);

// Find the images that meet the minimum size
$img = array(); // initialise so the output loop still works when nothing qualifies
for ($i = 0; $i < count($image_sizes); $i++) {
    if ($image_sizes[$i][0] >= $min || $image_sizes[$i][1] >= $min) {
        // Scale down the original image size
        $dimensions = $this->resize_dimensions($scale_width, $scale_height, $image_sizes[$i][0], $image_sizes[$i][1]);
        $img[] = array($images[$i], $dimensions['width'], $dimensions['height']);
    }
}

// Output the images
foreach ($img as $image) echo '<img src="'.$image[0].'" width="'.$image[1].'" height="'.$image[2].'"><br><br>';

/**
 * Retrieves the image sizes
 * Uses getimagesize() for fewer than 10 images, otherwise reads each file
 * with fread() and imagecreatefromstring() for speed
 */
public function image_sizes($images) {
    $out = array();
    if (count($images) < 10) {
        foreach ($images as $image) {
            list($width, $height) = @getimagesize($image);
            if (is_numeric($width) && is_numeric($height)) {
                $out[] = array($width, $height);
            }
            else {
                $out[] = array(0, 0);
            }
        }
    }
    else {
        foreach ($images as $image) {
            $handle = @fopen($image, "rb");
            $contents = "";
            if ($handle) {
                while(true) {
                    $data = fread($handle, 8192);
                    if (strlen($data) == 0) break;
                    $contents .= $data;
                }
                fclose($handle);
                $im = @imagecreatefromstring($contents);
                if ($im) {
                    $out[] = array(imagesx($im), imagesy($im));
                    // Free the decoded bitmap as soon as the size is known
                    imagedestroy($im);
                }
                else {
                    $out[] = array(0, 0);
                }
            }
            else {
                $out[] = array(0, 0);
            }
        }
    }
    return $out;
}

/**
 * Calculates restricted dimensions with a maximum of $goal_width by $goal_height
 */
public function resize_dimensions($goal_width, $goal_height, $width, $height) {
    $return = array('width' => $width, 'height' => $height);

    // If the ratio > goal ratio and the width > goal width resize down to goal width
    if ($width/$height > $goal_width/$goal_height && $width > $goal_width) {
        $return['width'] = floor($goal_width);
        $return['height'] = floor($goal_width/$width * $height);
    }

    // Otherwise, if the height > goal, resize down to goal height
    else if ($height > $goal_height) {
        $return['width'] = floor($goal_height/$height * $width);
        $return['height'] = floor($goal_height);
    }   
    return $return;
}


getimagesize reads only the image header, but imagecreatefromstring reads the whole image. An image read by GD, ImageMagick or GraphicsMagick is stored as a bitmap, so it consumes width × height × (3 or 4) bytes of memory, and there's nothing you can do about that. The best solution for your problem is probably to fetch the images in parallel with a curl multi request (see http://ru.php.net/manual/en/function.curl-multi-select.php ) and then process the received images one by one with GD or any other library. To lower memory consumption a bit, you can store the image files on disk rather than in memory.
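
A parallel fetch with the curl multi functions could look roughly like this. It is a sketch under assumptions, not a drop-in replacement: `fetch_all` and its `$timeout` parameter are hypothetical names, and treating only HTTP 200 as success is a simplification.

```php
<?php
/**
 * Hypothetical helper: downloads several URLs in parallel with curl_multi.
 * Returns an array mapping each URL to its body, or to false when the
 * response code was not 200.
 */
function fetch_all(array $urls, $timeout = 10) {
    $multi = curl_multi_init();
    $handles = array();

    // One easy handle per distinct URL, all attached to the multi handle.
    foreach (array_unique($urls) as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($multi, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none are still running.
    $running = 0;
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            // Wait for network activity instead of busy-looping.
            curl_multi_select($multi, 1.0);
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Collect the bodies and detach the handles.
    $bodies = array();
    foreach ($handles as $url => $ch) {
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $bodies[$url] = ($code === 200) ? curl_multi_getcontent($ch) : false;
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $bodies;
}
```

Each returned body can then be passed to getimagesizefromstring (PHP 5.4 and later), which reports the dimensions without decoding the image into a bitmap the way imagecreatefromstring does.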


The only idea that comes to mind for your current approach (which is impressive) is to check the HTML for existing width and height attributes and skip the file read process altogether if they exist.
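
One way to sketch that check is with DOMDocument instead of a regular expression. `declared_image_sizes` is a hypothetical helper name, and the sketch assumes that only plain integer attribute values (not percentages or values with units) are worth trusting. Keep in mind that these attributes give the display size chosen by the page, which may differ from the real size of the file.

```php
<?php
/**
 * Hypothetical helper: splits the <img> tags of a page into those that
 * already declare integer width/height attributes and those that do not.
 * Returns array($known, $unknown).
 */
function declared_image_sizes($html) {
    $known = array();    // src => array(width, height)
    $unknown = array();  // src values that still need a size lookup

    if (trim($html) === '') {
        return array($known, $unknown);
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parse errors silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src === '') {
            continue;
        }
        $width = $img->getAttribute('width');
        $height = $img->getAttribute('height');
        // ctype_digit() rejects empty strings, "50%" and "100px".
        if (ctype_digit($width) && ctype_digit($height)) {
            $known[$src] = array((int) $width, (int) $height);
        } else {
            $unknown[] = $src;
        }
    }

    return array($known, $unknown);
}
```

Only the URLs in the second array would then need a network request to find their size.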
