How does IMDb do its on-the-fly image resizing and cropping?
I've been researching CDNs and image thumbnail generation and I was impressed with how IMDb does its image manipulation. Here's an example of a thumbnail version:
http://ia.media-imdb.com/images/M/MV5BMTc0MzU5ODQ5OF5BMl5BanBnXkFtZTYwODIwODk1._V1._SY98_CR1,0,67,98_.jpg
And here's a tweaked version that plays with size and cropping:
http://ia.media-imdb.com/images/M/MV5BMTc0MzU5ODQ5OF5BMl5BanBnXkFtZTYwODIwODk1._V1._SY400_CR10,40,213,314_.jpg
It seems pretty straightforward: everything from '._V1._' onward is used to determine how to manipulate the image. This is all done impressively fast, so I decided to look for an existing solution that mimics this functionality.
I was able to find plenty of solutions for image resizing, and I did find the Google App Engine page on Transforming Images in Java. However, I don't think Amazon's IMDb is using Google to serve its images, and since all my images are on Amazon S3, I don't think I can use that solution.
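To make the guess concrete, here is a minimal sketch of how such a suffix could be pulled apart. IMDb does not document this format, so the reading of `SY<n>` as "scale to height n" and `CR<a>,<b>,<c>,<d>` as a crop rectangle is purely my assumption, and `parse_transform` is a made-up name:

```python
import re
from urllib.parse import urlparse

# Assumed token grammar: SY<height> or CR<four comma-separated integers>.
_TOKEN = re.compile(r"SY(\d+)|CR(\d+),(\d+),(\d+),(\d+)")

def parse_transform(url):
    """Split an IMDb-style image URL into (image_id, extension, operations)."""
    filename = urlparse(url).path.rsplit("/", 1)[-1]
    image_id, marker, rest = filename.partition("._V1.")
    if not marker:
        raise ValueError("no ._V1. marker in " + filename)
    spec, _, ext = rest.rpartition(".")
    ops = []
    # Tokens are separated by underscores; skip the empty edge pieces.
    for token in filter(None, spec.split("_")):
        m = _TOKEN.fullmatch(token)
        if m is None:
            raise ValueError("unrecognised token: " + token)
        if m.group(1) is not None:
            ops.append(("scale_height", int(m.group(1))))
        else:
            ops.append(("crop", tuple(int(g) for g in m.groups()[1:])))
    return image_id, ext, ops
```

Feeding it the first URL above yields the image ID, the `jpg` extension, and two operations in order: a height scale of 98 and a crop of `(1, 0, 67, 98)`.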
After four hours of online searching, I decided to ask the intelligent crowd here.
Further context: I am building a web application on Amazon's Elastic Beanstalk and I'm thinking of having a separate server (perhaps another Beanstalk) to handle the images... similar to what IMDb is doing.
Thanks in advance for your insight.
I can't speak to IMDb's specific implementation, but I have implemented a similar solution on Amazon EC2 and S3 in the past. Here is an overview of my implementation:
All master (full-sized) images stored on S3, but NOT publicly accessible.
All image src URLs point to an EC2 web server.
Smaller (thumbnail) versions of images also stored on S3 with a naming convention that identifies their size AND aspect ratio:
- "myimage1_s200.jpg" is a square version of "myimage1.jpg" that has been resized as a 200x200 square.
- "myimage1_h100.jpg" has been resized to a maximum height of 100 (with a variable width that conforms to the original aspect ratio).
When the EC2 server receives a request for a specific image size, it checks whether that size already exists and, if so, returns the existing image to the requester.
When the EC2 server receives a request for an image size that DOES NOT exist, it retrieves a copy of the next larger version of the same image, resizes it, returns the new image to the requester, AND ALSO saves a copy to S3 for future use.
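The steps above can be sketched roughly as follows. This is not my production code, just an illustration under a few assumptions: a plain dict stands in for the S3 bucket, the actual resizing is passed in as a `resize` callable (in practice an imaging library would do that part), and the names `variant_key`, `parse_variant` and `get_image` are made up for the example. For simplicity it only reuses a larger variant of the same kind (`s` or `h`) and otherwise falls back to the master.

```python
import re

# Matches keys such as "myimage1_s200.jpg" or "myimage1_h100.jpg".
_VARIANT = re.compile(r"^(?P<base>.+)_(?P<kind>[sh])(?P<size>\d+)\.(?P<ext>\w+)$")

def variant_key(base, kind, size, ext="jpg"):
    """Build a key following the naming convention, e.g. myimage1_h100.jpg."""
    return f"{base}_{kind}{size}.{ext}"

def parse_variant(key):
    """Return (base, kind, size, ext) for a resized-image key."""
    m = _VARIANT.match(key)
    if m is None:
        raise ValueError("not a resized-image key: " + key)
    return m["base"], m["kind"], int(m["size"]), m["ext"]

def _larger_sizes(store, base, kind, size, ext):
    """Sizes of already stored variants of the same image that are larger."""
    found = []
    for k in store:
        m = _VARIANT.match(k)
        if (m and m["base"] == base and m["kind"] == kind
                and m["ext"] == ext and int(m["size"]) > size):
            found.append(int(m["size"]))
    return sorted(found)

def get_image(store, key, resize):
    """Return the image for `key`, creating and saving it if it is missing."""
    if key in store:                      # size already exists: serve it
        return store[key]
    base, kind, size, ext = parse_variant(key)
    larger = _larger_sizes(store, base, kind, size, ext)
    # Prefer the next larger variant; fall back to the full-sized master.
    source = variant_key(base, kind, larger[0], ext) if larger else f"{base}.{ext}"
    data = resize(store[source], kind, size)
    store[key] = data                     # save a copy for future requests
    return data
```

With a store holding `myimage1.jpg` and `myimage1_h400.jpg`, a request for `myimage1_h100.jpg` resizes from the `h400` copy and stores the result, so a second request for the same key never calls `resize` again.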
Performance Notes:
Pointing image src URLs directly at previously resized images on S3 is a lot faster if you know they exist!
Resizing the next larger version of the image instead of always going back to the original is a LOT faster under load!
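As a small illustration of the first note, the page-rendering code could choose the src like this. It is only a sketch: both host names are placeholders, `image_src` is a made-up helper, and `known_keys` is assumed to be whatever record you keep of resized images that already exist on S3.

```python
def image_src(key, known_keys,
              s3_base="https://example-bucket.s3.amazonaws.com",
              resizer_base="https://images.example.com"):
    """Link straight to S3 when the resized image is known to exist."""
    # Otherwise go through the EC2 resizer, which creates the image on demand.
    base = s3_base if key in known_keys else resizer_base
    return f"{base}/{key}"
```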