Software/system to serve large quantities of images?
At our peak hour we need to serve around 250 requests per second. What we're doing is accepting a URL for an image, pulling the image out of memcache, and returning it via Apache.
Our current system is a dual-core machine with 4GB of memory: 2GB for the images in memcache and 2GB for Apache. We're seeing a very high load (20-30) during our peak time. The average response time, as reported by Apache, is 30-80ms per request, which seems slow for a simple Apache request served from memory.
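As a rough sanity check on those numbers, Little's law (average requests in flight = arrival rate × average response time) gives an estimate of how many requests Apache is handling at once. The sketch below uses only the figures quoted above; the function name is made up for illustration, and the result is an average, not the same thing as the load average the OS reports.

```python
def in_flight(requests_per_second: float, response_time_ms: float) -> float:
    """Little's law: mean concurrency = arrival rate x mean time in system."""
    return requests_per_second * response_time_ms / 1000.0


PEAK_RPS = 250  # peak request rate quoted in the question

low = in_flight(PEAK_RPS, 30)   # at the 30 ms end of the reported range
high = in_flight(PEAK_RPS, 80)  # at the 80 ms end of the reported range

print(f"{low} to {high} requests in flight on average")
```

On two cores, that many simultaneous requests means most of them are waiting for CPU or I/O at any given moment, so any per-request overhead adds up quickly.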
Are there better tools for this? Serving from disk is not an option since the IO wait was holding it back, so we moved it to memory. How do CDNs do it?
EDIT: The system works like this. A request comes in and we check a "queue" to see if we've seen this request before; if we have, we serve the image (from disk or memory). If not, we increment the counter for that request in a memcached queue, and worker machines generate the image and then store it back on the main server. So currently, when a request comes in, we check memcached to see whether the image exists and then connect to another database for the actual image. When the images were on disk, we found that the file_exist call alone took 30+ ms to complete, so we moved them to memory. If we moved the images to a RAM disk, would that speed up file_exist, or would we still want a first check to see whether we should even look for the image?
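One way to picture the flow described above, and to avoid a separate existence check, is to let a single cache lookup double as the check: a hit returns the image bytes, a miss bumps the request counter for the workers. This is only a sketch under assumptions. `InMemoryCache` is a hypothetical dict-backed stand-in for a memcached client, and the `img:` / `want:` key prefixes and function names are invented for illustration, not taken from the actual system.

```python
from typing import Dict, Optional


class InMemoryCache:
    """Dict-backed stand-in for a memcached client (hypothetical)."""

    def __init__(self) -> None:
        self._data: Dict[str, object] = {}

    def get(self, key: str) -> Optional[object]:
        return self._data.get(key)

    def set(self, key: str, value: object) -> None:
        self._data[key] = value

    def incr(self, key: str, delta: int = 1) -> int:
        # Simplification: a real memcached incr only works on existing keys,
        # so a real client would need to create the counter first.
        self._data[key] = int(self._data.get(key, 0)) + delta
        return self._data[key]


def handle_request(cache: InMemoryCache, url: str) -> Optional[bytes]:
    """Return the image bytes on a hit; on a miss, record demand and return None."""
    image = cache.get("img:" + url)  # one lookup: existence check and fetch
    if image is not None:
        return image
    cache.incr("want:" + url)        # workers read this counter to decide what to build
    return None


def store_image(cache: InMemoryCache, url: str, data: bytes) -> None:
    """What a worker would call once it has generated the image."""
    cache.set("img:" + url, data)


cache = InMemoryCache()
first = handle_request(cache, "/thumb/1.png")   # miss, counter becomes 1
second = handle_request(cache, "/thumb/1.png")  # miss, counter becomes 2
store_image(cache, "/thumb/1.png", b"PNGDATA")
third = handle_request(cache, "/thumb/1.png")   # hit
```

With this shape there is one round trip per request on the hot path instead of a check followed by a fetch.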
Have you looked at nginx?
According to Netcraft, in May 2009 nginx served or proxied 3.25% of the busiest sites. It can serve from memcached too.
Depending on the size of your images, Apache should handle this with no problem at all. We have an Apache server handling 2000 requests/second with an average response size of 12K. The machine has 32GB of memory, so all our content is cached.
Here are some tuning tips:
- Use a threaded MPM like worker, with lots of threads open (we have 256).
- Use mod_cache so all the images will be in memory.
- Allocate as much memory as possible to the Apache process
When you say memcache, do you mean the memcached server? Going through memcached will be slower, because the latency of a TCP connection (even over loopback) is much larger than that of direct memory access.
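To get a feel for that gap, here is a rough micro-benchmark sketch. It does not talk to memcached at all: it compares a round trip to a tiny echo server over a loopback TCP socket with a plain in-process dictionary lookup. The function names and iteration counts are arbitrary choices for illustration. The absolute numbers include Python overhead and will vary by machine, so treat them as an order-of-magnitude indication only.

```python
import socket
import threading
import time


def _echo_once(listener: socket.socket) -> None:
    """Accept one connection and echo back whatever arrives until it closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def loopback_round_trip_seconds(n: int) -> float:
    """Average time for one request/response over a loopback TCP connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    server = threading.Thread(target=_echo_once, args=(listener,), daemon=True)
    server.start()

    client = socket.create_connection(listener.getsockname())
    client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    start = time.perf_counter()
    for _ in range(n):
        client.sendall(b"k")  # stand-in for a tiny cache request
        client.recv(64)       # wait for the reply
    elapsed = time.perf_counter() - start

    client.close()
    server.join()
    listener.close()
    return elapsed / n


def dict_lookup_seconds(n: int) -> float:
    """Average time for one in-process dictionary lookup."""
    store = {"k": b"x" * 1024}
    start = time.perf_counter()
    for _ in range(n):
        store["k"]
    return (time.perf_counter() - start) / n


tcp = loopback_round_trip_seconds(2000)
mem = dict_lookup_seconds(200000)
print(f"loopback TCP round trip: {tcp * 1e6:.1f} us, dict lookup: {mem * 1e6:.3f} us")
```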
If you can fit all your images in memory, a RAM disk will also help a lot.