How to consistently serve assets from the same CDN host?
So, in order to speed up load times, we're setting up a bunch of CDN hostnames to serve images and assets from. What's the best way to consistently use the same host for the same asset? E.g. button.gif always gets served from http://assets-15.ourserver.com.
I was thinking of coming up with some rule, where the md5 hash of the filename somehow maps to a server (can't use the filename itself, since a lot are similar: "button-home.gif", "button-about.gif", etc.). I'm not sure if this is the most efficient way, but it seems like it would work.
Anyone have any experience with this sort of thing? I need a language-agnostic solution, because this will be used by several different languages.
EDIT: Yahoo's explanation on how this speeds things up: http://developer.yahoo.com/performance/rules.html#split
When I did something like this, all the relevant resources had id numbers anyway, so I just used that as the basis. Still, it's not too hard to extend to non-numbers.
There's a balance in how many hostnames you use: with too many, the host-lookup overhead outweighs the advantage of multiple hostnames. So at the outside you'll likely have about 12, probably fewer.
This in itself means that a simple hash will likely split across the given range easily enough without any need to be particularly clever.
Encoding shouldn't cause any confusion either. Either your application deals with IRIs fully (in which case UTF-8 handling is already an issue you've dealt with), or it doesn't, in which case every character in the URI-escaped form of the path (that is to say, the name used in the actual URI) is going to be in the ASCII range.
There's no need to be cryptographically secure or anything like that, as it isn't a security risk to guess the server used. It won't be the end of the world if one or two pages lean slightly to one server over another (randomness would have that happen with a perfect hash anyway).
Hence, just run through the characters in the absolute path of the URI for the image (everything after the host, from the first / onwards), add their integer values together, and then use that sum modulo the number of hostnames to pick the numbered part of the hostname.
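As a rough sketch of that idea in Python (the same few lines port easily to other languages): the function name `pick_host`, the default of 4 hosts, and numbering hosts from 1 are all assumptions for illustration, and the hostname pattern is borrowed from the question's example.

```python
def pick_host(path: str, num_hosts: int = 4) -> str:
    """Map the absolute path of an asset URI to one of num_hosts hostnames.

    Assumes `path` is the URI-escaped path, e.g. "/img/button-home.gif".
    """
    # Add up the integer values of the characters in the path.
    # Encoding to bytes gives one integer per ASCII character.
    total = sum(path.encode("utf-8"))
    # The remainder picks the host; starting the numbering at 1 is an assumption.
    index = total % num_hosts + 1
    return f"http://assets-{index}.ourserver.com"


if __name__ == "__main__":
    for p in ("/img/button-home.gif", "/img/button-about.gif"):
        print(p, "->", pick_host(p))
```

Because the result depends only on the path and the number of hosts, every language that implements the same sum-and-modulo rule will choose the same hostname for the same asset.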
If you want to limit the number of characters processed for speed, then work from the end backwards, as that part of the path will have the greatest variation.
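A minimal sketch of that variation, assuming a hypothetical cutoff of 16 characters (the right number is something to tune, not a recommendation):

```python
def tail_sum(path: str, max_chars: int = 16) -> int:
    """Sum the character values of only the last max_chars characters of the path."""
    # Slicing from the end keeps the file name, which usually varies the most,
    # and skips long directory prefixes that many assets share.
    tail = path[-max_chars:]
    return sum(tail.encode("utf-8"))


def host_index(path: str, num_hosts: int, max_chars: int = 16) -> int:
    """Zero-based host index computed from the tail of the path only."""
    return tail_sum(path, max_chars) % num_hosts
```

Paths shorter than the cutoff are simply summed in full, so short and long paths go through the same code.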
That "button-home.gif" is similar to "button-about.gif" isn't an issue, as they aren't really very similar at all as seen through the eyes of a process like this.
If you ever increase the number of hostnames used, try to do it as a multiple of the previous number, as this results in the largest possible number of resources keeping their old URIs.
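To see why a multiple helps, here is a small check that counts how many character sums keep the same host index when the host count changes. It assumes the sums are spread roughly evenly over a range, which real file names only approximate, and the function name is made up for this example.

```python
def fraction_unchanged(old_hosts: int, new_hosts: int, sample: int = 10_000) -> float:
    """Share of sums 0..sample-1 that map to the same index under both host counts."""
    same = sum(1 for s in range(sample) if s % old_hosts == s % new_hosts)
    return same / sample


if __name__ == "__main__":
    for new in (5, 6, 8):
        print(f"4 -> {new} hosts: {fraction_unchanged(4, new):.2%} keep their host")
```

Under that assumption, doubling from 4 to 8 hosts keeps about half of the mappings, while going from 4 to 6 keeps about a third and from 4 to 5 about a fifth.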
The point of a Content Delivery Network (CDN) is to load the asset from the server closest to the user.
Loading a given asset from an explicit server defeats the purpose of the CDN. My guess is that it's not supported. If you need to load an asset from an explicit location, don't put it on the CDN, put it on a central server.