How many possible URLs can you make with the following characters?
I want to make a short U开发者_如何学GoRL service for 2 million assets but I want to use the shortest number of possible characters.
What is the math equation that I would need to use to figure it out? I know it has something to do with factorials, right?
It's not a factorial problem, but an exponential one.
If x
is the number of possible characters, you need to solve the following equation for y
:
x^y = 2000000
If you want to use all numbers and case-sensitive alpha [0-9A-Za-z]
, you have 62 possible values. This means you need to solve:
62^y = 2000000
y*log(62) = log(2000000)
y = log(2000000) / log(62)
y = 3.5154313828...
Of course, you can't have 3.5 characters in your URL, so you would need 4. If you want to change the character set you are using for your URL's, simply resolve the problem above using the number of values in your set.
Note Solving this equation assumes fixed-length URL's. For variable-length URL's, see Rob's answer.
@jheddings is close, and got the right answer, but the math was not quite correct. Don't forget you are not limited to all the permutations of characters of a specific length. You can also leverage URLs of length 1 through y characters. Therefore we want the closed value of this sum:
x + x^2 + x^3 + ... + x^y = 2000000
Fortunately, there is a closed form for that sum:
x + x^2 + x^3 + ... + x^y = x*(x^y - 1)/(x-1) = 2000000
x is the number of possible characters in our range. For simplicity sake, let's assume it only includes lowercase, uppercase, and numbers (26+26+10 = 62.)
Then we get the following equation:
2000000 = (62^(y+1) - 62)/(62-1)
2000000 = (62^(y+1) - 62)/(61)
2000000 * 61 = 62^(y+1) - 62
122000000 = 62^(y+1) - 62
122000000 + 62 = 62^(y+1)
122000062 = 62^(y+1)
log(122000062) = (y+1)
log(122000062) / log(62) = y+1
4.511492 = y+1
3.511492 = y
And, as you said, 3.5 characters is impossible so 4 are required. Admittedly the difference doesn't matter in this case. However, in certain scenarios (especially when dealing with base 2) it is very important.
Number of possible short URLs = (Number of possible different characters in ID) raised to the power of (Length of ID in url)
For instance, if you're only using lowercase characters (of which there are 26) and your URLs look like http://domain.com/XXXXX (for your unique id's of 5 characters), then you can make 26^5 = 11,881,376 short urls.
If you were using upper and lower case letters, you'd have 52, so 52^5 = 380,204,032 possible short URLs, et cetera.
You need to answer a number of questions, like what kinds of characters you want to allow in your set.
All letters and all digits? base 36 (5 characters can fit 2mil+)
Distinguish between upper and lowercase? That gets you to base 62 (4 characters)
Remove easily-mistaken characters and numbers (e.g. i/l 0/o)? roughly base 32 (also 5 characters)
You can often solve this kind of problem without any math wizardry.
26+26+10 = 62 characters
Try 1. 62 = 62
Try 2. 62*62 = 3,844
Try 3. 62*62*62 = 238,328
Try 4. 62*62*62*62 = 14,776,336
So 4 is your answer :)
According to the HTTP/URI Spec you can additionally use the following "unreserved characters": ALPHA / DIGIT / "-" / "." / "_" / "~"
That adds an additional 4 characters to your radix and thus
Math.log(2000000) / Math.log(66) = 3.4629721616408813
Although this still means you will end up with a 4 character URL path at maximum.
精彩评论