开发者

Designing a Good Hash Function for Set Length URL Shortening in PHP

I'm working on a URL shortener. The input is a URL and the output needs to be a 4 character string (alphanumeric, case-sensitive).

I calculated it out that if I use 4 characters with a case-sensitive alphanumeric key space I should potentially be able to store 64^4 (16777216) URLs until I run out of space.

I also don't want my URL shortener to generate any short URLs tha开发者_Python百科t are offensive four letter words. It would be unfortunate if someone made a short URL that was domain.com/f**k. You get the picture...

Any ideas on the best way to go about this? I feel like I'll be using base64_encode somewhere in the process.


If I were you, I would make a case sensitive alphanumeric increment-er. Just increment, and assign the number to a database row. To check for bad words, just check against a black list. If it passes, great. If not, just increment again.

This way, instead of a hash algorithm, they're just in order. The first few would look like this:

id   | url
-------------------------
0000 | http://google.com
0001 | http://yahoo.com
0002 | http://example.com
...
000a | http://mail.google.com
000b | http://adobe.com
...
000A | http://microsof.com
...
0010 | http://w3.org
...
00a0 | http://youtube.com
...
00A0 | http://stackoverflow.com

And so on.

Here's a hint on how the function will work: http://us3.php.net/manual/en/function.ord.php

BTW, my math might be wrong, but I think it's (10 + 26 + 26) ^ 4 = 14776336

Edit: Just for the fun and the challenge, I wrote an incrementer function. When the max is reached, it returns false, so just compare it to false (with ===) when using it.

http://pastebin.com/957KPn4p


It vaguely reminded me of thisHow do I create unique IDs, like YouTube?. You just have to be sure to check (in a more limited space) the possibility of a collision.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜