How to generate a unique text id for a online resource
I want to generate a unique id which will be used in URLs to identify a specific resource. In principle it is the same as pastebin.com etc. does.
The id and resource is not very secret but I want it to be so you just can't decrement a id and then get another users resource. I´m thinking of a CHAR(8)
which will look nice in a URL and still be large enough to reduce the chance of guesses. But how do I generate this? For an INT, you can use auto_increment
and primary key
to insure the uniqueness.
But if I do the following in order
- Generate a
CHAR(8)
in my application - Insure that this ID doesn'开发者_开发知识库t exists.
- If it not exists, store, else goto 1.
I have to wrap 2. and 3. in a atomic transaction.
But is there a better way? or shouldn't I care about the the check (2.) because a clash doesn't occur regularly. I use MySql and .Net (C#) if that helps. Is it possible to somehow 'encrypt' a auto-incremented int as the text-id and decrypt it again in precisely 8 (or 10) characters.
I have read Need a smaller alternative to GUID for DB ID but still unique and random for URL which was useful, but the use of GUID is not supported in MySql (as far as I know). But a comment on the quality of the LongToShortCode
method in the thread would also be appreciated.
Note: the resources can't be changed, only viewed.
Best regards, Lasse
MySql implements the UUID. Which appears to be a GUID with a different name. So that option is still available for you.
If you are still going to use char(8)
, then you do need to worry about uniqueness of your ID, simply because if you're looking at served URLs, you may not know a violation has occurred until people start reporting problems.
You could use an int identity and then encrypt/decrypt it before using it, probably not the best idea under heavy load though.
The easyish way of doing that pseudo-atomically is to
- generate the random string
- store the string (in effect reserving it)
- check if another one exists
- if another one exists, remove the one you just made, and return to step 1
Collisions can still happen, but when they do it causes both threads to try again, which in this case isn't a problem.
EDIT: I would suggest taking the first few characters of a cryptographic hash or something for your generation function, but it doesn't really matter.
I think I'll do it like this: A 8 character text id can store a number up to 64^8 = 2^48.
I will then use two columns:
- ID, INT 2^32 auto-increment
- Rand, INT 2^16
Then, when I add a row, I will generate a random 2^16 integer and put it in a new row. The text id is then simply generated from from the two numbers combined. And retrieval is easy two - just splitting it up and a simple lookup in the database. Ridiculous simple solution which should eliminate row clashes and be random enough (2^16) to reduce guesses.
Feedback on this approach will be appreciated.
精彩评论