
Ruby on Rails - generating bit.ly style identifiers

I'm trying to generate UUIDs with the same style as bit.ly urls like:

http://bit [dot] ly/aUekJP

or cloudapp ones:

http://cl [dot] ly/1hVU

which are even shorter.

How can I do this? I'm currently using the UUID gem for Ruby, but I'm not sure whether it's possible to limit the length and get something like the above. This is what I have at the moment:

UUID.generate.split("-")[0]  # => "b9386070"

But I would like something even shorter, while still knowing that it will be unique.

Any help would be much appreciated :)


Edit note: replaced the dots with [dot] to work around the ban on short links.


You are confusing two different things here. A UUID is a universally unique identifier. It has a very high probability of being unique even if millions of them are created all over the world at the same time. It is generally displayed as a 36-character string (32 hexadecimal digits and 4 hyphens). You cannot keep only the first 8 characters and expect the result to be unique.
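To put a rough number on that, here is a small sketch using the birthday approximation. It assumes, purely for illustration, that the 8 hex characters behave like 32 uniformly random bits; the prefix of a time-based UUID is not random, so treat this as an idealised estimate. The name collision_probability is made up for this example.

```ruby
# Approximate probability that at least two of n identifiers collide when each
# is drawn uniformly at random from 2**bits possible values.
# Uses the birthday approximation: p ~ 1 - exp(-n(n-1) / (2 * space)).
def collision_probability(n, bits)
  space = 2.0**bits
  1.0 - Math.exp(-n * (n - 1) / (2.0 * space))
end

# Print the estimate for a few population sizes with a 32-bit identifier.
[1_000, 10_000, 77_163, 1_000_000].each do |n|
  puts format("%9d ids -> %.4f", n, collision_probability(n, 32))
end
```

Under that assumption the chance of at least one collision reaches roughly 50% at about 77,000 identifiers, which is far fewer than the 4.3 billion values that 32 bits can hold.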

Bitly, TinyURL et al. store links and generate a short code to represent each link. They do not reconstruct the URL from the code; they look the code up in a data store and return the corresponding URL. These codes are not UUIDs.

Without knowing your application it is hard to advise on which method you should use. However, you could store whatever you are pointing at in a data store with a numeric key and then convert the key to base 32, using the 10 digits and 22 lowercase letters, perhaps leaving out the letters that invite typos, such as 'o', 'i' and 'l'.

EDIT

On further investigation, there is a Ruby base32 gem available that implements Douglas Crockford's Base32 encoding.

A 5-character Base32 string can represent over 33 million integers (32^5 = 33,554,432) and a 6-character string over a billion (32^6 = 1,073,741,824).
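For illustration, here is a hand-rolled sketch of that idea rather than the gem's own API (which I have not reproduced here). It uses Crockford's alphabet in lowercase, which omits i, l, o and u. The names ALPHABET, encode_id and decode_code are made up for this example, and the numeric key is assumed to be a non-negative integer such as a database id.

```ruby
# Crockford's Base32 alphabet, lowercased: the ten digits plus 22 letters,
# leaving out i, l, o and u to avoid look-alike characters.
ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz".freeze

# Converts a non-negative integer key into a short Base32 code.
def encode_id(id)
  raise ArgumentError, "id must be a non-negative integer" unless id.is_a?(Integer) && id >= 0
  return ALPHABET[0] if id.zero?

  code = +""
  while id > 0
    id, remainder = id.divmod(32)
    code.prepend(ALPHABET[remainder])
  end
  code
end

# Converts a Base32 code back into the integer key.
def decode_code(code)
  code.downcase.each_char.reduce(0) do |acc, char|
    index = ALPHABET.index(char)
    raise ArgumentError, "invalid character: #{char}" if index.nil?
    acc * 32 + index
  end
end

puts encode_id(1234)      # prints 16j
puts decode_code("16j")   # prints 1234
```

Because the code is just the key written in another base, uniqueness comes from the data store's key and not from the encoding.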


If you are working with numbers, you can use the built-in Ruby methods:

6175601989.to_s(30)
=> "8e45ttj"

To go back:

"8e45ttj".to_i(30)
=> 6175601989

So you don't have to store anything; you can always decode an incoming short_code.

This works OK as a proof of concept, but you cannot avoid ambiguous characters such as 1, l, j, i, 0 and o. If you only want to use the code to obscure database record IDs, this will work fine. In general, though, short codes are supposed to be easy to remember and to carry from one medium to another, like reading one off someone's presentation slide or hearing it over the phone. If you need to avoid characters that are hard to read or hard to 'hear', you might need to switch to a process where you generate an acceptable code and store it.
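A minimal sketch of that generate-and-store approach, under some assumptions: SAFE_CHARS is just one possible choice of unambiguous characters, and ShortCodeStore is a hypothetical in-memory stand-in for what would normally be a database table with a unique index on the code column.

```ruby
require "securerandom"
require "set"

# Illustrative alphabet: digits and lowercase letters without 0, 1, i, j, l, o.
SAFE_CHARS = "23456789abcdefghkmnpqrstuvwxyz".chars.freeze

# Hypothetical in-memory store; a real application would persist the codes.
class ShortCodeStore
  def initialize
    @codes = Set.new
  end

  # Draws random codes until an unused one turns up, records it and returns it.
  def generate(length = 6, max_attempts = 10)
    max_attempts.times do
      code = Array.new(length) { SAFE_CHARS[SecureRandom.random_number(SAFE_CHARS.size)] }.join
      # Set#add? returns nil when the code is already taken, so we try again.
      return code if @codes.add?(code)
    end
    raise "could not find a free code after #{max_attempts} attempts"
  end
end

store = ShortCodeStore.new
puts store.generate
```

Uniqueness here comes from the lookup against the stored codes, not from the random generator, so the retry loop matters more as the code space fills up.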


I found this to be short and reliable:

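def create_uuid(prefix = nil)
  # Current time in tenths of a microsecond, as an integer.
  time   = (Time.now.to_f * 10_000_000).to_i
  # Random component to separate keys created at the same instant.
  jitter = rand(10_000_000)
  # Concatenate the decimal digits, then re-encode in base 36 to shorten them.
  key    = "#{jitter}#{time}".to_i.to_s(36)
  [prefix, key].compact.join('_')
end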

This produces keys that look like this: '3qaishe3gpp07w2m'
Reduce the 'jitter' size to reduce the key size.

Caveat: This is not guaranteed to be unique (SecureRandom.uuid gives a far lower collision probability, at the cost of a 36-character string), but it held up in a test of ten million keys:

10_000_000.times.map {create_uuid}.uniq.length == 10_000_000


The only way to guarantee uniqueness is to keep a global count and increment it for each use: 0000, 0001, etc.
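As a rough sketch of that counter idea: the CodeCounter class below is hypothetical and keeps its count in memory, guarded by a Mutex. In a real application the count would more likely be a database sequence or an auto-increment primary key, so that it survives restarts and is shared between processes. Writing the count in base 36 is my own addition to keep the codes short.

```ruby
# Hypothetical in-memory counter that hands out each value exactly once.
class CodeCounter
  def initialize(start = 0)
    @next = start
    @lock = Mutex.new
  end

  # Returns the next code, written in base 36 and left-padded with zeros.
  def next_code(width = 4)
    value = @lock.synchronize do
      current = @next
      @next += 1
      current
    end
    value.to_s(36).rjust(width, "0")
  end
end

counter = CodeCounter.new
puts counter.next_code   # prints 0000
puts counter.next_code   # prints 0001
```

Sequential codes are easy to guess, so this only suits cases where it does not matter that someone can enumerate them.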
