Compressing a hex string in Ruby/Rails
I'm using MongoDB as a backend for a Rails app I'm building. Mongo, by default, generates 24-character hexadecimal ids for its records to make sharding easier, so my URLs wind up looking like:
example.com/companies/4b3fc1400de0690bf2000001/employees/4b3ea6e30de0691552000001
Which is not very pretty. I'd like to stick to the Rails URL conventions, but also leave these ids as they are in the database. I think a happy compromise would be to compress these hex ids into shorter strings drawn from a larger character set, so they'd look something like:
example.com/companies/3ewqkvr5nj/employees/9srbsjlb2r
Then in my controller I'd reverse the compression, get the original hex id and use that to look up the record.
My question is, what's the best way to convert these ids back and forth? I'd of course want them to be as short as possible, but also url-safe and simple to convert.
Thanks!
You could represent a hexadecimal id in a base higher than 16 to make its string representation shorter. Ruby has built-in support for working with bases from 2 up to 36.
b36 = '4b3fc1400de0690bf2000001'.hex.to_s(36)
# => "29a6dblglcujcoeboqp"
To convert it back to a 24-character string you could do something like this:
'%024x' % b36.to_i(36)
# => "4b3fc1400de0690bf2000001"
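The two one-liners above can be wrapped into a pair of helpers so the controller and the URL helpers share one implementation. This is only a sketch: the `ShortId` module and its method names are made up for illustration, and it assumes ids are always 24 hex characters. The zero-padding in `decode` matters because an id with leading zeros would otherwise come back shorter than 24 characters.

```ruby
# Hypothetical helper module; only String#hex, Integer#to_s(base),
# String#to_i(base) and Kernel#format are core Ruby.
module ShortId
  HEX_LENGTH = 24 # assumed length of a MongoDB id in hex characters

  # Hex id -> base 36 string (digits 0-9 and lowercase a-z).
  def self.encode(hex_id)
    hex_id.hex.to_s(36)
  end

  # Base 36 string -> hex id, zero-padded back to HEX_LENGTH characters.
  def self.decode(short_id)
    format("%0#{HEX_LENGTH}x", short_id.to_i(36))
  end
end

id = '4b3fc1400de0690bf2000001'
puts ShortId.decode(ShortId.encode(id)) == id
```

In a controller you would then decode the URL parameter before the lookup, along the lines of `ShortId.decode(params[:id])`.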
To achieve better "compression" you could represent the id in a base higher than 36. There are Ruby libraries that will help you with that. The all-your-base gem is one such library.
I recommend the base 62 representation, as it only uses the characters 0-9, a-z and A-Z, which means it is URL-safe by default.
Even in base 62 you still end up with unwieldy 16-character ids:
'4b3fc1400de0690bf2000001'.hex.to_base_62
# => "UHpdfMzq7jKLcvyr"
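If you'd rather not pull in a gem, base 62 conversion is short enough to write by hand. The following is a minimal sketch, not the gem's implementation: the method names are invented here, and the alphabet order (digits, then lowercase, then uppercase) is an assumption, so the output will not necessarily match what all-your-base produces for the same id.

```ruby
# Assumed alphabet order: 0-9, a-z, A-Z. Other libraries may order it differently.
BASE62_ALPHABET = (('0'..'9').to_a + ('a'..'z').to_a + ('A'..'Z').to_a).join.freeze

# Hex id -> base 62 string, by repeated division by 62.
def hex_to_base62(hex_id)
  n = hex_id.hex
  return BASE62_ALPHABET[0] if n.zero?
  out = String.new
  while n > 0
    n, rem = n.divmod(62)
    out << BASE62_ALPHABET[rem]
  end
  out.reverse # digits were collected least-significant first
end

# Base 62 string -> hex id, zero-padded to `width` characters.
# Raises if short_id contains a character outside the alphabet.
def base62_to_hex(short_id, width = 24)
  n = short_id.each_char.reduce(0) { |acc, ch| acc * 62 + BASE62_ALPHABET.index(ch) }
  format("%0#{width}x", n)
end

id = '4b3fc1400de0690bf2000001'
short = hex_to_base62(id)
puts short.length                 # 16
puts base62_to_hex(short) == id   # true
```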
Sidestepping Rails convention a bit, another compromise is to use the base 32 representation of the object's created_at date as the "URL id".
aCompany.created_at
# => Sat Aug 13 20:05:35 -0500 2011
aCompany.created_at.to_i.to_s(32)
# => "174e7qv"
This way you get super short ids (7 characters) without having to keep track of a special-purpose attribute (in MongoMapper, it's a simple matter of adding timestamps! in the model to get automatic created_at and updated_at attributes).
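Going the other way, the controller has to turn the short id back into a time before it can query on created_at. A minimal sketch, with helper names that are purely illustrative:

```ruby
# Time -> base 32 string of the whole seconds since the Unix epoch.
def time_to_short_id(time)
  time.to_i.to_s(32)
end

# Base 32 string -> Time (in UTC), accurate to the second.
def short_id_to_time(short_id)
  Time.at(short_id.to_i(32)).utc
end

t = short_id_to_time('174e7qv')
puts t.to_i                # 1313283935
puts time_to_short_id(t)   # 174e7qv
```

Two things to keep in mind with this scheme: `to_i` drops fractional seconds, so the lookup probably needs to match any created_at within that one-second window rather than an exact value, and two records created in the same second would share a URL id.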
You can use base64 to make it shorter. Make sure that you are using '-' and '_' instead of '+' and '/'. You can also chop off the '=' padding.
Code to convert from a hex value to base 64:
def hex_to_base64(str)
  # pack("H*") converts the hex string to raw bytes (12 bytes for a 24-character id)
  bytes = [str].pack("H*")
  # pack("m0") base64-encodes without line breaks; then swap in the
  # URL-safe characters and drop any '=' padding
  [bytes].pack("m0").tr("+/", "-_").delete("=")
end
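To get the original hex id back in the controller you need the reverse conversion as well. Here is a self-contained sketch of both directions using only `pack`/`unpack` from core Ruby; the method names are made up for this example. A 24-character hex id is 12 bytes, which encodes to exactly 16 base64 characters with no padding.

```ruby
# Hex string -> URL-safe base64 ('-' and '_' instead of '+' and '/', no '=').
def hex_to_base64url(hex_id)
  [[hex_id].pack("H*")].pack("m0").tr("+/", "-_").delete("=")
end

# URL-safe base64 -> hex string.
def base64url_to_hex(short_id)
  b64 = short_id.tr("-_", "+/")
  b64 += "=" * ((4 - b64.length % 4) % 4) # restore any padding that was removed
  b64.unpack("m").first.unpack("H*").first
end

id = '4b3fc1400de0690bf2000001'
short = hex_to_base64url(id)
puts short.length                    # 16
puts base64url_to_hex(short) == id   # true
```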