Compressing a hex string in Ruby/Rails

I'm using MongoDB as a backend for a Rails app I'm building. Mongo, by default, generates 24-character hexadecimal ids for its records to make sharding easier, so my URLs wind up looking like:

example.com/companies/4b3fc1400de0690bf2000001/employees/4b3ea6e30de0691552000001

Which is not very pretty. I'd like to stick to the Rails URL conventions, but also leave these ids as they are in the database. I think a happy compromise would be to compress these hex ids into shorter strings drawn from a larger character set, so they'd look something like:

example.com/companies/3ewqkvr5nj/employees/9srbsjlb2r

Then in my controller I'd reverse the compression, get the original hex id and use that to look up the record.

My question is, what's the best way to convert these ids back and forth? I'd of course want them to be as short as possible, but also url-safe and simple to convert.

Thanks!


You could represent a hexadecimal id in a base higher than 16 to make its string representation shorter. Ruby has built-in support for working with bases from 2 up to 36.

b36 = '4b3fc1400de0690bf2000001'.hex.to_s(36)
# => "29a6dblglcujcoeboqp"

To convert it back to a 24-character string you could do something like this:

'%024x' % b36.to_i(36)
# => "4b3fc1400de0690bf2000001"
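
The zero padding in '%024x' matters when an id happens to start with zeros, because the integer form drops them. Here is a small sketch of the round trip, using a made-up id with leading zeros (the id value and variable names are only for illustration):

```ruby
# Made-up id with five leading zeros, for illustration only.
id = '00000a400de0690bf2000001'

b36 = id.hex.to_s(36)     # shorter base 36 form for the URL
n   = b36.to_i(36)        # back to an Integer in the controller

unpadded = n.to_s(16)     # plain hex conversion loses the leading zeros
padded   = '%024x' % n    # zero-padded back to the full 24 characters

puts unpadded.length      # 19
puts padded == id         # true
```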

To achieve better "compression" you could represent the id in a base higher than 36. There are Ruby libraries that will help you with that; the all-your-base gem is one such library.

I recommend base 62 representation as it only uses 0-9, a-z and A-Z characters which means it is URL safe by default.
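
If you would rather not pull in a gem, base 62 is simple enough to do by hand. What follows is only a rough sketch: the helper names and the alphabet order (digits, then lowercase, then uppercase) are my own assumptions, so the strings it produces may not match what a particular gem would give you.

```ruby
# Hypothetical helpers, not taken from any gem.
# Assumed alphabet order: 0-9, a-z, A-Z.
BASE62_ALPHABET = (('0'..'9').to_a + ('a'..'z').to_a + ('A'..'Z').to_a).join.freeze

# Convert a hex string to base 62 by repeated division.
def hex_to_base62(hex)
  n = hex.hex
  return BASE62_ALPHABET[0] if n.zero?

  digits = []
  while n > 0
    n, rem = n.divmod(62)
    digits << BASE62_ALPHABET[rem]
  end
  digits.reverse.join
end

# Convert back, zero-padding to the original width (24 for Mongo ids).
def base62_to_hex(str, width = 24)
  n = str.each_char.reduce(0) do |acc, ch|
    idx = BASE62_ALPHABET.index(ch) or raise ArgumentError, "invalid character: #{ch}"
    acc * 62 + idx
  end
  format("%0#{width}x", n)
end

short = hex_to_base62('4b3fc1400de0690bf2000001')
puts short.length             # 16
puts base62_to_hex(short)     # 4b3fc1400de0690bf2000001
```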


Even with a base 62 representation you still end up with unwieldy 16-character ids:

'4b3fc1400de0690bf2000001'.hex.to_base_62  
# => "UHpdfMzq7jKLcvyr"

Sidestepping Rails convention a bit, another compromise is to use the base 32 representation of the object's created_at date as the "URL id".

aCompany.created_at
# => Sat Aug 13 20:05:35 -0500 2011
aCompany.created_at.to_i.to_s(32)
# => "174e7qv"

This way you get super short ids (7 characters) without having to keep track of a special-purpose attribute (in MongoMapper, it's a simple matter of adding timestamps! in the model to get automatic created_at and updated_at attributes). Note that this only works if no two records share the same created_at second.
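
Going the other way in the controller is just the reverse conversion. Here is a minimal sketch with a fixed UTC time standing in for a real record's created_at (the variable names are made up, and the actual lookup is left out because it depends on your ODM):

```ruby
# Fixed timestamp matching the example above:
# 20:05:35 -0500 on Aug 13 is 01:05:35 UTC on Aug 14.
created_at = Time.utc(2011, 8, 14, 1, 5, 35)

# Seconds since the epoch, written in base 32 for the URL.
url_id = created_at.to_i.to_s(32)
puts url_id                  # 174e7qv

# In the controller: parse the base 32 string back into a Time to query on.
decoded = Time.at(url_id.to_i(32)).utc
puts decoded == created_at   # true
```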


You can use base64 to make it shorter. Make sure that you are using '-' and '_' instead of '+' and '/'. You can also chop off the padding =.

Code to convert from a hex value to base 64:

def hex2base64(str)
  # pack("H*") turns the hex string into raw bytes without modifying str
  # (12 bytes for a 24-character Mongo id, 16 bytes for an MD5 digest)
  bytes = [str].pack("H*")
  # pack("m0") base64-encodes without the trailing newline that "m" appends
  [bytes].pack("m0")
end
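
To cover the URL-safe part and the trip back, something along these lines should work. It is only a sketch: the helper names are hypothetical, and it sticks to pack/unpack1 plus tr and delete rather than any particular library.

```ruby
# Hypothetical helper names, for illustration only.

# Hex id -> URL-safe base64 without padding.
def hex_to_urlsafe_base64(hex)
  raw = [hex].pack('H*')                         # hex string -> raw bytes
  [raw].pack('m0').tr('+/', '-_').delete('=')    # base64, URL-safe alphabet, no padding
end

# URL-safe base64 -> original hex id.
def urlsafe_base64_to_hex(str)
  b64 = str.tr('-_', '+/')                       # back to the standard alphabet
  b64 += '=' * ((4 - b64.length % 4) % 4)        # restore any padding that was chopped off
  b64.unpack1('m0').unpack1('H*')                # base64 -> raw bytes -> hex string
end

short = hex_to_urlsafe_base64('4b3fc1400de0690bf2000001')
puts short.length                    # 16
puts urlsafe_base64_to_hex(short)    # 4b3fc1400de0690bf2000001
```

A 24-character id is 12 bytes, which encodes to exactly 16 base64 characters, so for Mongo ids there is never any padding to strip in the first place.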