开发者

Can zlib.crc32 or zlib.adler32 be safely used to mask primary keys in URLs?

In Django Design Patterns, the author recommends using zlib.crc32 to mask primary keys in URLs. After some quick testing, I noticed that crc32 produces negative integers about half the tim开发者_Python百科e, which seems undesirable for use in a URL. zlib.adler32 does not appear to produce negatives, but is described as "weaker" than CRC.

  1. Is this method (either CRC or Adler-32) safe for usage in a URL as an alternate to a primary key? (i.e. is it collision-safe?)
  2. Is the "weaker" Adler-32 a satisfactory alternative for this task?
  3. How the heck do you reverse this?! That is, how do you determine the original primary key from the checksum?


The problem is not hashing the value. The problem is mapping the hash back to the key. Even if there are collisions you could always increment until you hit an unused hash.

The reason hashes are used for e.g. authentication is because there's already a key (e.g the username) that can be used to find the appropriate record. At that point it becomes just a matter of comparing the given hash against the stored hash. If you're using the hash to mask the key instead then it will be trickier than just comparing it. Turning the hash itself into the key will solve this though.


You can interpret the 32 bit CRC value as an unsigned integer.


Upon further investigation, this seems like a really bad idea:

In [11]: s = set([zlib.crc32(str(x)) for x in xrange(20000000)])
In [12]: len(s)
Out[12]: 19989760
In [13]: 20000000 - len(s)
Out[13]: 10240

That's 10,240 collisions in 20,000,000 primary keys.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜