MD5/SHA "update" property?

What is the MD5/SHA property that allows you to "update" them? For example, if you have the hash for "test" you can add "case" to get the hash for "testcase". I would like to read up on this property a bit but my searches turn up nothing...


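It is merely that they are calculated incrementally: you compute them by operating on the first n bytes of data (64 bytes, i.e. 512 bits, in the case of MD5; see http://en.wikipedia.org/wiki/MD5#Algorithm), then on the next n bytes of data, and so on. The "update" call simply feeds more data into that running computation before the final hash is produced.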
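As a quick illustration, here is a small sketch using Python's standard `hashlib` module (my choice of language; the question doesn't name one). It shows that feeding "test" and then "case" through `update()` gives the same result as hashing "testcase" in one go:

```python
import hashlib

# Feed the data in two pieces through the same hash object.
h = hashlib.md5()
h.update(b"test")
h.update(b"case")
incremental = h.hexdigest()

# Hash the concatenated data in a single call.
one_shot = hashlib.md5(b"testcase").hexdigest()

print(incremental == one_shot)  # True
print(h.block_size)             # 64 (bytes per internal block)
```

Note that this works because the hash object is still "open" when the second piece arrives; nothing has been finalised yet.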


EDIT: Starting from a finished hash, this isn't even theoretically possible, due to the padding (beginning with a 1 bit) that I mention below. In effect, md5("case", seed=md5("test")) == md5("test" + <padding> + "case"). There is no way to use md5("test") to incrementally compute md5("test" + "case").

This is theoretically possible if you concatenate 512-bit chunks. It won't work for appending "case" to "test", because the first run of the state machine is polluted by the padding used to turn "test" into a 512-bit chunk.

Additionally, the padding isn't just a bunch of zeros. The message is always first padded with a 1 bit, so that "case" and "case\0" produce different hashes. Thus you can't rely on "case" having the same hash with or without padding.
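To make the padding concrete, here is a minimal Python sketch of the padding rule as I understand it from RFC 1321: a 1 bit, then zero bits, then the original length in bits as a 64-bit little-endian integer. The helper name `md5_pad` is my own, and it assumes the message is a whole number of bytes:

```python
import struct

def md5_pad(message: bytes) -> bytes:
    """Return the message with MD5 padding appended (hypothetical helper)."""
    # 0x80 is the mandatory 1 bit followed by seven 0 bits.
    # Enough zero bytes follow to leave exactly 8 bytes free in the last block.
    zeros = (55 - len(message)) % 64
    # The final 8 bytes hold the original length in bits, little-endian.
    return message + b"\x80" + b"\x00" * zeros + struct.pack("<Q", len(message) * 8)

padded_a = md5_pad(b"case")
padded_b = md5_pad(b"case\x00")

print(len(padded_a), len(padded_b))  # both are one 64-byte block
print(padded_a == padded_b)          # False: the 1 bit lands in a different place
```

Because the 1 bit (and the length field) differ, "case" and "case\0" end up as different blocks and therefore different hashes.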


The MD5 algorithm has the following steps:

1) pad input string to a multiple of 64 bytes
2) split input string into blocks of 64 bytes
3) initialise state (a 4-element array)
4) for each block: state <= transform(state,block)
5) encode state as string

To support situations where you want to hash something in stages (e.g. large files), this can be refactored as follows.

Initialise:

1) initialise state
2) leftover bytes <= ""

Update:

1) append leftover bytes to start of input string
2) split input string into blocks of 64 bytes
3) for each complete block: state <= transform(state,block)
4) leftover bytes <= contents of the incomplete block, if one exists

Digest:

1) pad a copy of the leftover bytes
2) split the padded leftover bytes into blocks of 64 bytes
3) tmp_state <= state
4) for each block: tmp_state <= transform(tmp_state,block)
5) encode tmp_state as string
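For anyone who wants to try this without VBA, here is a rough Python sketch of the Initialise / Update / Digest split described above. The class name `IncrementalMD5` and the helper names are my own, the transform follows the usual MD5 pseudocode, and I track the total length because the padding needs it (the steps above leave that implicit). It is checked against `hashlib` rather than meant to replace it:

```python
import hashlib
import math
import struct

# Per-round shift amounts and constants from the standard MD5 definition.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]
INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)

def rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def transform(state, block):
    """Mix one 64-byte block into the 4-word state and return the new state."""
    m = struct.unpack("<16I", block)
    a, b, c, d = state
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + K[i] + m[g]) & 0xFFFFFFFF
        a, d, c, b = d, c, b, (b + rotl(f, S[i])) & 0xFFFFFFFF
    return tuple((x + y) & 0xFFFFFFFF for x, y in zip(state, (a, b, c, d)))

class IncrementalMD5:
    def __init__(self):
        # Initialise: fresh state, no leftover bytes.
        self.state = INIT
        self.leftover = b""
        self.length = 0  # total bytes seen so far, needed for the padding

    def update(self, data):
        # Update: prepend leftovers, consume complete blocks, keep the rest.
        self.length += len(data)
        buf = self.leftover + data
        full = len(buf) - len(buf) % 64
        for i in range(0, full, 64):
            self.state = transform(self.state, buf[i:i + 64])
        self.leftover = buf[full:]

    def hexdigest(self):
        # Digest: pad a copy of the leftovers and run it through a copy of
        # the state, so further update() calls remain possible afterwards.
        zeros = (55 - self.length) % 64
        bit_length = (self.length * 8) & 0xFFFFFFFFFFFFFFFF
        tail = self.leftover + b"\x80" + b"\x00" * zeros + struct.pack("<Q", bit_length)
        tmp_state = self.state
        for i in range(0, len(tail), 64):
            tmp_state = transform(tmp_state, tail[i:i + 64])
        return struct.pack("<4I", *tmp_state).hex()

h = IncrementalMD5()
h.update(b"test")
h.update(b"case")
print(h.hexdigest() == hashlib.md5(b"testcase").hexdigest())
```

The key design point is that Digest works on copies: the padding never touches the real state or the leftover buffer, which is why you can keep calling Update after asking for an intermediate hash.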

I've actually implemented this approach in VBA - it seems to work fine. Any suggestions for where I should upload the code?
