MD5/SHA "update" property?
What is the MD5/SHA property that allows you to "update" them? For example, if you have the hash for "test" you can add "case" to get the hash for "testcase". I would like to read up on this property a bit but my searches turn up nothing...
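It is merely that they are actually calculated incrementally: you calculate them by operating on the first n bytes of data (64 bytes, i.e. 512 bits, in the case of MD5; the 128-bit figure is the size of the digest, see http://en.wikipedia.org/wiki/MD5#Algorithm), then on the next n bytes of data, etc., carrying the internal state from one block to the next.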
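A small illustration of that incremental interface, using Python's standard hashlib module. This is only a sketch of the behaviour the question describes; note that it works because the live hash object keeps its internal state, not because the finished digest of "test" can be extended.

```python
import hashlib

# Feed the message in two pieces through the streaming interface.
h = hashlib.md5()
h.update(b"test")
h.update(b"case")
incremental = h.hexdigest()

# Hash the whole message in one call for comparison.
one_shot = hashlib.md5(b"testcase").hexdigest()

# Both routes process the same byte stream, so the digests match.
print(incremental == one_shot)
```

The same update() pattern applies to hashlib.sha1() and hashlib.sha256().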
EDIT: This isn't even theoretically possible, due to the 1-bit padding I mention below. In effect, md5("case", seed=md5("test")) == md5("test" + <padding> + "case"), where <padding> is the 1 bit, the zero fill and the encoded length that were appended to "test". There is no way to use the finished digest md5("test") to incrementally compute md5("test" + "case").
This is theoretically possible if you concatenate 512-bit chunks. It won't work for appending "case" to "test", because the first run of the state machine is polluted by the padding used to turn "test" into a 512-bit chunk.
Additionally, the padding isn't just a bunch of zeros. The message is always first padded with a 1 bit, so that "case" and "case\0" produce different hashes. Thus you can't rely on "case" having the same hash with or without padding.
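To make the padding concrete, here is a rough Python sketch. The helper name md5_padding is hypothetical (it is not part of hashlib); it follows the layout in the MD5 specification: a 1 bit, zero fill up to 56 bytes modulo 64, then the message length in bits as a 64-bit little-endian integer.

```python
import hashlib
import struct

def md5_padding(message_len: int) -> bytes:
    """Return the bytes MD5 appends to a message of message_len bytes.

    Hypothetical helper for illustration only.
    """
    # 0x80 is the mandatory 1 bit followed by seven 0 bits.
    zeros = (55 - message_len) % 64
    # The original length in bits, as a 64-bit little-endian integer.
    bit_len = (message_len * 8) & 0xFFFFFFFFFFFFFFFF
    return b"\x80" + b"\x00" * zeros + struct.pack("<Q", bit_len)

padded_case = b"case" + md5_padding(4)
padded_case_nul = b"case\x00" + md5_padding(5)

# Both fill exactly one 64-byte block, but the 1-bit marker and the
# length field sit at different values, so the blocks differ.
print(len(padded_case), len(padded_case_nul))
print(padded_case == padded_case_nul)
print(hashlib.md5(b"case").hexdigest() == hashlib.md5(b"case\x00").hexdigest())
```

Because the marker byte lands right after the real data, trailing zero bytes in the message can never be confused with padding.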
The MD5 algorithm has the following steps:
1) pad input string to a multiple of 64 bytes
2) split input string into blocks of 64 bytes
3) initialise state (a 4-element array)
4) for each block: state <= transform(state,block)
5) encode state as string
To support situations where you want to hash something in stages (e.g. large files), this can be refactored as follows.
Initialise:
1) initialise state
2) leftover bytes <= ""
Update:
1) append leftover bytes to start of input string
2) split input string into blocks of 64 bytes
3) for each complete block: state <= transform(state,block)
4) leftover bytes <= contents of the incomplete block, if one exists
Digest:
1) pad a copy of the leftover bytes (the padding encodes the total message length, not just the length of the leftover bytes)
2) split the padded leftover bytes into blocks of 64 bytes
3) tmp_state <= state
4) for each block: tmp_state <= transform(tmp_state,block)
5) encode tmp_state as string
I've actually implemented this approach in VBA - it seems to work fine. Any suggestions for where I should upload the code?
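For readers who want something runnable, here is a minimal pure-Python sketch of the Initialise / Update / Digest stages described above. It is not the VBA code mentioned; the class name IncrementalMD5 and the helpers _rotl and _transform are hypothetical names chosen for illustration. One assumption beyond the listed steps: the object also keeps a running byte count, because the final padding encodes the total message length.

```python
import hashlib
import math
import struct

# Per-round rotation amounts and sine-derived constants from the MD5 spec.
_S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
_K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]

def _rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def _transform(state, block):
    """Mix one 64-byte block into the 4-word state and return the new state."""
    a, b, c, d = state
    m = struct.unpack("<16I", block)
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + _K[i] + m[g]) & 0xFFFFFFFF
        a, d, c = d, c, b
        b = (b + _rotl(f, _S[i])) & 0xFFFFFFFF
    return tuple((x + y) & 0xFFFFFFFF for x, y in zip(state, (a, b, c, d)))

class IncrementalMD5:
    def __init__(self):
        # Initialise: fixed starting state, no leftover bytes.
        self.state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
        self.leftover = b""
        self.total_len = 0  # assumption: tracked so Digest can encode the length

    def update(self, data):
        # Update: prepend leftovers, transform complete blocks, keep the rest.
        self.total_len += len(data)
        data = self.leftover + data
        full = len(data) - len(data) % 64
        for off in range(0, full, 64):
            self.state = _transform(self.state, data[off:off + 64])
        self.leftover = data[full:]

    def hexdigest(self):
        # Digest: pad a copy of the leftovers and run it on a copy of the state,
        # so the object can still accept more data afterwards.
        bit_len = (self.total_len * 8) & 0xFFFFFFFFFFFFFFFF
        tail = self.leftover + b"\x80"
        tail += b"\x00" * ((56 - len(tail)) % 64) + struct.pack("<Q", bit_len)
        tmp_state = self.state
        for off in range(0, len(tail), 64):
            tmp_state = _transform(tmp_state, tail[off:off + 64])
        return struct.pack("<4I", *tmp_state).hex()

h = IncrementalMD5()
h.update(b"test")
assert h.hexdigest() == hashlib.md5(b"test").hexdigest()
h.update(b"case")
assert h.hexdigest() == hashlib.md5(b"testcase").hexdigest()
```

Working on a copy of the state in hexdigest() is what lets you ask for an intermediate digest and then keep feeding data, which is the "update" behaviour the question asks about.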