Why do these two files hash to the same value when I use MemoryStream?
I'm writing a C# routine that creates hashes from JPG files. If I pass a byte array to my SHA512 object, I get the expected behavior. However, if I pass in a memory stream, the two files always hash to the same value.
Example 1:
SHA512 mySHA512 = SHA512.Create();
Image img1 = Image.FromFile(@"d:\img1.jpg");
Image img2 = Image.FromFile(@"d:\img2.jpg");
MemoryStream ms1 = new MemoryStream();
MemoryStream ms2 = new MemoryStream();
img1.Save(ms1, ImageFormat.Jpeg);
byte[] buf1 = ms1.GetBuffer();
byte[] hash1 = mySHA512.ComputeHash(buf1);
img2.Save(ms2, ImageFormat.Jpeg);
byte[] buf2 = ms2.GetBuffer();
byte[] hash2 = mySHA512.ComputeHash(buf2);
if (Convert.ToBase64String(hash1) == Convert.ToBase64String(hash2))
    MessageBox.Show("Hashed the same");
else
    MessageBox.Show("Different hashes");
That produces "Different hashes". But one of the overloads of the ComputeHash method takes a stream object in and I'd rather use that. When I do:
SHA512 mySHA512 = SHA512.Create();
Image img1 = Image.FromFile(@"d:\img1.jpg");
Image img2 = Image.FromFile(@"d:\img2.jpg");
MemoryStream ms1 = new MemoryStream();
MemoryStream ms2 = new MemoryStream();
img1.Save(ms1, ImageFormat.Jpeg);
byte[] hash1 = mySHA512.ComputeHash(ms1);
img2.Save(ms2, ImageFormat.Jpeg);
byte[] hash2 = mySHA512.ComputeHash(ms2);
if (Convert.ToBase64String(hash1) == Convert.ToBase64String(hash2))
    MessageBox.Show("Hashed the same");
else
    MessageBox.Show("Different hashes");
That produces "Hashed the same".
What's going on here that I'm missing?
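You're not rewinding your MemoryStreams. After Save, each stream's Position is at the end, so ComputeHash has nothing left to read and the hash is computed from an empty sequence of bytes. Use
ms1.Position = 0;
ms2.Position = 0;
after calling Save (and before calling ComputeHash on the corresponding stream).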
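To see the effect in isolation, here is a minimal sketch that skips the image code and writes a few stand-in bytes into a MemoryStream instead. The class name RewindDemo and the helper HashFromStart are hypothetical names made up for this illustration. It should show that hashing a stream that has not been rewound gives the same result as hashing zero bytes, while rewinding first gives the hash of the actual data.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class RewindDemo
{
    // Hypothetical helper: rewinds the stream, then hashes everything in it.
    public static byte[] HashFromStart(HashAlgorithm algorithm, Stream stream)
    {
        stream.Position = 0;
        return algorithm.ComputeHash(stream);
    }

    public static void Main()
    {
        // Stand-in data; in the question this would come from Image.Save(ms, ImageFormat.Jpeg).
        byte[] data = { 1, 2, 3, 4, 5 };

        using (SHA512 sha = SHA512.Create())
        using (var ms = new MemoryStream())
        {
            ms.Write(data, 0, data.Length); // Position is now at the end of the stream.

            // ComputeHash(Stream) reads from the current position, so it sees zero bytes here.
            byte[] notRewound = sha.ComputeHash(ms);
            byte[] emptyHash = sha.ComputeHash(new byte[0]);

            // Rewinding first makes ComputeHash read the whole stream.
            byte[] rewound = HashFromStart(sha, ms);
            byte[] expected = sha.ComputeHash(data);

            Console.WriteLine(notRewound.SequenceEqual(emptyHash)); // True
            Console.WriteLine(rewound.SequenceEqual(expected));     // True
        }
    }
}
```

This also explains why both images "hashed the same" in the question: each hash was the SHA-512 of an empty input.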
One further note: don't use GetBuffer in this way. Use ToArray, which will give you a byte array the same size as the stream's length. GetBuffer returns the raw buffer, which will (usually) have some padding that you wouldn't want to use accidentally. You can use GetBuffer if you then make sure you only use the relevant portion of it, of course; this avoids creating a new copy of the data.
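As a rough illustration of the difference, the sketch below again uses a few stand-in bytes rather than a saved image. The class name BufferDemo and the helper HashWithoutCopy are hypothetical. It compares the two array lengths and then hashes only the first Length bytes of the raw buffer, using the ComputeHash(byte[], int, int) overload.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class BufferDemo
{
    // Hypothetical helper: hashes only the used portion of the raw buffer, without copying it.
    public static byte[] HashWithoutCopy(HashAlgorithm algorithm, MemoryStream stream)
    {
        return algorithm.ComputeHash(stream.GetBuffer(), 0, (int)stream.Length);
    }

    public static void Main()
    {
        byte[] data = { 1, 2, 3, 4, 5 };

        using (SHA512 sha = SHA512.Create())
        using (var ms = new MemoryStream())
        {
            ms.Write(data, 0, data.Length);

            byte[] copy = ms.ToArray();  // A new array of exactly ms.Length bytes.
            byte[] raw = ms.GetBuffer(); // The underlying buffer; may be longer than ms.Length.

            Console.WriteLine(copy.Length == ms.Length); // True
            Console.WriteLine(raw.Length >= ms.Length);  // True

            // Both approaches hash the same bytes, so the results match.
            byte[] fromCopy = sha.ComputeHash(copy);
            byte[] fromRaw = HashWithoutCopy(sha, ms);
            Console.WriteLine(fromCopy.SequenceEqual(fromRaw)); // True
        }
    }
}
```

Hashing the whole array returned by GetBuffer, as in Example 1, would include any unused trailing bytes in the hash.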