Unable to verify body hash for DKIM
I'm writing a C# DKIM validator and have come across a problem that I cannot solve. Right now I am working on calculating the body hash, as described in Section 3.7, Computing the Message Hashes. I am working with emails that I have dumped using a modified version of the EdgeTransportAsyncLogging sample in the Exchange 2010 Transport Agent SDK. Instead of converting the emails when saving, it just opens a file based on the MessageID and dumps the raw data to disk.
I am able to successfully compute the body hash of the sample email provided in Section A.2 using the following code:
SHA256Managed hasher = new SHA256Managed();
ASCIIEncoding asciiEncoding = new ASCIIEncoding();
string rawFullMessage = File.ReadAllText(@"C:\Repositories\Sample-A.2.txt");
string headerDelimiter = "\r\n\r\n";
int headerEnd = rawFullMessage.IndexOf(headerDelimiter);
string header = rawFullMessage.Substring(0, headerEnd);
string body = rawFullMessage.Substring(headerEnd + headerDelimiter.Length);
byte[] bodyBytes = asciiEncoding.GetBytes(body);
byte[] bodyHash = hasher.ComputeHash(bodyBytes);
string bodyBase64 = Convert.ToBase64String(bodyHash);
string expectedBase64 = "2jUSOH9NhtVGCQWNr9BrIAPreKQjO6Sn7XIkfJVOzv8=";
Console.WriteLine("Expected hash: {1}{0}Computed hash: {2}{0}Are equal: {3}",
Environment.NewLine, expectedBase64, bodyBase64, expectedBase64 == bodyBase64);
The output from the above code is:
Expected hash: 2jUSOH9NhtVGCQWNr9BrIAPreKQjO6Sn7XIkfJVOzv8=
Computed hash: 2jUSOH9NhtVGCQWNr9BrIAPreKQjO6Sn7XIkfJVOzv8=
Are equal: True
Now, most emails come across with the c=relaxed/relaxed setting, which requires you to do some work on the body and header before hashing and verifying. While I was working on it (and failing to get it to work), I finally came across a message with c=simple/simple, which means that you process the whole body as is, minus any empty lines (bare CRLFs) at the end of the body. (Really, the rules for Body Canonicalization are quite ... simple.)
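For reference, the trailing-line rule of the simple body algorithm can be sketched as below. This is a minimal sketch, assuming the body is already available as raw bytes; the class and method names (SimpleBodyCanon, Canonicalize, BodyHashBase64) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, not part of any DKIM library.
public static class SimpleBodyCanon
{
    // "simple" body canonicalization: drop every empty line at the end
    // of the body, then make sure the body ends with exactly one CRLF.
    public static byte[] Canonicalize(byte[] body)
    {
        int end = body.Length;

        // Strip all trailing CRLF pairs (this also removes the terminator
        // of the last non-empty line; it is re-added below).
        while (end >= 2 && body[end - 2] == (byte)'\r' && body[end - 1] == (byte)'\n')
        {
            end -= 2;
        }

        // Re-append a single CRLF. An empty body becomes just CRLF.
        byte[] result = new byte[end + 2];
        Array.Copy(body, result, end);
        result[end] = (byte)'\r';
        result[end + 1] = (byte)'\n';
        return result;
    }

    // SHA-256 over the canonicalized body, base64-encoded like the bh= tag.
    public static string BodyHashBase64(byte[] body)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return Convert.ToBase64String(sha.ComputeHash(Canonicalize(body)));
        }
    }
}
```

As a sanity check, RFC 6376 Section 3.4.3 gives frcCV1k9oG9oKj3dpUqdJg1PxRT2RSN/XKdLCPjaYaY= as the SHA-256 hash of an empty body under simple canonicalization, so BodyHashBase64(new byte[0]) should return that value. The sketch ignores the optional l= body length tag.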
Here is the real DKIM email (right click and save it, the browsers eat the ending CRLF) with a signature using the simple algorithm (completely unmodified). Now, using the above code and updating the expectedBase64 hash, I get the following results:
Expected hash: VnGg12/s7xH3BraeN5LiiN+I2Ul/db5/jZYYgt4wEIw=
Computed hash: ISNNtgnFZxmW6iuey/3Qql5u6nflKPTke4sMXWMxNUw=
Are equal: False
The expected hash is the value from the bh= field of the DKIM-Signature header. Now, the file used in the second test is a direct raw output from the Exchange 2010 Transport Agent. If so inclined, you can view the modified EdgeTransportLogging.txt.
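Instead of copying the expected hash by hand, the bh= value can be pulled out of the header programmatically. The following is a minimal sketch, assuming the DKIM-Signature header value (everything after the colon) is already in a string. DkimTags, Parse and BodyHash are hypothetical names, and the parsing is deliberately loose rather than a full tag-list parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any DKIM library.
public static class DkimTags
{
    // Splits a DKIM-Signature value into its tag=value pairs.
    public static Dictionary<string, string> Parse(string headerValue)
    {
        var tags = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (string part in headerValue.Split(';'))
        {
            // Split on the first '=' only; base64 values may end in '=' padding.
            int eq = part.IndexOf('=');
            if (eq < 0) continue;

            string name = part.Substring(0, eq).Trim();
            string value = part.Substring(eq + 1).Trim();
            if (name.Length > 0) tags[name] = value;
        }
        return tags;
    }

    // Returns the bh= value with folding whitespace removed, or null if absent.
    public static string BodyHash(string headerValue)
    {
        string bh;
        if (!Parse(headerValue).TryGetValue("bh", out bh)) return null;

        // Header folding can insert CRLF and spaces inside the base64 text.
        return Regex.Replace(bh, @"\s+", "");
    }
}
```

Removing whitespace matters because a folded header can break the base64 string across lines, and a naive string comparison would then report a mismatch even when the hashes agree.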
At this point, no matter how I modify the second email, changing the start position or the number of CRLFs at the end of the file, I cannot get the hashes to match. What worries me is that I have been unable to validate any body hash so far (simple or relaxed) and that it may not be feasible to process DKIM through Exchange 2010.
I tried this in python-dkim and I get a body hash mismatch too.
I think probably Exchange's GetMimeReadStream is not giving you the actual bytes as they were transmitted, therefore the hash doesn't match. Probably it's disassembling the message into its MIME parts, and then GetMimeReadStream gives you a valid representation of the message, but not the one it was originally sent with.
Perhaps there's another API that will give you the real raw bytes?
Or perhaps by this point in the process the message has been torn apart and the original message thrown away, and you need to hook in earlier.
Probably you should try intercepting a DKIM-signed message by sending it to a non-Exchange server, and see if that works with your code. GetContentReadStream might possibly work?
Anyhow, what I would do next is try to find an API that gives you byte-for-byte what was sent.
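When comparing captures from different sources, it may also help to keep everything at the byte level, so that no text decoding or re-encoding step can alter the body before it is hashed. Below is a minimal sketch under that assumption. RawBodyHash and its methods are hypothetical names, and the body is hashed exactly as found, without applying the trailing-line rule of the simple algorithm.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper, not part of any DKIM library.
public static class RawBodyHash
{
    // Index of the first body byte (just past the first CRLF CRLF), or -1.
    public static int FindBodyStart(byte[] message)
    {
        for (int i = 0; i + 3 < message.Length; i++)
        {
            if (message[i] == 13 && message[i + 1] == 10 &&
                message[i + 2] == 13 && message[i + 3] == 10)
            {
                return i + 4;
            }
        }
        return -1;
    }

    // SHA-256 over the raw body bytes, base64-encoded like the bh= tag.
    public static string HashBody(byte[] message)
    {
        int start = FindBodyStart(message);
        if (start < 0)
        {
            throw new FormatException("No header/body separator (CRLF CRLF) found.");
        }

        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(message, start, message.Length - start);
            return Convert.ToBase64String(hash);
        }
    }

    // Convenience wrapper: reads the file as bytes, never as text.
    public static string HashFile(string path)
    {
        return HashBody(File.ReadAllBytes(path));
    }
}
```

If the same message captured on a non-Exchange server and dumped from the transport agent gives different values here, that would suggest the bytes were changed before the agent saw them. It would not by itself say where the change happened.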