AES decryption in iOS: PKCS5 padding and CBC
I am implementing some decryption code on iOS for a message originating on a server over which I have no control. A previous implementation on another platform documents the decryption requirement as AES256, specifies the key and the initialization vector, and also says:
* Cipher Mode: CBC
* Padding: PKCS5Padding
The options for creation of a CCCryptor object include only kCCOptionPKCS7Padding and kCCOptionECBMode, noting that CBC is the default. From what I understand about padding for encryption, I don't see how one might use both; I thought they were mutually exclusive. In creating a CCCryptor for the decryption, I have tried using both 0 for options and kCCOptionPKCS7Padding, but both give me gibberish after decryption.
I have compared the dump of this decryption with a dump of the decoded byte buffer on the other platform and confirmed that they really are different. So there is something that I am doing in this implementation that is significantly different; I just don't know what, and I don't have a clue how to get a handle on it. It is difficult to infer much from the previous implementation because it is based on a very different platform. And of course, the author of the previous implementation has since departed.
Any guesses what else could be incompatible or how to troubleshoot this thing?
PKCS#5 padding and PKCS#7 padding are practically the same: you append N bytes, each with the value N (01, or 0202, or 030303, and so on), to fill the data up to a multiple of the block size of the algorithm, 16 bytes in this case. Officially PKCS#5 padding is defined only for 8-byte blocks, but in many runtimes the two names can be interchanged without issue. Padding only affects the end of the plaintext, so if the whole result is gibberish, the padding is not the cause. ECB is a block mode of operation (one that should not be used to encrypt data that can be distinguished from random numbers); it requires padding just as CBC does, so the two options are not mutually exclusive.
Finally, if you perform only decryption (no MAC or other form of integrity check) and you report the outcome of the unpadding back to the sender (for example, "decryption failed"), your plaintext is not safe, because that enables padding oracle attacks.
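To make the padding rule concrete, here is a minimal Python sketch. The function name pkcs7_pad is my own, not part of any library, and it illustrates only the padding step, not the encryption.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append N bytes of value N so the result is a multiple of block_size."""
    if not 1 <= block_size <= 255:
        raise ValueError("block size must be between 1 and 255")
    # pad_len is always in 1..block_size; a full extra block is added
    # when the data is already block-aligned.
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


# 10 bytes of text need 6 bytes of padding to reach 16.
print(pkcs7_pad(b"encrypt me").hex())  # 656e6372797074206d65060606060606

# Already-aligned input gains a whole block of 0x10 bytes.
print(len(pkcs7_pad(b"A" * 16)))  # 32
```

Calling it with block_size=8 gives what is strictly called PKCS#5 padding; the algorithm is the same, only the block size differs.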
First, you can worry about the padding later. Providing 0 like you have done means AES CBC with no padding, and with that configuration you should see your message just fine, albeit potentially with some padding bytes on the end. So that leaves:
- You're not loading the key correctly.
- You're not loading the IV correctly.
- You're not loading the data correctly.
- The server is doing something you don't expect.
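A cheap sanity check for the first three items is to look at the byte lengths before calling the cryptor at all. This is a rough Python sketch; check_aes_inputs is a hypothetical helper, and it assumes the key and IV were handed to you as hex strings, which may not match your situation.

```python
AES_KEY_SIZES = {16: "AES-128", 24: "AES-192", 32: "AES-256"}


def check_aes_inputs(key_hex: str, iv_hex: str, ciphertext_len: int) -> str:
    """Return the AES variant implied by the key length, or raise ValueError."""
    key = bytes.fromhex(key_hex)
    iv = bytes.fromhex(iv_hex)
    if len(key) not in AES_KEY_SIZES:
        raise ValueError(f"key is {len(key)} bytes; AES needs 16, 24, or 32")
    if len(iv) != 16:
        raise ValueError(f"IV is {len(iv)} bytes; AES-CBC needs exactly 16")
    if ciphertext_len == 0 or ciphertext_len % 16 != 0:
        raise ValueError("CBC ciphertext must be a non-zero multiple of 16 bytes")
    return AES_KEY_SIZES[len(key)]


# A 32-byte key (00 01 02 ... 1f) implies AES-256.
print(check_aes_inputs(bytes(range(32)).hex(), "00" * 16, 16))  # AES-256
```

If the variant this reports is not the one you are asking the cryptor for, the key size is the first thing to look at.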
To debug this, you need to isolate your system. You can do this by implementing a loopback test where you both encrypt and then decrypt the data to make sure you're loading everything correctly. But that can be misleading. Even if you do something wrong (e.g., loading the key backwards), you could still be able to decrypt what you've encrypted because you're doing it exactly the same wrong way on both sides.
So you need to test against Known Answer Tests (KATs). You can look up the official KATs on the AES Wikipedia entry. But it just so happens that I have posted another answer here on SO that we can use.
Given this input:
KEY: 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f
IV: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
PLAIN TEXT: encrypt me
CIPHER TEXT: 338d2a9e28208cad84c457eb9bd91c81
Verify with a third-party program that you can decrypt the cipher text and get the plain text.
$ echo -n "encrypt me" > to_encrypt
$ openssl enc -in to_encrypt -out encrypted -e -aes-256-cbc \
> -K 000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f \
> -iv 00000000000000000000000000000000
$ hexdump -C encrypted
00000000 33 8d 2a 9e 28 20 8c ad 84 c4 57 eb 9b d9 1c 81 |3.*.( ....W.....|
00000010
$ openssl enc -in encrypted -out plain_text -d -aes-256-cbc \
> -K 000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f \
> -iv 00000000000000000000000000000000
$ hexdump -C plain_text
00000000 65 6e 63 72 79 70 74 20 6d 65 |encrypt me|
0000000a
So now try to decrypt this known-answer test in your program. Be sure to enable PKCS7 padding, because that's what I used in this example. As an exercise, decrypt it with no padding and see that the result is the same, except you have padding bytes after the "encrypt me" text.
Passing the KAT is a big step. It tells you that your implementation is correct but your assumptions about the server's behavior are wrong. And then it's time to start questioning those assumptions...
(And P.S., those options you mentioned are not mutually exclusive. ECB means no IV, and CBC means you have an IV. No relation to padding.)
OK, I know I said it's an exercise, but I want to prove that even if you encrypt with padding and decrypt without padding, you do not get garbage. So given the KAT that used PKCS7 padding, we decrypt it with the no-padding option and get a readable message followed by 06 used as a padding byte.
$ openssl enc -in encrypted -out plain_text -d -aes-256-cbc \
-K 000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f \
-iv 00000000000000000000000000000000 -nopad
$ hexdump -C plain_text
00000000 65 6e 63 72 79 70 74 20 6d 65 06 06 06 06 06 06 |encrypt me......|
00000010
$
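If you decrypt with no padding option, stripping those trailing bytes yourself is straightforward. Here is a minimal Python sketch that works on the exact bytes from the hexdump above; pkcs7_unpad is my own name for it, not a library function.

```python
def pkcs7_unpad(data: bytes, block_size: int = 16) -> bytes:
    """Validate and remove PKCS#7 padding from a decrypted buffer."""
    if not data or len(data) % block_size != 0:
        raise ValueError("length must be a non-zero multiple of the block size")
    pad_len = data[-1]  # the last byte says how many padding bytes there are
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid padding length byte")
    if data[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("padding bytes are inconsistent")
    return data[:-pad_len]


# The 16 bytes shown by hexdump after decrypting with -nopad.
raw = bytes.fromhex("656e6372797074206d65060606060606")
print(pkcs7_unpad(raw))  # b'encrypt me'
```

If this raises on your real data, the decrypted block is probably wrong for some other reason (key, IV, or input data), not because of the padding scheme.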
Paul,
The PKCS#5 padding is needed to identify the padding in the decrypted data. For CBC, the input buffer must be a multiple of the cipher block size (16 bytes for AES). For that reason, the buffer to be encrypted is extended with additional bytes. Note that after encryption, the original size of the data is lost; PKCS#5 padding lets you recover that size. This is done by filling the extension with repeated bytes whose value equals the padding size. For example, if your cleartext buffer is 12 bytes, you need to add 4 more bytes to make it a multiple of 16, and you fill those 4 bytes with 0x04 to conform with PKCS#5 padding. (If the data is already 16 bytes, you add 16 more to make it 32.) When you decrypt, simply read the last byte of the decrypted data and subtract that number from the length of the decrypted buffer.
What you are doing is padding with '0's. Although you seem to be happy with the results, you will get a surprise when your original data ends in one or more '0's.
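The surprise with zero padding is easy to reproduce. This is a small Python sketch with hypothetical helper names (zero_pad, zero_unpad); it assumes the receiver strips all trailing zero bytes, which is the usual naive way to undo zero padding.

```python
def zero_pad(data: bytes, block_size: int = 16) -> bytes:
    """Extend data with 0x00 bytes up to a multiple of block_size."""
    pad_len = -len(data) % block_size
    return data + b"\x00" * pad_len


def zero_unpad(data: bytes) -> bytes:
    """Naive removal: strip every trailing 0x00 byte."""
    return data.rstrip(b"\x00")


original = b"header\x00\x00"  # real data that happens to end in zero bytes
recovered = zero_unpad(zero_pad(original))
print(recovered)              # b'header'
print(recovered == original)  # False: the two real zero bytes were lost
```

With PKCS#5/PKCS#7 padding the last byte always states the padding length, so this ambiguity does not arise.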
It turns out that the explanation for what I was experiencing was embarrassingly simple: I misinterpreted something I read in the previous implementation to imply that it was using a 256-bit key, but in fact it was using a 128-bit key. Make that change and all of a sudden what was obscure becomes cleartext. :-)
Passing 0 for the options argument, to invoke CBC, was in fact correct. What the reference to PKCS5 padding in the previous implementation means is still mysterious, but that doesn't matter because what I've got now works.
Thanks for the shot, indiv.