How can I defeat RC4-like obfuscation?
I am trying to read data files generated by a program originally written in Visual Basic 6 (and later rewritten in Java) so I can process them using my own tools.
The program in question is public domain software created by the U.S. Government; there is no license agreement that prohibits this. I cannot mention the program's name or link to its web site because the programmer probably would change the obfuscation in next year's version, and I would have to repeat my reverse engineering effort.
The underlying data file format is text based, and the obfuscation is some kind of stream cipher with a hardcoded key. I can XOR data files together to get some of the data out (filling one of the string fields with a repeating ASCII character), but I would like to avoid embedding the entire keystream within my program.
Searching through the .exe file reveals a call to a subroutine named RC4ini and a string that I believe is the key (it does not appear anywhere in the user interface). I found what could be the source code to this encryption library on Planet Source Code, made the correct changes to a working implementation of RC4 (in JavaScript, as that is the programming language I mostly work in), and tried using it.
I attempted to search for the encrypted data at every offset in the file, but I did not succeed in decrypting it. Why is this happening?
If they are using RC4 you have a few options.
One option is to find where the program calls RC4 and dump the key or the plaintext message at that point. This is easy to do with a debugger such as WinDbg or OllyDbg. Fundamentally, the program has to contain the key in order to decrypt its own files, so anyone who can run the program can recover the key; all DRM fails for the same reason.
Another attack applies when the same key is used for two messages: if you know the plaintext of one message, you can XOR it with the corresponding ciphertext to reveal the keystream. That keystream can then be XORed with the ciphertext of an unknown message to obtain its plaintext. Naturally, if the key is different for each message (for example, because an IV is mixed in), this attack will not work.
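The keystream-reuse attack described above can be sketched in a few lines of Python. This is only an illustration: the keystream, the known plaintext, and the "secret" record are made-up values, and the helper names (xor_bytes, recover_keystream, decrypt_with_keystream) are hypothetical, not taken from the program in question.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_keystream(known_plain: bytes, known_cipher: bytes) -> bytes:
    """plaintext XOR ciphertext gives the keystream bytes that were used."""
    return xor_bytes(known_plain, known_cipher)


def decrypt_with_keystream(keystream: bytes, cipher: bytes) -> bytes:
    """Decrypt another message that was encrypted with the same keystream."""
    return xor_bytes(keystream, cipher)


# Made-up demo data: a stand-in keystream, NOT real RC4 output.
keystream = bytes(range(7, 39))        # 32 arbitrary bytes
known_plain = b"A" * 32                # e.g. a field filled with one repeated character
known_cipher = xor_bytes(known_plain, keystream)
secret_plain = b"name=example;value=42"
secret_cipher = xor_bytes(secret_plain, keystream)

recovered = recover_keystream(known_plain, known_cipher)
print(decrypt_with_keystream(recovered, secret_cipher))
```

Only as many keystream bytes are recovered as there are known plaintext bytes, so longer messages need a longer known plaintext.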
By searching for the suspected encryption key using Google, I found the developer had later (possibly even the same month) open-sourced the Java code, which clearly identifies the string I had found as an encryption key. Subsequently, he refactored the key into a separate class that he put in svn:ignore (and presumably changed it; I have not checked yet).
It turns out that my decryption program did not exactly match the Planet Source Code implementation. Here is Wikipedia's description of the RC4 PRGA:
    i := 0
    j := 0
    while GeneratingOutput:
        i := (i + 1) mod 256
        j := (j + S[i]) mod 256
        swap values of S[i] and S[j]
        K := S[(S[i] + S[j]) mod 256]
        output K
    endwhile
Both the Visual Basic and the Java code omitted the first two lines (i := 0 and j := 0).
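Here is a minimal Python sketch of the difference. It rests on an assumption the code above does not confirm: that omitting the two assignments means i and j keep whatever values the key-scheduling loop left in them. Under that assumption i is left at 256 (which behaves like 0 mod 256, as after a Visual Basic "For i = 0 To 255" loop), and j keeps its final key-scheduling value. The function names are hypothetical.

```python
def ksa(key: bytes):
    """Standard RC4 key scheduling; returns the state array and the final j."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    return S, j


def prga_xor(S, data: bytes, i: int = 0, j: int = 0) -> bytes:
    """XOR data with the RC4 output stream, starting from the given i and j."""
    S = list(S)  # work on a copy so the caller's state is untouched
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) % 256])
    return bytes(out)


def rc4_standard(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: i and j are reset to 0 before generating output."""
    S, _ = ksa(key)
    return prga_xor(S, data, i=0, j=0)


def rc4_no_reset(key: bytes, data: bytes) -> bytes:
    """ASSUMED variant: j carries over from key scheduling; i is 256 mod 256 = 0."""
    S, j = ksa(key)
    return prga_xor(S, data, i=0, j=j)


# Well-known RC4 test vector: key "Key", plaintext "Plaintext".
print(rc4_standard(b"Key", b"Plaintext").hex())  # bbf316e8d940af0ad3
```

Because both variants are plain XOR stream ciphers, running the same function again with the same key decrypts. If the real code leaves i or j in some other state, pass those values to prga_xor instead.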