What is the variance of Java .class files across different compilers, versions, and dependencies?
Hi, I was wondering how much Java class files change across different compilers. How much do the actual bytes change if a .java file is compiled by, say, Sun JDK 1.4, 1.5, 1.6, or even an IBM JDK? I know that class files can differ with regard to debug information and obfuscation, but let's assume for this question that those options are the same: debug information included, no obfuscation. If I ran an MD5 or SHA-1 hash on a .class file compiled by JDK 1.4, would the hash be different if I compiled it with JDK 1.5 but targeting 1.4? What about when targeting JDK 1.5?
Related to that, does the binary of a class file change when different dependencies are used? Or, asked differently, can the binary of a class file change based on its dependencies?
And last but not least, are there programmatic ways to analyse the metadata of a .class file in order to identify the compiler version and/or the switches that were used when compiling it?
Java compilers have considerable freedom when creating classes and bytecode from source. They can reorder the methods, reorder the constant pool (which holds class names, method names and strings; since instructions refer to pool entries by index, this changes the method bytecode too) and reorder the actual bytecode instructions, as long as the result of executing them is the same.
So, using MD5 or similar hashes to prove that two class files came from the same source is not really sensible.
For the format of class files, see http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html
Yes, class files can and usually do change depending on which specific compiler is used to build them. Many compiler implementation details result in different bytecode, e.g. listing entries in a different order in the interfaces[] or fields[] arrays. Compilers are also free to apply different optimizations.
Adding or removing an "import" statement does not necessarily change the class file -- but using a class in one package instead of another certainly would. Not sure if this answers your 2nd question.
I don't believe compilers leave their identity in class files. Any such analysis would need to be indirect and most likely heuristic (along the lines of identifying the author of a book by its style), unless you have the source code and can compile it with each compiler and compare the results.
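One piece of metadata that can be read programmatically is the class file format version stored in the header, right after the 0xCAFEBABE magic number. It reflects the target version (for example major 48 for 1.4, 49 for 5, 50 for 6), not which compiler produced the file. A minimal sketch, where the class name ClassFileVersion and the method readVersion are made up for illustration:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ClassFileVersion {
    /**
     * Reads the class file header and returns {minor, major}.
     * Hypothetical helper, not part of any library.
     */
    static int[] readVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        // Every class file starts with the 4-byte magic number 0xCAFEBABE.
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("Not a class file: bad magic number");
        }
        // The header stores minor_version first, then major_version,
        // each as an unsigned 16-bit big-endian value.
        int minor = data.readUnsignedShort();
        int major = data.readUnsignedShort();
        return new int[] { minor, major };
    }
}
```

Passing a FileInputStream opened on a .class file to readVersion would give the version pair for that file; anything beyond that (which compiler, which switches) is not recorded in this header.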
Paŭlo has answered your question about hashing well. As for your other question:
Also related to that, does a binary of a class file change when different dependencies are used, or asked differently, can the binary of a class file change based on its dependencies?
Yes. The class file contains signatures for all methods invoked, and these could have changed. Consider:
    void test() {
        Foo.bar(1, 2);
    }
where Foo in version 1 is defined by:
    class Foo {
        public static void bar(int x, int y) {
            // do something
        }
    }
and in version 2 by:
    class Foo {
        public static <T> T bar(T... ts) {
            // do something
            return null; // placeholder so the method compiles
        }
    }
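The two versions of Foo give bar different method descriptors, and the call site in test() embeds the descriptor of whichever version it was compiled against, so the bytes of the caller's class file change even though its source did not. As a small sketch (the class name DescriptorDemo is made up for illustration), java.lang.invoke.MethodType can print the descriptor strings for each version; the second one assumes the usual erasure of T... to Object[] and of T to Object:

```java
import java.lang.invoke.MethodType;

final class DescriptorDemo {
    // Descriptor of version 1: static void bar(int x, int y)
    static String version1() {
        return MethodType.methodType(void.class, int.class, int.class)
                .toMethodDescriptorString();
    }

    // Descriptor of version 2 after erasure: static Object bar(Object[] ts)
    static String version2() {
        return MethodType.methodType(Object.class, Object[].class)
                .toMethodDescriptorString();
    }
}
```

Here version1() yields "(II)V" and version2() yields "([Ljava/lang/Object;)Ljava/lang/Object;". Running javap -c on the compiled caller should show the matching descriptor in its invokestatic instruction.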