Why is Java faster when using a JIT vs. compiling to machine code?
I have heard that Java must use a JIT to be fast. This makes perfect sense when comparing to interpretation, but why can't someone make an ahead-of-time compiler that generates fast Java code? I know about gcj, but I don't think its output is typically faster than HotSpot, for example.
Are there things about the language that make this difficult? I think it comes down to just these things:
- Reflection
- Classloading
What am I missing? If I avoid these features, would it be possible to compile Java code once to native machine code and be done?
A JIT compiler can be faster because the machine code is being generated on the exact machine that it will also execute on. This means that the JIT has the best possible information available to it to emit optimized code.
If you pre-compile bytecode into machine code, the compiler cannot optimize for the target machine(s), only the build machine.
Here is an interesting answer that James Gosling gave in the book Masterminds of Programming.
Well, I’ve heard it said that effectively you have two compilers in the Java world. You have the compiler to Java bytecode, and then you have your JIT, which basically recompiles everything specifically again. All of your scary optimizations are in the JIT.
James: Exactly. These days we’re beating the really good C and C++ compilers pretty much always. When you go to the dynamic compiler, you get two advantages when the compiler’s running right at the last moment. One is you know exactly what chipset you’re running on. So many times when people are compiling a piece of C code, they have to compile it to run on kind of the generic x86 architecture. Almost none of the binaries you get are particularly well tuned for any of them. You download the latest copy of Mozilla, and it’ll run on pretty much any Intel architecture CPU. There’s pretty much one Linux binary. It’s pretty generic, and it’s compiled with GCC, which is not a very good C compiler.
When HotSpot runs, it knows exactly what chipset you’re running on. It knows exactly how the cache works. It knows exactly how the memory hierarchy works. It knows exactly how all the pipeline interlocks work in the CPU. It knows what instruction set extensions this chip has got. It optimizes for precisely what machine you’re on. Then the other half of it is that it actually sees the application as it’s running. It’s able to have statistics that know which things are important. It’s able to inline things that a C compiler could never do. The kind of stuff that gets inlined in the Java world is pretty amazing. Then you tack onto that the way the storage management works with the modern garbage collectors. With a modern garbage collector, storage allocation is extremely fast.
The real killer for any AOT compiler is:
Class.forName(...)
This means that you cannot write an AOT compiler that covers ALL Java programs, because some information about the program's characteristics is available only at runtime. You can, however, do it for a subset of Java, which is what I believe gcj does.
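To make the Class.forName(...) point concrete, here is a minimal sketch. The class and method names (RuntimeLoadingDemo, newList) are made up for illustration and are not from the original answer. The class to instantiate arrives as a string at run time, so a compiler that sees only this source cannot tell which implementation will be loaded.

```java
import java.util.List;

// Hypothetical demo class; the names here are assumptions for illustration only.
class RuntimeLoadingDemo {

    // Creates a List whose implementation class is named only at run time.
    static List<String> newList(String className) {
        try {
            Class<?> cls = Class.forName(className);
            Object instance = cls.getDeclaredConstructor().newInstance();
            @SuppressWarnings("unchecked")
            List<String> list = (List<String>) instance;
            return list;
        } catch (ReflectiveOperationException e) {
            // Covers a missing class, a missing no-arg constructor, and so on.
            throw new IllegalStateException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // The class name could equally come from a config file or the network.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        List<String> list = newList(name);
        list.add("hello");
        System.out.println(list.getClass().getName() + " " + list);
    }
}
```

Running it with java.util.LinkedList as the argument exercises a class that the source never mentions by name.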
Another typical example is the ability of a JIT to inline methods like getX() directly into the calling methods when it finds that this is safe, and to undo the inlining if necessary, even when the programmer has not helped by declaring the method final. The JIT can see that in the running program a given method is not overridden and can therefore, in this instance, be treated as final. This might be different in the next invocation.
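A small sketch of the getX() situation follows. Point, InliningDemo, and doubling are hypothetical names, not from the original answer. Whether a particular JVM actually inlines the call depends on its version and settings; the code only sets up the scenario in which the optimization is possible and is later invalidated.

```java
// Hypothetical classes for illustration; none of these names come from the original post.
class Point {
    private final int x;

    Point(int x) { this.x = x; }

    // Deliberately not final: a subclass is allowed to override it.
    int getX() { return x; }
}

class InliningDemo {

    // Hot loop with a virtual call. While no loaded class overrides getX(),
    // a JIT may treat the call as if getX() were final and inline the field read.
    static long sumX(Point[] points) {
        long sum = 0;
        for (Point p : points) {
            sum += p.getX();
        }
        return sum;
    }

    // The anonymous subclass is loaded only when this method first runs.
    static Point doubling(int x) {
        return new Point(x) {
            @Override
            int getX() { return 2 * super.getX(); }
        };
    }

    public static void main(String[] args) {
        Point[] points = { new Point(1), new Point(2), new Point(3) };
        long total = 0;
        for (int i = 0; i < 100_000; i++) {
            total += sumX(points); // warm-up: getX() has a single implementation so far
        }
        System.out.println("warm-up total: " + total);

        // An overriding class now appears; any code compiled under the
        // "never overridden" assumption has to be discarded or guarded.
        points[0] = doubling(1);
        System.out.println("after override: " + sumX(points));
    }
}
```

An AOT compiler that must stay correct for the second half of main cannot simply assume getX() is never overridden unless it can see the whole program.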
Edit 2019: Oracle has introduced GraalVM which allows AOT compilation on a subset of Java (a quite large one, but still a subset) with the primary requirement that all code is available at compile time. This allows for millisecond startup time of web containers.
Java's JIT compiler is also lazy and adaptive.
Lazy
Being lazy, it compiles methods only when it reaches them instead of compiling the whole program (very useful if part of the program is never used). Class loading actually helps make the JIT faster by allowing it to ignore classes it hasn't come across yet.
Adaptive
Being adaptive, it emits a quick and dirty version of the machine code first, and only goes back to do a thorough job if that method is used frequently.
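The lazy and adaptive behavior can be pictured with a toy model. This is not how HotSpot is implemented; ToyTieredRuntime, its tiers, and the HOT_THRESHOLD value are invented purely to illustrate the idea of per-method call counters driving compilation decisions.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: the class, tier names, and threshold are assumptions for illustration.
class ToyTieredRuntime {

    enum Tier { NOT_COMPILED, QUICK, OPTIMIZED }

    // Made-up threshold; real JVMs use their own, more elaborate heuristics.
    static final int HOT_THRESHOLD = 1000;

    private final Map<String, Integer> callCounts = new HashMap<>();
    private final Map<String, Tier> tiers = new HashMap<>();

    // Methods that were never called stay uncompiled (the "lazy" part).
    Tier tierOf(String method) {
        return tiers.getOrDefault(method, Tier.NOT_COMPILED);
    }

    // Records one call and returns the tier the method is in afterwards.
    Tier invoke(String method) {
        int calls = callCounts.merge(method, 1, Integer::sum);
        if (calls == 1) {
            tiers.put(method, Tier.QUICK);      // first use: quick and dirty code
        } else if (calls == HOT_THRESHOLD) {
            tiers.put(method, Tier.OPTIMIZED);  // frequently used: thorough recompile
        }
        return tierOf(method);
    }

    public static void main(String[] args) {
        ToyTieredRuntime runtime = new ToyTieredRuntime();
        for (int i = 0; i < HOT_THRESHOLD; i++) {
            runtime.invoke("hotLoop");
        }
        runtime.invoke("calledOnce");
        System.out.println("hotLoop:     " + runtime.tierOf("hotLoop"));
        System.out.println("calledOnce:  " + runtime.tierOf("calledOnce"));
        System.out.println("neverCalled: " + runtime.tierOf("neverCalled"));
    }
}
```

The point of the model is that compilation effort is spent in proportion to how often a method actually runs, which an AOT compiler cannot know without profile data.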
In the end it boils down to the fact that having more information enables better optimizations. In this case, the JIT has more information about the actual machine the code is running on (as Andrew mentioned) and it also has a lot of runtime information that is not available during compilation.
In theory, a JIT compiler has an advantage over AOT if it has enough time and computational resources available. For instance, if you have an enterprise app running for days and months on a multiprocessor server with plenty of RAM, the JIT compiler can produce better code than any AOT compiler.
Now, if you have a desktop app, things like fast startup and initial response time (where AOT shines) become more important, plus the computer may not have sufficient resources for the most advanced optimizations.
And if you have an embedded system with scarce resources, JIT has no chance against AOT.
However, the above was all theory. In practice, creating such an advanced JIT compiler is way more complicated than a decent AOT one. How about some practical evidence?
Java's ability to inline across virtual method boundaries and perform efficient interface dispatch requires runtime analysis before compiling - in other words it requires a JIT. Since all methods are virtual and interfaces are used "everywhere", it makes a big difference.
JITs can identify and eliminate some conditions which can only be known at runtime. A prime example is the elimination of virtual calls that modern VMs perform: when the JVM finds an invokevirtual or invokeinterface instruction, and only one class overriding the invoked method has been loaded, the VM can actually make that virtual call static and is thus able to inline it. To a C program, on the other hand, a function pointer is always a function pointer, and a call to it can't be inlined (in the general case, anyway).
Here's a situation where the JVM is able to inline a virtual call:
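interface I {
    // Chosen once, at class initialization, from a system property.
    I INSTANCE = Boolean.getBoolean("someCondition") ? new A() : new B();
    void doIt();
}
class A implements I {
    @Override
    public void doIt() { /* ... */ }
}
class B implements I {
    @Override
    public void doIt() { /* ... */ }
}
// later...
I.INSTANCE.doIt();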
Assuming we don't go around creating A or B instances elsewhere and that someCondition is set to true, the JVM knows that the call to doIt() always means A.doIt, and can therefore avoid the method table lookup, and then inline the call. A similar construct in a non-JITted environment would not be inlinable.
I think the fact that the official Java compiler is a JIT compiler is a large part of this. How much time has been spent optimizing the JVM vs. a machine code compiler for Java?
Dimitry Leskov is absolutely right here.
All of the above is just theory about what could make a JIT faster; implementing every scenario is almost impossible. Besides, because x86_64 CPUs have only a handful of instruction set extensions, there is very little to gain by targeting the exact instruction set of the current CPU. My rule is to target x86_64 with SSE4.2 when building performance-critical applications in native code.
Java's fundamental structure imposes a ton of limitations. JNI can help show you just how inefficient it is; the JIT only sugarcoats this by making things faster overall. Besides the fact that every method is virtual by default, Java also keeps class types around at runtime, unlike C++ for example. C++ has a great advantage here when it comes to performance, because no class object has to be loaded at runtime; it's all blocks of data that get allocated in memory and are initialized only when requested. In other words, C++ doesn't have class types at runtime. Java classes are actual objects, not just templates. I'm not going to go into GC because that's irrelevant. Java strings are also slower because they use dynamic string pooling, which requires the runtime to search the pool table each time.
Many of these things are due to the fact that Java wasn't built to be fast in the first place, so its foundation will always be slow. Most native languages (primarily C/C++) were specifically built to be lean and mean, with no waste of memory or resources. The first few versions of Java were in fact terribly slow and wasteful of memory, with lots of unnecessary metadata for variables and whatnot. As it stands today, a JIT producing faster code than AOT-compiled languages remains a theory.
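For context on the string pooling remark, the sketch below shows when the pool is involved. As far as the language specification goes, string literals (and strings passed through intern()) refer to pooled instances, while strings built at run time are ordinary objects until intern() is called on them. StringPoolDemo and sameObject are hypothetical names used only for this illustration.

```java
// Hypothetical demo class; the names are assumptions for illustration only.
class StringPoolDemo {

    // Reference comparison on purpose: true only if both refer to the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "jit";
        String sameLiteral = "jit";
        // Built at run time, so it is a fresh object and not a pooled one.
        String built = new StringBuilder("ji").append('t').toString();

        System.out.println("literal vs literal:  " + sameObject(literal, sameLiteral));
        System.out.println("literal vs built:    " + sameObject(literal, built));
        // intern() performs the pool lookup explicitly.
        System.out.println("literal vs interned: " + sameObject(literal, built.intern()));
        System.out.println("equals():            " + literal.equals(built));
    }
}
```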
Think about all the bookkeeping the JIT has to do to be lazy: incrementing a counter each time a function is called, checking how many times it has been called, and so on. Running the JIT takes a lot of time. The tradeoff, in my eyes, is not worth it. And this is just on PC.
Ever tried to run Java on a Raspberry Pi or other embedded devices? Absolutely terrible performance. JavaFX on a Raspberry Pi? Not even functional... Java and its JIT are very far from delivering everything they advertise, and from the theory people blindly spew out about them.