Is Java really slow?
Java has some degree of reputation for being slow.
- Is Java really slow?
- If yes, why? Where is (or was) the bottleneck? Is it because of inefficient JVMs? Garbage collection? Pure bytecode librar开发者_如何学Pythonies instead of JNI-wrapped C code? Many other languages have these features, but they don't have this reputation for slowness.
Modern Java is one of the fastest languages, even though it is still a memory hog. Java had a reputation for being slow because it used to take a long time for the VM to start up.
If you still think Java is slow, see the benchmarks game results. Tightly optimized code written in a ahead-of-time compiled language (C, Fortran, etc.) can beat it; however, Java can be more than 10x as fast as PHP, Ruby, Python, etc. There are specific areas where it can beat common compiled languages (if they use standard libraries).
There is no excuse for "slow" Java applications now. Developers and legacy code/libraries are to blame, far more than the language. Also, blame anything 'enterprise.'
In fairness to the "Java is slow" crowd, here are areas where it is still slow (updated for 2013):
Libraries are often written for "correctness" and readability, not performance. In my opinion, this is the main reason Java still has a bad reputation, especially server-side. This makes the String problems exponentially worse. Some simple mistakes are common: objects are often used in place of primitives, reducing performance and increasing memory use. Many Java libraries (including the standard ones) will create Strings frequently, rather than reusing mutable or simpler formats (char[] or StringBuffer). This is slow and creates tons of garbage to collect later. To fix this, I suggest developers use primitive collections and especially Javalution's libraries, where possible.
String operations are a bit slow. Java uses immutable, UTF-16-encoded string objects. This means you need more memory, more memory access, and some operations are more complex than with ASCII (C, C++). At the time, it was the right decision for portability, but it carries a small performance cost. UTF-8 looks like a better choice now.
Array access is a bit slower compared to C, due to bounds checks. The penalty used to be big, but is now small (Java 7 optimizes away a lot of redundant bounds checks).
Lack of arbitrary memory access can make some I/O and bit-level processing slow (compression/decompression for example). This is a safety feature of most high-level languages now.
Java uses a LOT more memory than C, and if your application is memory bound or memory bandwidth bound (caching, etc.) this makes it slower. The flipside is that allocation/deallocation is blazing fast (highly optimized). This is a feature of most high-level languages now, and due to objects and use of GC rather than explicit memory allocation. Plus bad library decisions.
Streams-based I/O is slow due to the (IMO, poor choice) to require synchronization on each stream access. NIO fixed this, but it is a pain to use. One can work around this by doing read/write to an array, instead of an element at a time.
Java doesn't provide the same low-level functionality C does, so you can't use dirty inline assembler tricks to make some operations faster. This provides portability and is a feature of most high-level languages now.
It is common to see Java applications tied to very old JVM versions. Especially server-side. These old JVMs can be incredibly inefficient, compared to the latest versions.
In the end, Java was designed to provide security and portability at the expense of some performance, and for some really demanding operations it shows. Most of its reputation for slowness is no longer deserved.
However, there are several places where Java is faster than most other languages:
Memory allocation and de-allocation are fast and cheap. I've seen cases where it is 20% FASTER (or more!) to allocate a new, multi-kB array than to reuse a cached one.
Object instantiation and object-oriented features are blazing fast to use (faster than C++ in some cases), because they're designed in from the beginning. This is partially from good GC rather than explicit allocation (which is more friendly to lots of small object allocations). One can code C that beats this (by rolling custom memory management and doing malloc efficiently), but it is not easy.
Method calls are basically free and in some cases faster than large-method code. The HotSpot compiler uses execution information to optimize method calls and has very efficient inlining. By using the additional execution information, it can sometimes outperform ahead-of-time compilers and even (in rare cases) manual inlining. Compare to C/C++ where method calls come with a small performance penalty if compiler decides not to inline.
Synchronization and multi-threading are easy and efficient. Java was designed to be thread-aware from the beginning, and it shows. Modern computers usually feature multiple cores, and because threading is built into the language, you can very easily take advantage. Basically an extra 100% to 300% speed boost vs. standard, single-threaded C code. Yes, carefully written C threading and libraries can beat this, but that's a lot of extra work for the programmer.
Strings include length: some operations are faster. This beats using null-delimited strings (common in C). In Java 7, Oracle took out the String.subString() optimization, because people were using it stupidly and getting memory leaks.
Array copy is highly optimized. In the lastest versions, Java uses hand-tuned assembler for System.arraycopy. The result is that in arraycopy/memcopy-heavy operations, I've seen my code beat the equivalent in C by reasonable margins.
The JIT compiler is smart about using L1/L2 cache. Ahead-of-time compiled programs can't tweak their code in real-time to the specific CPU & system they're running on. JIT provides some very efficient loop transformations this way.
A couple of other historical facts contributed to the "Java is slow" reputation:
- Before JIT compilation (Java 1.2/1.3), the language was only interpreted, not compiled, and thus very slow.
- JIT compilation took time to become efficient (major improvements with each version)
- Classloading has become a lot more efficient over the years. It used to be quite inefficient and slow during startup.
- Swing and UI code did not use native graphics hardware very well.
- Swing is just awful. I blame AWT and Swing for why Java never caught on for the desktop.
- Heavy use of synchronization in library classes; unsynchronized versions are now available
- Applets take forever to load, because of transmitting a full JAR over the network and loading the VM to boot.
- Synchronization used to carry a heavy performance penalty (this has been optimized with each Java version). Reflection is still costly, though.
Initially Java was not particularly fast, but it is not overly slow either. These days, Java is very fast. From the people I've talked to the impression of Java being slow comes from two things:
Slow VM startup time. The early Java implementation took a long time to start up and load the require libraries and the application compared to native applications.
Slow UI. Early Swing was slow. It probably did not help that most Windows users found the default Metal L&F ugly either.
Given the above points, it's no wonder people got the 'Java is slow' impression.
For users or developers used to developing native applications, or even Visual Basic applications, these two points are the most visible thing in an application, and it is the first impression you will get about an application (unless it's a non-GUI application in which case only the 1. applies.).
You will not convince a user that "it executes code very fast" when the application takes 8 seconds to start vs. his old Visual Basic application that starts immediately - even though code execution and startup time might not be connected at all.
Ruining the first impression is a great way to start rumors and myths. And rumors and myths are hard to kill.
In short, Java is not slow. People having the "Java is slow attitude" is based on first impressions of Java more than 10 years ago.
After reading a page full of comments saying Java is not slow, I just have to answer with a differing opinion.
Slowness of a language is a lot dependent on what your expectations are for 'fast'. If you consider C# to be fast, Java surely is fast too. If your problem domain is related to databases, or semi real-time processing, Java is surely fast enough too. If you are happy to scale your application by adding more hardware, Java is likely fast for you. If you consider that a constant factor speedup in the scale of 5-10 isn't ofter worth it, you likely consider Java fast.
If you do numerical computation on large data sets, or are bound to an execution environment, where CPU resources are limited, where a constant speedup in the scale of 5-10 would be huge. Even a 0.5 speed up might mean a 500 hour reduction for the computation to complete. In these cases, Java just does not allow you to get that last yard of performance, and you'd likely consider Java to be slow.
You seem to be asking two rather different questions:
- Is Java really slow, and if so why?
- Why is Java perceived as slow, even though it's faster than many alternatives?
The first of these is more or less a "how long is a rope" kind of question. It comes down to your definition of "slow". Compared to a pure interpreter, Java is extremely fast. Compared to other languages that are (normally) compiled to some sort of bytecode, then dynamically compiled to machine code (e.g. C# or anything else on .NET) Java is roughly on a par. Compared to the languages that are normally compiled to pure machine code, and have (often large) teams of people working on nothing but improving their optimizers (e.g. C, C++, Fortran, Ada) Java does pretty well at a few things, but overall tends to be at least somewhat slower.
A lot of this is related primarily to the implementation -- basically, it comes down to the fact that a user is waiting while a dynamic/JIT compiler runs, so unless you have a program that runs for quite a while to start with, it's hard to justify having the compiler spend a lot of time on difficult optimizations. Therefore, most Java (and C#, etc.) compilers don't put a lot of effort into really difficult optimizations. In a lot of cases, it's less about what optimizations are done, than where they're applied. Many optimization problems are NP complete, so the time they take grows quickly with the size of problem being attacked. One way to keep time within reason is to only apply the optimization to something like a single function at a time. When it's only the developer waiting for the compiler, you can afford to take a lot longer, and apply that same optimization to much larger chunks of the program. Likewise, the code for some optimizations is pretty hairy (and therefore can be pretty big). Again, since the user is waiting while that code loads (and the JVM startup time is often a significant factor in the overall time), the implementation has to balance time saved in one place vs. lost in another -- and given how little code benefits from the hairy optimizations, keeping the JVM small is usually more beneficial.
A second problem is that with Java, you frequently get a more or less "one size fits all" kind of solution. Just for example, for many Java developers Swing is essentially the only windowing library available. In something like C++, there are literally dozens of windowing libraries, application frameworks, etc., each with its own set of compromises between ease of use vs. fast execution, consistent look and feel vs. native look and feel, and so on. The only real sticking point is that some (e.g. Qt) can be quite expensive (at least for commercial use).
Third a lot of code written in C++ (and C even more so) is simply older and more mature. At lot of it contains a core of routines written decades ago, when spending extra time optimizing the code was normal, expected behavior. That often has a real benefit in code that's smaller and faster. C++ (or C) gets the credit for the code being small and fast, but it's really much more a product of the developer and the constraints of the time the code was written. To an extent, this leads to a self-fulfilling prophecy -- when people care about speed, they often select C++ because it has that reputation. They put extra time and effort into optimization, and a new generation of fast C++ code is written.
To summarize, the normal implementation of Java makes maximum optimization problematic at best. Worse, where Java is visible, such things as windowing toolkits and JVM startup time often play a larger role than the execution speed of the language itself anyway. In a lot of cases, C and C++ also get credit for what's really the product of simply working harder at optimization.
As to the second question, I think it's largely a matter of human nature at work. A few zealots make rather inflated claims about Java being blindingly fast. Somebody tries it out, and finds that even a trivial program takes a few seconds to get started, and feels slow and clumsy when it does run. Few probably bother analyzing things to realize that a great deal of this is the startup time of the JVM, and the fact that when they first try things out, none of the code has been compiled yet -- some of the code is being interpreted, and some being compiled while they wait. Worse, even when it runs fast enough, the look and feel will usually seem foreign and clumsy to most users, so even if objective measurements showed fast response times, it'd still seem clumsy.
Adding those together leads to a fairly simple and natural reaction: that Java is slow, ugly and clumsy. Given the hype saying it's really fast, there's a tendency to overreact and conclude think of it as horribly slow, instead of a (more accurate) "slightly slower, and that mostly under specific circumstances." This is generally at its worst for a developer writing the first few programs in the language. Execution of a "hello world" program in most languages appears instantaneous, but in Java there's an easily perceptible pause as the JVM starts up. Even a pure interpreter that runs much more slowly on tight loops and such will still often appear faster for code like this, simply because it can get loaded and started executing a bit sooner.
It's out-dated information from the early days (mid-to-late 1990s) of Java. Every major version of Java has introduced significant speed-ups compared to the previous version. With Oracle apparently merging JRockit with Sun's JVM for Java 7, this trend looks set to continue.
Compared to many other popular modern languages (Python, Ruby, PHP), Java is actually significantly faster for most uses. It doesn't quite match C or C++ but for many tasks it's close enough. The real performance concerns ought to be about how much memory it ends up using.
The main culprit in the "long startup time" is dynamic linking. A Java application consists of compiled classes. Each class references other classes (for argument types, method invocations...) by name. The JVM must examine and match those names upon startup. It does so incrementally, doing only the parts that it needs at any given time, but that is still some work to do.
In a C application, that linking phase occurs at the end of the compilation. It is slow, especially for big applications, but only the developer sees it. Linking yields an executable file which the OS simply has to load in RAM "as is".
In Java, linking occurs every single time that the application is run. Hence the long startup time.
Various optimizations have been applied, including caching techniques, and computers get faster (and they get "more faster" than applications get "more bigger"), so the problem importance has much reduced lately; but the old prejudice remains.
As for performance afterwards, my own benchmarks on compact computations with array accesses (mostly hash functions and other cryptographic algorithms) usually show that optimized C code is about 3x faster than Java code; sometimes C is only 30% faster than Java, sometimes C can be 4x faster, depending on the implemented algorithm. I saw a 10x factor when the "C" code was actually assembly for big integer arithmetics, due to the 64x64->128 multiplication opcodes that the processor offers but Java cannot use because its longest integer type is the 64-bit long
. This is an edge case. Under practical conditions, I/O and memory bandwidth considerations prevent C code from being really three times faster than Java.
Java is definitely slow, especially for quantitative work.
I use a combination of R, Python and C/C++ with optimised multithreaded ATLAS libraries. In each of these languages I can matrix multiply a 3000 by 3000 matrix of doubles with itself in around 4 seconds. Using Colt and Parallel Colt in Java, the same operation take 185 seconds! Astonishing despite these java libraries being parallel in nature.
So all in all, pure Java is unsuitable for quantitative work. Jblas seems to be the best linear algebra library for Java as it uses ATLAS.
My machine is an HP Core 2 Duo with 3 GB of RAM. I use 64-bit Ubuntu 10.04 (Lucid Lynx).
For most people's experience interacting with it - Java is slow. We've all seen that coffee cup spinning around on our browser before some applet comes up. It takes a while to spin up the JVM and download the applet binaries, and that impacts the user experience in a way that is noticed.
It doesn't help that the slow JVM spin-up and applet download time is conspicuously branded with a Java coffee cup, so people associate the wait with Java. When Flash takes a long time to load, the branding of the "loading" message is specified by the Flash developer, so people don't blame Flash technology as a whole.
All of this has nothing to do with Java's performance on a server, or in the many other ways that Java gets used outside the browser. But it's what people see, and what non-Java developers remember when they think about Java.
Java has the reputation of being slow because it was slow. The first versions of Java had either no or rather poor Just In Time compilation. This meant that the code, although bytecode, was being interpreted, so even for the simplest operations (like adding two integers) the machine had to do all sorts of compares and pointer dereferences and function calls. The JIT compiler has been ever-improving; now it's at the point where if I write C++ code carelessly and Java code carelessly, Java will sometimes outperform C++ because the JIT compiler realizes that I've got some unnecessary pointer dereferencing and will take care of it for me.
If you want to see just how big a difference JIT compilation makes, check out the interpreted vs. non-interpreted benchmarks at the Computer Languages Benchmark Game. (Pidigits uses an external library to do all the computations, so that benchmark doesn't change; the others show a 6-16x speedup!)
So, that's the main reason. There are a variety of other, lesser reasons that did not help matters: originally, Java startup time was slow (now fixed); web apps in Java take a long time to download (much less true now with widely accessible broadband, and with large things like movies being expected); the UI Swing was not (and still is not) written with performance in mind so it is much less snappy than equivalents in e.g. C++.
Java WAS slow, back in the day. It has become much faster, due to a few generations of performance enhancements. Last I heard, it is usually within 10% of C# speed -- sometimes faster, sometimes slower.
Java applet startup is still slow because you've got to start a whole JVM, which has to load all it's classes. Somewhat like booting another computer. Once the JVM starts it is quite fast, but the startup is usually what people remember.
Also, there are at least a few people that will never believe in the viability of Java.
Stefano:
I have been with Java since the beginning, so from my point of view the fame of being slow was created by non-responsive and slow GUI frontends (AWT, and then Swing) and in Applets probably because of the additional slow startup times of the VM's.
Java has stipulated and promoted a lot of research in the VM area, and there have been quite some improvements, including the garbage collection (you can tune a lot of things actually; however, often I see systems where only defaults are used) and hotspot optimization (which at the beginning and probably still is more efficient on the server side).
Java at the backend and the computational level is not that slow. Colt is one of the best examples:
The latest stable Colt release breaks the 1.9 Gflop/s barrier on JDK ibm-1.4.1, RedHat 9.0, 2x IntelXeon@2.8 GHz.
There are many things outside mainstream Java that should be considered, like Realtime Java or special mechanisms to enhance the speed like Javolution, as well as Ahead-Of-Time compilation (like gcj). Also, there are IC's that can execute Java Bytecode directly, like for example the one that is in the current iPhones and iPods ARM Jazelle.
I think that generally today it's a political decision (like no Java support on the iPhone/iPod), and a decision against Java as a language (because many think it is too verbose).
However, there are many other languages for the Java VM nowadays (e.g. Python, Ruby, JavaScript, Groovy, Scala etc.) which may be an alternative.
Personally I continue to enjoy it as a flexible and reliable platform, with excellent tooling and library availability, that allows one to work with everything from the smallest device (e.g. JavaCard) to the largest servers.
A hammer is much slower at rolling out dough than many other tools. Doesn't make the hammer "slower", nor less useful for the tasks that it is designed to do.
As a general programming language, Java is on par with many (if not most) for a wide array of programming tasks. There are specific, trivial tests for which Java will not outperform hand-coded solutions in less sophisticated languages, ones that are "closer to the metal".
But when it comes to "real world applications", Java often is the Right Tool. Now, that said, nothing will stop developers from making a slow-performing solution using ANY tool. Misuse of tool is a well known problem (just look at PHP's and VB's reputations). However, Java's (mostly) clean design and syntax does do a lot to reduce misuse.
Java is a high-level language and its reputation nowadays is to have performance on par with other, comparable high-level languages.
It has dynamic binding semantics. Compared to C++ where non-virtual methods are compiled as function calls, even the best Java compiler in the world has to produce code that is less efficient. But it's also a cleaner, more high-level semantic.
I do not remember the details, but I heard in the early days of Java that there was a mutex per Java object, to be acquired and released by each method. That tends to make it better adapted to concurrency, although unfortunately just a mutex per object will not protect you from races or deadlocks or any of the bad things that can happen in concurrent programs. That part, if true, is a little naive, but it came from good intentions. Feel free to fill me in on the details if you know more about this aspect.
Another way in which Java is a high-level language is by having Garbage-Collection. Garbage-Collection may be slower than
malloc
andfree
for programs that allocate at once all the memory they need and work with that. The problem is, in languages that do not have Garbage-Collection, programmers tend to write only programs that allocate all the memory they need at once and fail if it turns out some arbitrary maximum size constant has been overflown. So the comparison is apples to oranges. When programmers make the effort to write and debug programs with dynamic allocation of chained structures in non-GC languages, they sometimes find that their programs are no longer faster than in a GC language, becausemalloc
andfree
are not free! They have overhead too... Plus, not having a GC forces to specify who frees what, and having to specify who frees what in turn sometime forces you to make copies — when several functions are going to need the data and it's not clear which will be using it last — whereas copying wouldn't have been necessary in a GC language.
In the mid-nineties when Java hit the mainstream, C++ was the dominant language, and the web was still fairly new. Also, JVMs and GC were relatively new concepts in mainstream development. The early JVMs were kind of slow (compared to C++ running on bare metal) and also suffered from sometimes long garbage collection pauses, which led to a reputation of Java being slow.
Many Java desktop apps (these times: things like Eclipse) have bad GUI responsiveness, probably due to the high memory consumption and the fact that classloader can do lots of IO. It's improving but was worse few years ago.
Many (most) people like to do generalizations so they say "Java is slow" because they perceive the apps are slow when they interact with them.
The major problem with java applications is that it is huge due to the large size of the stock runtime library. Huge programs fill a lot in memory and tend to swap, meaning they become slow.
The reason the Sun JVM is large is because it has a very good byte code interpreter which works by keeping track of a lot of things. That means much data, which means memory.
You may want to look at the jamvm virtual machine which is an reasonably fast interpreter (no native code) and very small. It even starts up fast.
As Pascal says, Java is on par with other high-level languages. However, as someone who worked with the original JVMs on Windows 98, at the time the level of abstraction provided by the Java virtual machine was, shall we say, painful.
Basically, it was software emulation with little or no optimization that we take for granted today in the JVM.
People normally trot out the "it's interpreted" line. Because once upon a time, it was, and bad press gets handed down by people who dumped Java as 'too slow' and never returned to test newer versions.
Or perhaps "people are idiots" is a better answer.
I think some day, maybe not in the too-near future, JIT-compiled languages will outperform compiled languages in any aspect (well, maybe not startup time/memory consumption) due to the fact that JIT-compilers can make heavy use of runtime behaviour and the platform they're running on.
精彩评论