Java performance in numerical algorithms

I am curious about the performance of Java numerical algorithms, for example double-precision matrix-matrix multiplication, on the latest JIT-enabled JVMs, compared with hand-tuned SSE C++/assembler or Fortran counterparts.

I have looked on the web, but most of the results are almost 10 years old, and I understand Java has progressed quite a lot since then.

If you have experience using Java for numerically intensive applications, can you share it? Also, how well does Java perform in kernels where the loops are relatively short and the memory access is not very uniform but still within the limits of the L1 cache? If such a kernel is executed many times in succession, can the JVM optimize it at runtime?

Thanks


I have written some reasonably large and performance sensitive numerical code in Java (crunching large arrays of doubles usually).

I've found Java to be "good enough" for fast numerical calculations, especially when you consider that you are usually not CPU-bound anyway: memory latency and cache awareness will probably be your biggest problems for large datasets.

However, you can still beat Java with hand-optimized C/C++ code that takes advantage of specific vectorised instructions etc. or highly customised memory layouts. So for the very fastest code, you could consider writing the core algorithm in C/C++ and calling it from Java using JNI.
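As a rough illustration of the JNI route, here is a minimal sketch of the Java side only. The library name `fastkernel` and the method `nativeDot` are hypothetical, made up for this example; the C/C++ implementation is not shown. The sketch falls back to a pure Java loop when the native library cannot be loaded, so it still runs without any native code present.

```java
// Sketch only: "fastkernel" and nativeDot are hypothetical names, not a real library.
class NativeKernel {
    // True only if the (hypothetical) native library was found and loaded.
    private static final boolean NATIVE_AVAILABLE = tryLoad();

    private static boolean tryLoad() {
        try {
            System.loadLibrary("fastkernel"); // hypothetical library name
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // no native library: use the pure Java path
        }
    }

    // Would be implemented in C/C++ inside the hypothetical native library.
    private static native double nativeDot(double[] x, double[] y);

    // Pure Java fallback with the same contract.
    private static double javaDot(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    // Dot product of two equal-length arrays, native if available.
    static double dot(double[] x, double[] y) {
        return NATIVE_AVAILABLE ? nativeDot(x, y) : javaDot(x, y);
    }

    public static void main(String[] args) {
        System.out.println(dot(new double[] {1, 2}, new double[] {3, 4}));
    }
}
```

Each JNI call has some fixed overhead, so this pattern tends to pay off only when a single call does a substantial amount of work.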

Personally, I find that creating a native code dependency is usually more trouble than it is worth so I tend to stick to the pure Java approach.


This is coming from a .NET side of things, but I'm 90% sure that it's the case for Java too. While the JIT will make some use of SSE instructions where it can, it currently does not auto-vectorize your code when dealing with, for example, matrix multiplications. Hand vectorized C++ using compiler intrinsics/inline assembly will definitely be faster here.


One of the weakest points in Java is (native) matrix operations. This is due to the nature of Java matrices:

  • You cannot declare a matrix to be rectangular, i.e. each row can have a different number of columns.

  • A matrix is technically not a "matrix of doubles (or ints, ...)", but an array of arrays of ... . The big difference is that, since arrays are Java objects, you can assign the same array object to more than one row.

These two properties make a lot of standard matrix optimizations impossible for the compiler.
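Both properties are easy to see in a few lines. The class and method names below are made up for illustration; the behaviour itself follows from `double[][]` being an array of references to `double[]` objects.

```java
// Illustrative names; not from any library.
class JaggedArrayDemo {
    // A "matrix" whose rows have different lengths (1, 2 and 3 columns).
    static double[][] jagged() {
        double[][] m = new double[3][];
        m[0] = new double[1];
        m[1] = new double[2];
        m[2] = new double[3];
        return m;
    }

    // A 2x2 "matrix" whose two rows are the same array object.
    static double[][] aliased() {
        double[] row = {1.0, 2.0};
        return new double[][] {row, row};
    }

    public static void main(String[] args) {
        double[][] j = jagged();
        System.out.println(j[0].length + " " + j[1].length + " " + j[2].length); // prints "1 2 3"

        double[][] a = aliased();
        a[0][0] = 42.0;              // a write through row 0 ...
        System.out.println(a[1][0]); // ... is visible through row 1: prints "42.0"
    }
}
```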

You might get better performance by using a Java library which emulates matrices on a single long array. However, you then have the overhead of a method call for every element access.
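A minimal sketch of the single-array idea, assuming row-major storage (element (i, j) of a matrix with `cols` columns lives at index `i * cols + j`). `FlatMatrix` is a hypothetical name, not a real library class. Here the index arithmetic is written inline rather than behind accessor methods, which is one way to sidestep the per-element call overhead mentioned above, at the cost of less readable code.

```java
// Hypothetical helper; a sketch, not a tuned implementation.
class FlatMatrix {
    // Returns c = a * b, where a is n x m and b is m x p,
    // all stored row-major in one-dimensional arrays.
    static double[] multiply(double[] a, double[] b, int n, int m, int p) {
        double[] c = new double[n * p];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < m; k++) {
                double aik = a[i * m + k];
                // The inner loop walks b and c contiguously in memory.
                for (int j = 0; j < p; j++) {
                    c[i * p + j] += aik * b[k * p + j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4}; // [[1, 2], [3, 4]]
        double[] b = {5, 6, 7, 8}; // [[5, 6], [7, 8]]
        double[] c = multiply(a, b, 2, 2, 2);
        System.out.println(java.util.Arrays.toString(c)); // prints [19.0, 22.0, 43.0, 50.0]
    }
}
```

The i-k-j loop order is chosen so that the innermost loop reads and writes consecutive elements; whether that helps in practice depends on the matrix sizes and the machine, so it is worth measuring.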


C++ will definitely be faster. You can even use hand-optimized libraries for your purposes that contain assembly code for each of the major CPUs out there. You can't get better than that.

Afterwards, you can use JNI to call to it from Java, if needed.

Java is not meant for high performance arithmetic calculations like this. If you are depending on these, I'd recommend picking a proper, low-level language to implement that. Or, alternatively, you can write the performance-specific part in a low level language, and then connect it to a Java front-end using JNI or some other IPC method.


This is a link to the programming language shootout page for Java vs C++, which will give you a comparison of Java's speed on several compute-intensive algorithms. It will also show you what the highest-performance Java code looks like. For the most part, on these few specific benchmarks, Java took more time to run (but not more than 2 or 3 times as long).


Seconding that your best bet is to test it for yourself, as performance will vary somewhat depending on what you're doing exactly. I find it difficult to believe Shane C. Mason's answer that Java performance will be the same as C++ or Fortran performance, as even C++ and Fortran are not really comparable for some scientific computing algorithms.

I have a computational fluid dynamics code that I wrote using C++ and the same code essentially translated into Fortran. I'm not really sure why yet, but the Fortran version is about twice as fast as the C++ version. I would guess that with features like bounds-checking and garbage collection, Java would be slower than both, but I would not know until I tested.


This can be so dependent on what you are doing in the C++ code.

For example, are you using the GPU? Edit: I forgot about JOGL, so Java can compete here.

Are you parallelizing using STM or shared memory? Then Java can't compete. For an analysis of parallel matrix multiplication, see: http://www.cs.utexas.edu/users/plapack/papers/ipps98/ipps98.html

Do you have enough memory to do the calculations in memory, so the garbage collector won't be needed, and have you fine-tuned the garbage collector for optimal performance? Then, Java can be competitive, perhaps.

Are you using multicores, and is the C++ optimized to take advantage of this architecture? Then Java won't be able to compete.

Are you using several computers tied together? Then Java won't be able to compete.

Are you using any combination of these? Then it will depend on the particular implementation.

Java is not designed to compete with a hand-tuned C++ program, but consider the time the tuning takes: are you doing enough calculations for it to matter? Java will give reasonable speed with less work than hand-tuning, but not much of an improvement over just writing plain C++ code.

You may want to see whether Haskell or Erlang, for example, gives an improvement over your C++, as these languages are better designed for this type of work.


Are these the kinds of computations you're interested in: Fast Fourier Transform, Jacobi Successive Over-Relaxation, Monte Carlo Integration, Sparse Matrix Multiplication, Dense LU Matrix Factorisation?

They make up the SciMark 2.0 composite benchmark which you can launch as an applet on your machine.

There are also ANSI C versions of the programs, and an Intel document (pdf) on optimizing and recompiling SciMark for C++.


Similarly you could use The Java Grande Forum Benchmark Suite and the comparison C programs.


Java uses a Just in Time (JIT) compiler to convert the bytecode to native machine language - so the first time it runs through a code block it will be slower but once the segment is 'warmed up' the performance will be equivalent. In short - the numerical performance is pretty good.
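One rough way to observe the warm-up effect on your own machine is to time the same short kernel over several rounds. The sketch below uses made-up names (`WarmupDemo`, `timeRound`) and plain `System.nanoTime()` timing, which is crude; a harness such as JMH would give more trustworthy numbers. No particular timings are claimed here, since they depend on the JVM and hardware.

```java
import java.util.Arrays;

// Crude warm-up experiment; names are illustrative only.
class WarmupDemo {
    // The short kernel under test: a dot product of two equal-length arrays.
    static double dot(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    // Calls dot() `calls` times and returns the elapsed nanoseconds.
    // The results are added into sink[0] so the JIT cannot discard the work.
    static long timeRound(double[] x, double[] y, int calls, double[] sink) {
        long start = System.nanoTime();
        double acc = 0.0;
        for (int c = 0; c < calls; c++) {
            acc += dot(x, y);
        }
        sink[0] += acc;
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        double[] x = new double[64];
        double[] y = new double[64];
        Arrays.fill(x, 1.0);
        Arrays.fill(y, 2.0);
        double[] sink = new double[1];
        // Early rounds typically include interpretation and compilation time.
        for (int round = 0; round < 10; round++) {
            long ns = timeRound(x, y, 20_000, sink);
            System.out.println("round " + round + ": " + ns + " ns");
        }
        System.out.println("checksum: " + sink[0]);
    }
}
```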
