Question on heavy mathematical calculations in Java
I am using Java in a project that does a lot of mathematical calculations. In the next iteration of the project, more calculations will be added. From my knowledge of Java, I suspect that this will cause performance issues. Is it wise to delegate the heavy calculations to a lower-level language like Fortran or C? I can fire native calls to communicate with the lower-level code, and Java will take back control once the calculations have been performed by Fortran or C. Will this improve performance?
Be careful not to underestimate modern Java VMs. The first generation ones were awesomely slow, especially at floating-point arithmetic, but modern ones are very quick indeed.
Having said that, other options are probably going to be quicker.
To be sure, you should run some benchmarks. Don't assume one is going to be faster than the other; get some concrete performance measurements and make your decision on that basis.
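As a rough illustration of what "get some concrete measurements" could look like, here is a minimal hand-rolled timing sketch. The workload() method is a made-up stand-in for your real calculation, and the warm-up and run counts are arbitrary assumptions. A dedicated harness such as JMH is likely to give more trustworthy numbers, since hand-written micro-benchmarks are easily distorted by JIT warm-up and dead-code elimination.

```java
import java.util.function.DoubleSupplier;

final class TimingSketch {
    // Hypothetical workload standing in for the real calculation.
    static double workload(int n) {
        double sum = 0.0;
        for (int i = 1; i <= n; i++) {
            sum += Math.sqrt(i) * Math.sin(i);
        }
        return sum;
    }

    // Returns the best wall-clock time in nanoseconds over the measured runs.
    static long bestNanos(DoubleSupplier task, int warmupRuns, int measuredRuns) {
        double sink = 0.0; // accumulate results so the JIT cannot discard the work
        for (int i = 0; i < warmupRuns; i++) {
            sink += task.getAsDouble(); // give the JIT a chance to compile the hot code
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            sink += task.getAsDouble();
            best = Math.min(best, System.nanoTime() - start);
        }
        if (Double.isNaN(sink)) {
            System.out.println("unexpected NaN in results");
        }
        return best;
    }

    public static void main(String[] args) {
        long ns = bestNanos(() -> workload(1_000_000), 20, 10);
        System.out.printf("best run: %.3f ms%n", ns / 1e6);
    }
}
```

Timing the equivalent C or Fortran routine the same way, including the cost of the native call itself, would give you a like-for-like comparison.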
Also consider whether the extra performance (if any) of a "native" solution is worth the extra hassle of writing it and integrating it.
Integer and floating-point math in Java is handed right down to the hardware, and the calculations themselves are in principle no slower than in, say, C or Fortran. The library routines for math functions (sin(), sqrt(), log(), etc.) are in fact implemented in C, so again there's no good reason to look to other libraries.
There's some information I wish your question gave us. You mention that there's a lot of calculation going on, which is number crunching, but you don't tell us anything about how those numbers are organized and accessed. That is probably interesting and useful information.
If you're using intricate object structures to hold your data, accessing those structures is going to take time. If your results create new objects, that's also expensive. Arrays are objects too. Multi-dimensional arrays in Java are arrays of arrays, so indexing through multiple dimensions means following object references, which can be slower than the equivalent access in other languages. Though I don't have benchmarks to prove it, I suspect you might be better off replacing multi-dimensional arrays with one-dimensional arrays and a bit of "manual" index calculation. You are certainly better off using fixed-size arrays, perhaps dimensioned with a bit of slack, rather than creating and discarding new arrays for each calculation. Finally, many of the object-oriented tricks that make a program's structure more "elegant" and "flexible" tend to introduce a lot of unnecessary object orientation, with attendant slowdowns. Primitive but simple is usually faster.
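To make the one-dimensional-array idea concrete, here is a minimal sketch. The FlatMatrix class and its method names are invented for illustration. It stores a matrix row by row in a single double[] and writes results into a buffer supplied by the caller, so nothing is allocated per calculation. Whether this actually beats double[][] in your case is something only a measurement can tell you.

```java
final class FlatMatrix {
    final int rows;
    final int cols;
    final double[] data; // row-major layout: element (r, c) lives at r * cols + c

    FlatMatrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    double get(int r, int c) {
        return data[r * cols + c];
    }

    void set(int r, int c, double value) {
        data[r * cols + c] = value;
    }

    // Computes y = A * x into a caller-supplied buffer; no array is created per call.
    void multiply(double[] x, double[] y) {
        for (int r = 0; r < rows; r++) {
            int base = r * cols; // start of row r in the flat array
            double sum = 0.0;
            for (int c = 0; c < cols; c++) {
                sum += data[base + c] * x[c];
            }
            y[r] = sum;
        }
    }

    public static void main(String[] args) {
        FlatMatrix a = new FlatMatrix(2, 3);
        for (int r = 0; r < 2; r++) {
            for (int c = 0; c < 3; c++) {
                a.set(r, c, r * 3 + c + 1); // fills the matrix with 1..6
            }
        }
        double[] y = new double[2]; // allocated once and reused across calls
        a.multiply(new double[] {1.0, 1.0, 1.0}, y);
        System.out.println(y[0] + " " + y[1]);
    }
}
```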
A very simple optimization might be to use the -server option of your JVM (if that's available) to get the benefit of more aggressive compilation, if you're not already doing that.
I second other folks' recommendation that you profile your calculations, though, before you go blindly re-architecting your program. There may be bottlenecks in surprising places.
Could you make use of parallel algorithms? It may or may not be applicable in your case, but I thought it worth pointing out.
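If the calculations are independent per element, one low-effort way to try this is Arrays.parallelSetAll, which spreads the work over the common fork/join pool. This is only a sketch and assumes Java 8 or later. The sqrtAll helper is a made-up example, and for small arrays the parallel overhead can easily outweigh the gain.

```java
import java.util.Arrays;

final class ParallelSketch {
    // Fills out[i] = sqrt(in[i]); each element is independent, so the
    // indices can be processed in parallel without coordination.
    static void sqrtAll(double[] in, double[] out) {
        Arrays.parallelSetAll(out, i -> Math.sqrt(in[i]));
    }

    public static void main(String[] args) {
        double[] in = {1.0, 4.0, 9.0, 16.0};
        double[] out = new double[in.length];
        sqrtAll(in, out);
        System.out.println(Arrays.toString(out));
    }
}
```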
It depends on two factors:
- You should remember that JNI calls have a cost. If you can move your whole calculation to C, that overhead can become negligible, and you may gain some performance. Otherwise, if you're going to go back and forth between C and Java, you don't stand much of a chance of improving your performance.
- Say you'll have a function f() that does the calculation. You should first determine if the performance you can get from C is indeed superior to that in Java. I vaguely recall some article benchmarking C and Java that actually claimed that Java did a better job in terms of mathematical calculations. But in any case I'd benchmark both - at least a subset of them.
You should also play with various VM parameters: run your program in server mode so that the JIT produces better code, experiment with different garbage collectors, and turn on escape analysis (it might be better to use JDK 7 for that).
This paper may help you tune your program to use the best of JVM.
If you decide to go the native route, use JNA; it's much easier, especially if all your calculations will be in one method call.
All the comments so far are excellent. I would agree that native code should be a last step.
Parallelizing would be worth investigating.
So would using another algorithm, depending on what you're doing.
No one has suggested that you profile the code yet to find out where the time is being spent. Before embarking on any modifications I'd get some data to find out exactly what's being done and where. It'll guide your decisions better than guessing.
I would suggest you wait and see. Implement it in Java as you normally would, and then measure the performance: is it good, acceptable or bad?
As skaffman said, modern JVMs have high performance, and thanks to the JIT compiler they're in many cases faster than native C.
This article shows some comparisons between Java and C calculations; it's also an example showing that using the latest JVM version is a good idea performance-wise.
Beware of wrapper types. For performance you'd better use primitive types, as boxing and unboxing take CPU cycles and burden the garbage collector. Collections of boxed values in particular carry a lot of overhead. For collections, performance-wise, primitive arrays or a ByteBuffer is what I'd use.
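To show where the boxing happens, here is a small sketch with invented class and method names. Both methods compute the same sum, but the List<Double> version holds one Double object per element and unboxes it on every iteration, while the double[] version works on raw values stored contiguously.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxingSketch {
    // Each element is a separate Double object; the loop unboxes every one.
    static double sumBoxed(List<Double> values) {
        double sum = 0.0;
        for (Double v : values) {
            sum += v; // implicit unboxing via doubleValue()
        }
        return sum;
    }

    // Primitive array: no per-element objects and no unboxing.
    static double sumPrimitive(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 10;
        List<Double> boxed = new ArrayList<>();
        double[] primitive = new double[n];
        for (int i = 0; i < n; i++) {
            boxed.add((double) i); // autoboxing wraps each value in a Double
            primitive[i] = i;
        }
        System.out.println(sumBoxed(boxed) + " " + sumPrimitive(primitive));
    }
}
```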