Java or Python for math? [closed]
I'm trying to write a pretty heavy duty math-based project, which will parse through about 100MB+ data several times a day, so, I need a fast language that's pretty easy to use. I would have gone with C, but, getting a large project done in C is very difficult, especially with the low level programming getting in your way. So, I was about python or java. Both are well equiped with OO features, so I don't mind that. Now, here are my pros for choosing python:
- Very easy to use language
- Has a pretty large library of useful stuff
- Has an easy to use plotting library
Here are the cons:
- Not exactly blazing
- There isn't a native python neural network library that is active
- I can't close source my code without going through quite a bit of trouble
- Deploying python code on clients computers is hard to deal with, especially when clients are idiots.
Here are the pros for choosing Java:
- Huge library
- Well supported
- Easy to deploy
- Pretty fast, possibly even comparable to C++
- The Encog Neural Network Library is really active and pretty awesome
- Networking support is really good
- Strong typing
Here are the cons for Java:
- I can't find a good graphing library like matplotlib for python
- No built in support for big integers, that means another dependency (I mean REALLY big integers, not just math.BigInteger size)
- File IO is kind of awkward compared to Python
- Not a ton of array manipulating or "make programming easy" type of features that python has.
So, I was hoping you guys can tell me what to use. I'm equally familiar with both languages. Also, suggestions for other languages is great too.
EDIT: WOW! you guys are fast! 30 mins at 10 responses!
Java will usually be quicker to run (don't take this as an absolute truth), but slower to write.
Python is the opposite. Since libraries such as SciPy and NumPy already exist, which are built upon fast C code, I'd suggest going with Python if you prefer to go the "speedier" way in terms of code writing. Unless fundamental blocks for your application are missing in SciPy + NumPy, and those exist for Java.
Why not get the best of both worlds by taking advantage of multiple languages on the JVM:
- Write the performance intensive parts in Java (or use existing great Java libraries)
- Use Jython to write the user interface / application in Python and call the Java code when needed
NumPy usually put quite some kick into the computational force of Python. It's the defacto standard for any real number crunching in python. I don't have any real experience with Java in this field so Im not really qualified to answer this question for you.
Deploying python code on clients computers is hard to deal with, especially when clients are idiots I thinks this is a problem with Java too.
I can't find a good graphing library like matplotlib for python Have you tried JFreechart http://www.jfree.org/jfreechart/
Also, suggestions for other languages is great too I would suggest Groovy, it looks a bit like Python and is a JVM language that integrates well with Java.
You did not ask this directly, but I will recommend you the Apache Commons Math library for Math Java computations.
If those are the choices, then Java should be the faster for math intensive work. It is compiled (although yes it is still running byte code).
Exelian mentions NumPy. There's also the SciPy package. Both are worth looking at but only really seem to give speed improvements for work with lots of arrays and vector processing. When I tried using these with NLTK for a math-intensive routine, I found there wasn't that much of a speedup.
For math intensive work these days, I'd be using C/C++ or C# (personally I prefer C# over Java although that shouldn't affect your decision). My first employer out of univ. paid me to use Fortran for stuff that is almost certainly more math intensive than anything you're thinking of. Don't laugh - the Fortran compilers are some of the best for math processing on heavy iron.
What is more important for you?
If it's rapid application development, I found Python significantly easier to code for than Java - and I was just learning Python, while I had been coding on Java for years.
If it's application speed and the ability to reuse existing code, then you should probably stick with Java. It's reasonably fast and many research efforts at the moment use Java as their language of choice.
It seems Java can be really fast: http://blog.dhananjaynene.com/2008/07/performance-comparison-c-java-python-ruby-jython-jruby-groovy/
On the other hand Python is very good for doing math, and there's quite a lot of room for performance improvement if you use it correctly (I mean, with the right idioms/modules/builtin functions).
Edit: Suggestions for other languages: Haskell. It's very high level; writing it in "low level style" it can be very fast (can compare quite fairly with C) and its even better if you can make some use of its multithreading capabilities. However experience tells its never good learning to use new tools while they're needed in a project.
The Apache Commons Math picked up where JAMA left off. They are quite capable for scientific computing.
So is Python - NumPy and SciPy are excellent. I also like the fact that Python is a hybrid of object-orientation and functional programming. Functional programming is awfully handy for numerical methods.
I'd recommend using the one that you know best, but if the choice is a toss up I might lean towards Python.
精彩评论