How to improve speed of odeint in Python?
I am using Python and odeint from the SciPy package to solve a large number (~10e6) of coupled ODEs. The system of equations can be formulated as a sum of matrix multiplications, and I use NumPy with BLAS support for this. My problem is that it takes a very long time. When I profile the code, I see that most of the time is spent inside odeint doing something other than evaluating the rhs. These are the five most time-consuming calls from the profiler:
ncalls tottime percall cumtime percall filename:lineno(function)
5 1547.915 309.583 1588.170 317.634 {scipy.integrate._odepack.odeint}
60597 11.535 0.000 23.751 0.000 terms3D.py:5(two_body_evolution)
121194 11.242 0.000 11.242 0.000 {numpy.core._dotblas.dot}
60597 10.145 0.000 15.460 0.000 generator.py:13(Gs2)
121203 3.615 0.000 3.615 0.000 {method 'repeat' of 'numpy.ndarray' objects}
The rhs consists basically of two_body_evolution and Gs2. This profile is for ~7000 coupled ODEs, and here is the same thing for ~4000:
ncalls tottime percall cumtime percall filename:lineno(function)
5 259.427 51.885 273.316 54.663 {scipy.integrate._odepack.odeint}
30832 3.809 0.000 7.864 0.000 terms3D.py:5(two_body_evolution)
61664 3.650 0.000 3.650 0.000 {numpy.core._dotblas.dot}
30832 3.464 0.000 5.637 0.000 generator.py:13(Gs2)
61673 1.280 0.000 1.280 0.000 {method 'repeat' of 'numpy.ndarray' objects}
So my main problem here is that the "hidden" time in odeint scales horribly with the number of equations. Do you have any ideas why that is and how to improve the performance?
Thank you for your time
Oscar Åkerlund
This is at least one possible source of the extra time: if you do not supply a Jacobian to odeint (i.e. to LSODA), it will compute one via finite differences. Moreover, it may try to invert the Jacobian, which scales as O(m^3), if it decides the problem is stiff. Both of these steps are expensive when the number of variables is large.
You can try to reduce the time taken by these operations by forcing odeint to use a banded Jacobian, passing suitable values for the ml and mu parameters to the routine. You don't need to supply a Dfun; these parameters also apply to the Jacobian computed by finite differences.
Ten million equations is a non-trivial number.
Why do you say it "scales horribly"? Matrix addition is O(m^2) for m x m matrices, and multiplication is O(m^3). You cite two points of wall time versus number of equations/degrees of freedom, but two points can only describe a straight line. I'd pick a couple of intermediate sizes between 4K and 10M and see how the timings grow. Fit the results to a 3rd-order polynomial of wall time versus DOF; that'll tell you how things scale.
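One way to do that fit: put the measurements on log-log axes, where wall = C * dof^k becomes a straight line with slope k, and read the exponent off np.polyfit. The timing numbers below are entirely made up for illustration; substitute your own measurements:

```python
import numpy as np

# Hypothetical wall-time measurements (seconds) at a few problem sizes.
# Replace these with timings from your own runs.
dof = np.array([4000.0, 7000.0, 12000.0, 20000.0, 35000.0])
wall = np.array([273.0, 1588.0, 7800.0, 35000.0, 180000.0])

# On log-log axes, wall ~ C * dof^k is a line whose slope is k.
slope, intercept = np.polyfit(np.log(dof), np.log(wall), 1)
print("apparent scaling exponent: %.2f" % slope)
```

An exponent near 3 would point at the dense O(m^3) linear algebra inside the stiff solver; an exponent near 2 would point at the matrix-vector work in the rhs.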
Are your equations linear or non-linear? Static or transient? Depending on the type of problem, you might be able to play with some other parameters such as timestep size, convergence criteria, integration scheme choice, etc.