Efficient math operations on parts of "sparse" numpy arrays
I have the following challenge in a simulation for my PhD thesis:
I need to optimize the following code:
repelling_forces = repelling_force_prefactor * np.exp(-(height_r_t/potential_steepness))
In this code snippet, 'height_r_t' is a real NumPy array and 'potential_steepness' is a scalar. 'repelling_force_prefactor' is also a NumPy array, which is mostly ZERO but ONE at pre-calculated positions that do NOT change during runtime (i.e. a mask). Obviously the code is inefficient, as it would make much more sense to calculate the exponential function only at the positions where 'repelling_force_prefactor' is non-zero.
The question is how do I do this in the most efficient manner?
The only idea I have so far is to define slices of 'height_r_t' using 'repelling_force_prefactor' and apply 'np.exp' to those slices. However, in my experience slicing is slow (not sure if this is generally true), and the solution seems awkward.
Just as a side note: the ratio of 1's to 0's in 'repelling_force_prefactor' is about 1/1000, and I am running this in a loop, so efficiency is very important. (Comment: I wouldn't have a problem with resorting to Cython, as I will need/want to learn it at some point anyway... but I am a novice, so I'd need a good pointer/explanation.)
Masked arrays are implemented exactly for your purposes.
Performance is the same as Sven's answer:
height_r_t = np.ma.masked_where(repelling_force_prefactor == 0, height_r_t)
repelling_forces = np.ma.exp(-(height_r_t/potential_steepness))
The advantage of masked arrays is that you do not have to slice and expand your array: the shape always stays the same, and the masked entries are simply excluded from the result.
Also, you can sum arrays with different masks; an entry of the result is masked wherever it is masked in either operand, so only the entries that are valid in all operands remain.
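To make the masked-array approach concrete, here is a minimal sketch. The small arrays and the `other` array are made-up stand-ins for illustration, not the real simulation data; `filled(0.0)` is just one way to get back a plain array with zeros at the masked positions.

```python
import numpy as np

# Hypothetical stand-ins for the arrays in the question.
potential_steepness = 2.0
height_r_t = np.array([0.0, 2.0, 4.0, 6.0])
repelling_force_prefactor = np.array([0.0, 1.0, 0.0, 1.0])

# Mask every position where the prefactor is zero.
masked_height = np.ma.masked_where(repelling_force_prefactor == 0, height_r_t)
masked_forces = np.ma.exp(-(masked_height / potential_steepness))

# Convert back to a plain array with zeros at the masked positions.
repelling_forces = masked_forces.filled(0.0)

# Adding an array with a different mask (hypothetical second array):
# the result is masked wherever either operand is masked.
other = np.ma.masked_array(np.ones(4), mask=[False, True, False, False])
total = masked_forces + other
combined_mask = np.ma.getmaskarray(total)
```

Whether this actually saves time compared with the plain expression depends on your array sizes and NumPy version, so it is worth timing on your own data.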
Slicing is probably much faster than computing all the exponentials. Instead of using the mask repelling_force_prefactor for slicing directly, I suggest precomputing the indices where it is non-zero and using them for slicing:
# before the loop
indices = np.nonzero(repelling_force_prefactor)
# inside the loop
repelling_forces = np.exp(-(height_r_t[indices]/potential_steepness))
Now repelling_forces will contain only the results that are non-zero. If you have to update some array of the original shape of height_r_t with these values, you can use slicing with indices again, or use np.put() or a similar function.
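As a rough sketch of writing the results back into a full-shape array, assuming a small made-up 2-D example (the shapes, values and the names `full_forces` and `full_forces_put` are hypothetical):

```python
import numpy as np

# Hypothetical 2-D stand-ins for the arrays in the question.
potential_steepness = 2.0
repelling_force_prefactor = np.zeros((3, 4))
repelling_force_prefactor[0, 1] = 1.0
repelling_force_prefactor[2, 3] = 1.0
height_r_t = np.arange(12, dtype=float).reshape(3, 4)

# before the loop
indices = np.nonzero(repelling_force_prefactor)           # tuple of index arrays
flat_indices = np.flatnonzero(repelling_force_prefactor)  # flat indices for np.put
full_forces = np.zeros_like(height_r_t)
full_forces_put = np.zeros_like(height_r_t)

# inside the loop, variant 1: index assignment with the precomputed indices
full_forces[indices] = np.exp(-(height_r_t[indices] / potential_steepness))

# inside the loop, variant 2: np.take / np.put operate on flat indices
values = np.exp(-(np.take(height_r_t, flat_indices) / potential_steepness))
np.put(full_forces_put, flat_indices, values)
```

Since the output array is allocated once before the loop and the zero entries never change, only the non-zero positions are touched in each iteration.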
Slicing with the list of indices will be more efficient than slicing with a boolean mask in this case, since the list of indices is shorter by a factor of a thousand. Actually measuring the performance is of course up to you.
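If you want to measure it, something along these lines should do. This is only a sketch: the array size, the random data and the 1/1000 density are assumptions chosen to mimic the question, and the function names are made up.

```python
import timeit
import numpy as np

# Hypothetical test data: about one non-zero prefactor per 1000 entries.
rng = np.random.default_rng(0)
n = 1_000_000
potential_steepness = 2.0
height_r_t = rng.random(n)
bool_mask = rng.random(n) < 0.001
repelling_force_prefactor = bool_mask.astype(float)
indices = np.nonzero(repelling_force_prefactor)  # precomputed once

def full_array():
    # Original approach: exponential of every element.
    return repelling_force_prefactor * np.exp(-(height_r_t / potential_steepness))

def boolean_mask():
    # Indexing with the boolean mask.
    return np.exp(-(height_r_t[bool_mask] / potential_steepness))

def integer_indices():
    # Indexing with the precomputed integer indices.
    return np.exp(-(height_r_t[indices] / potential_steepness))

timings = {
    f.__name__: timeit.timeit(f, number=20)
    for f in (full_array, boolean_mask, integer_indices)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 20 calls")
```

The absolute numbers will depend on your machine and NumPy version, so compare the three timings relative to each other on your real array sizes.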