Efficient math operations on parts of "sparse" numpy arrays

2023-02-16 12:41 问答作者：

I have the following challenge in a simulation for my PhD thesis:

I need to optimize the following code:

开发者_如何学Gorepelling_forces = repelling_force_prefactor * np.exp(-(height_r_t/potential_steepness))

In this code snippet 'height_r_t' is a real Numpy array and 'potential_steepness' is an scalar. 'repelling_force_prefactor' is also a Numpy array, which is mostly ZERO, but ONE at pre-calculated position, which do NOT change during runtime (i.e. a Mask). Obviously the code is inefficient as it would make much more sense to only calculate the exponential function at the positions, where 'repelling_force_prefactor' is non-zero.

The question is how do I do this in the most efficient manner?

The only idea I have up to now would be to define slice to 'height_r_t' using 'repelling_force_prefactor' and apply 'np.exp' to those slices. However, I have made the experience that slicing is slow (not sure if this is generally correct) and the solution seems awkward.

Just as a side-note the ration of 1's to 0's in 'repelling_force_prefactor' is about 1/1000 and I am running this in loop, so efficiency is very important. (Comment: I wouldn't have a problem with resorting to Cython, as I will need/want to learn it at some point anyway... but I am a novice, so I'd need a good pointer/explanation.)

masked arrays are implemented exactly for your purposes.

Performance is the same as Sven's answer:

height_r_t = np.ma.masked_where(repelling_force_prefactor == 0, height_r_t)
repelling_forces = np.ma.exp(-(height_r_t/potential_steepness))

the advantage of masked arrays is that you do not have to slice and expand your array, the size is always the same, but numpy automatically knows not to compute the exp where the array is masked.

Also, you can sum array with different masks and the resulting array has the intersection of the masks.

Slicing is probably much faster than computing all the exponentials. Instead of using the mask repelling_force_prefactor for slicing directly, I suggest to precompute the indices where it is non-zero and use them for slicing:

# before the loop
indices = np.nonzero(repelling_force_prefactor)

# inside the loop
repelling_forces = np.exp(-(height_r_t[indices]/potential_steepness))

Now repelling_forces will contain only the results that are non-zero. If you have to update some array of the original shape of height_r_t with this values, you can use slicing with indices again, or use np.put() or a similar function.

Slicing with the list of indices will be more efficient than slicing with a boolean mask in this case, since the list of indices is shorter by a factor thousand. Actually measuring the performance is of course up to you.

继续阅读：arrays mask numpy python slice

Efficient math operations on parts of "sparse" numpy arrays

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？