开发者

Efficient iteration over 3D array?

I am using Python and Numpy to do some data analysis.

I have a large 3D matrix (NxNxN), where each cell is again a matrix, this time a 3x3 matrix. Calling the matrix data, it looks like this:

data[N,N,N,3,3]  

I need to find the eigenvalues of all the 3x3 matrices, and for that I use Numpy's eigvals routine, but it takes ages to do. Right now I pretty much do this:

for i in range(N):
    for j in range(N):
        for k in range(N):
            a = np.linalg.eigvals(data[i,j,k,:,:])

For N开发者_开发知识库 = 256, this takes about an hour. Any ideas on how to make this more efficient?

Many thanks for any suggestions!


itertools.product is nicer than nested loops, aesthetically speaking. But I don't think it will make your code that much faster. My testing suggests that iteration is not your bottleneck.

>>> bigdata = numpy.arange(256 * 256 * 256 * 3 * 3).reshape(256, 256, 256, 3, 3)
>>> %timeit numpy.linalg.eigvals(bigdata[100, 100, 100, :, :])
10000 loops, best of 3: 52.6 us per loop

So underestimating:

>>> .000052 * 256 * 256 * 256 / 60
14.540253866666665

That's 14 minutes minimum on my computer, which is pretty new. Let's see how long the loops take...

>>> def just_loops(N):
...     for i in xrange(N):
...         for j in xrange(N):
...             for k in xrange(N):
...                 pass
... 
>>> %timeit just_loops(256)
1 loops, best of 3: 350 ms per loop

Orders of magnitude smaller, as DSM said. Even the work of slicing the array alone is more substantial:

>>> def slice_loops(N, data):
...     for i in xrange(N):
...         for j in xrange(N):
...             for k in xrange(N):
...                 data[i, j, k, :, :]
... 
>>> %timeit slice_loops(256, bigdata)
1 loops, best of 3: 33.5 s per loop


I'm sure there is a good way to do this in NumPy, but in general, itertools.product is faster than nested loops over ranges.

from itertools import product

for i, j, k in product(xrange(N), xrange(N), xrange(N)):
    a = np.linalg.eigvals(data[i,j,k,:,:])


since all the calculation are independent, you can use multiprocessing module to speed up the calculation if you have a multi-core processor.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜