averaging matrix efficiently

2023-01-02 03:18 Q&A author:

In Python, given an n x p matrix, e.g. 4 x 4, how can I return a 4 x 2 matrix that simply averages the first two columns and the last two columns for each of the 4 rows?

e.g. given:

a = array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])

Return a matrix that has the average of a[:, 0] and a[:, 1] and the average of a[:, 2] and a[:, 3]. I want this to work for an arbitrary n x p matrix, assuming that p is evenly divisible by the number of columns I am averaging over.

Let me clarify: for each row, I want to take the average of the first two columns, then the average of the last two columns. So it would be:

(1 + 2) / 2, (3 + 4) / 2 <- row 1 of new matrix
(5 + 6) / 2, (7 + 8) / 2 <- row 2 of new matrix, etc.

which should yield a 4 by 2 matrix rather than 4 x 4.

Thanks.

How about using some math? You can define a matrix M = [[0.5,0],[0.5,0],[0,0.5],[0,0.5]] so that A*M is what you want.

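import numpy as np

A = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12],
              [13, 14, 15, 16]])
# Each column of M averages one pair of adjacent columns of A.
M = np.array([[0.5, 0],
              [0.5, 0],
              [0, 0.5],
              [0, 0.5]])
# @ is the matrix product; on plain arrays * would be elementwise.
print(A @ M)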

Generating M is pretty simple too: each entry is either 1/k, where k is the number of columns averaged per group, or zero.
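One way to build M for other sizes is a Kronecker product. This is only a sketch: the helper name averaging_matrix and the parameter k (columns per group) are made up for illustration and are not part of the answer above.

```python
import numpy as np

def averaging_matrix(p, k):
    """Return a (p, p // k) matrix that averages each run of k adjacent columns."""
    if p % k != 0:
        raise ValueError("p must be divisible by k")
    # The identity selects the group; the (k, 1) column of 1/k does the averaging.
    return np.kron(np.eye(p // k), np.full((k, 1), 1.0 / k))

A = np.arange(1, 17).reshape(4, 4)   # same values as the example matrix
M = averaging_matrix(A.shape[1], 2)
result = A @ M                       # shape (4, 2)
```

M is mostly zeros, so for wide matrices the product does a lot of wasted multiplications; the reshape-based approaches are likely cheaper there.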

reshape - get mean - reshape

>>> a.reshape(-1, a.shape[1]//2).mean(1).reshape(a.shape[0],-1)
array([[  1.5,   3.5],
       [  5.5,   7.5],
       [  9.5,  11.5],
       [ 13.5,  15.5]])

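This is supposed to work for any array with an even number of columns: it averages the first half and the second half of each row. reshape returns a view rather than a copy when the array is contiguous, as it is here.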
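If what you want is fixed-size groups of k adjacent columns rather than two halves, a three-dimensional reshape is one option. A minimal sketch, assuming k divides the number of columns; k and the 4 x 6 test array are just for illustration.

```python
import numpy as np

a = np.arange(1, 25).reshape(4, 6)   # 4 rows, 6 columns
k = 2                                # hypothetical group size

# View each row as (number of groups, k), then average over the last axis.
means = a.reshape(a.shape[0], -1, k).mean(axis=2)
print(means.shape)  # (4, 3)
```

reshape raises a ValueError if the columns cannot be split into groups of k, so a bad k fails loudly instead of giving wrong averages.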

It's a bit unclear what should happen for matrices with n > 4, but this code will do what you want:

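import numpy as N
a = N.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]], dtype=float)
# Average columns 0-1 and columns 2-3 for each row, then stack the results as columns.
avg = N.vstack((N.average(a[:,0:2], axis=1), N.average(a[:,2:4], axis=1))).T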

This yields avg =

array([[  1.5,   3.5],
       [  5.5,   7.5],
       [  9.5,  11.5],
       [ 13.5,  15.5]])
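For more than four columns, the hard-coded slices could be replaced by np.add.reduceat, which sums runs of columns between given start indices. This is a sketch under the assumption that every group has the same size k and that k divides the column count; it is a different technique from the slicing above, not a rewrite of it.

```python
import numpy as np

a = np.arange(1, 17, dtype=float).reshape(4, 4)
k = 2  # hypothetical group size; must divide the number of columns

# Sum each run of k columns starting at 0, k, 2k, ... and divide by k.
starts = np.arange(0, a.shape[1], k)
avg = np.add.reduceat(a, starts, axis=1) / k
```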

Here's a way to do it. You only need to change groupsize to make it work with other sizes like you said, though I'm not fully sure what you want.

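import numpy as np
groupsize = 2
# Split a into blocks of groupsize columns, average each block per row, and rejoin.
out = np.hstack([np.mean(x, axis=1, keepdims=True) for x in np.hsplit(a, a.shape[1] // groupsize)])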

yields

array([[  1.5,   3.5],
       [  5.5,   7.5],
       [  9.5,  11.5],
       [ 13.5,  15.5]])

for out. Hopefully it gives you some ideas on how to do exactly what it is that you want to do. You can make groupsize dependent on the dimensions of a for instance.
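As a rough illustration of tying the split to the dimensions of a, here is a small wrapper that takes the number of output columns and checks divisibility first. The function name column_group_means and its n_groups parameter are hypothetical.

```python
import numpy as np

def column_group_means(a, n_groups):
    """Average equal-width blocks of columns, giving n_groups output columns."""
    a = np.asarray(a, dtype=float)
    if a.shape[1] % n_groups != 0:
        raise ValueError("number of columns must be divisible by n_groups")
    # np.hsplit with an integer cuts the columns into n_groups equal blocks.
    blocks = np.hsplit(a, n_groups)
    # keepdims=True keeps each block mean as a column so hstack can join them.
    return np.hstack([b.mean(axis=1, keepdims=True) for b in blocks])

a = np.arange(1, 13).reshape(2, 6)
out = column_group_means(a, 3)   # 3 groups of 2 columns each
```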
