mask a 2D numpy array based on values in one column
Suppose I have the following numpy array:
a = [[1, 5, 6],
[2, 4, 1],
[3, 1, 5]]
I want to mask all the rows which have 1 in the first column. That is, I want
[[--, --, --],
[2, 4, 1],
[3, 1, 5]]
Is this possible to do using numpy masked array operations? How can one do it?
Thanks.
import numpy as np
a = np.array([[1, 5, 6],
[2, 4, 1],
[3, 1, 5]])
# Using ones_like(a.T) (rather than ones_like(a)) keeps this working for non-square arrays.
np.ma.MaskedArray(a, mask=(np.ones_like(a.T)*(a[:,0]==1)).T)
# Returns:
masked_array(data =
[[-- -- --]
[2 4 1]
[3 1 5]],
mask =
[[ True True True]
[False False False]
[False False False]])
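As an alternative that is not part of the answer above, the rows can also be masked by assigning the np.ma.masked constant through a boolean row index. This is only a minimal sketch on the array from the question; the name row_hits is made up for illustration.

```python
import numpy as np

a = np.array([[1, 5, 6],
              [2, 4, 1],
              [3, 1, 5]])

# Boolean vector with one entry per row: True where the first column is 1.
row_hits = a[:, 0] == 1

# Wrap the data in a masked array, then mask the selected rows in place.
masked_a = np.ma.array(a)
masked_a[row_hits] = np.ma.masked

print(masked_a)
```

Only the mask changes here; the values stored in a are left as they were.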
You can create the desired mask by
mask = numpy.repeat(a[:,0]==1, a.shape[1]).reshape(a.shape)
and the masked array by
masked_a = numpy.ma.array(a, mask=numpy.repeat(a[:,0]==1, a.shape[1]))
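A quick way to check the numpy.repeat approach is to run it on a non-square array. The sketch below uses a made-up 3x4 array and reshapes the repeated values explicitly, since numpy.repeat on a 1D input returns a flat array; the names flat_mask and mask are illustrative only.

```python
import numpy as np

# Made-up 3x4 example: rows 0 and 2 have 1 in the first column.
a = np.array([[1, 5, 6, 7],
              [2, 4, 1, 0],
              [1, 1, 5, 9]])

# Each row's True/False is repeated once per column, giving a flat array.
flat_mask = np.repeat(a[:, 0] == 1, a.shape[1])

# Reshape to the data's shape so the mask is explicitly 2D.
mask = flat_mask.reshape(a.shape)

masked_a = np.ma.array(a, mask=mask)
print(masked_a)
```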
You could simply create an empty mask and then use NumPy broadcasting (as @eumiro showed), but with the element-wise (bitwise) "or" operator |:
>>> a = np.array([[1, 5, 6], [2, 4, 1], [3, 1, 5]])
>>> mask = np.zeros(a.shape, bool) | (a[:, 0] == 1)[:, None]
>>> np.ma.array(a, mask=mask)
masked_array(data =
[[-- -- --]
[2 4 1]
[3 1 5]],
mask =
[[ True True True]
[False False False]
[False False False]],
fill_value = 999999)
A bit further explanation:
>>> # select first column
>>> a[:, 0]
array([1, 2, 3])
>>> # where the first column is 1
>>> a[:, 0] == 1
array([ True, False, False], dtype=bool)
>>> # added dimension so that it correctly broadcasts to the empty mask
>>> (a[:, 0] == 1)[:, None]
array([[ True],
[False],
[False]], dtype=bool)
>>> # create the final mask
>>> np.zeros(a.shape, bool) | (a[:, 0] == 1)[:, None]
array([[ True, True, True],
[False, False, False],
[False, False, False]], dtype=bool)
One further advantage of this approach is that it avoids potentially expensive multiplications and np.repeat, so it should be quite fast.
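The broadcasting steps above can be wrapped in a small function. This is just a sketch: mask_rows_where and its parameters are hypothetical names, not part of NumPy, and the 2x4 array is made up to show that nothing depends on the array being square.

```python
import numpy as np

def mask_rows_where(a, column, value):
    """Hypothetical helper: mask every row whose entry in `column` equals `value`."""
    # One boolean per row, shape (n_rows,).
    row_hits = a[:, column] == value
    # Add a trailing axis so the row vector broadcasts across all columns.
    mask = np.zeros(a.shape, dtype=bool) | row_hits[:, None]
    return np.ma.array(a, mask=mask)

b = np.array([[1, 5, 6, 7],
              [2, 4, 1, 0]])

masked_b = mask_rows_where(b, column=0, value=1)
print(masked_b)
```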