Looking for the fastest way to find the exact overlap between two arrays of equal length in numpy

I am looking for the fastest way to find the exact, position-by-position overlap between two arrays in numpy. Given two arrays x and y:

x = array([1,0,3,0,5,0,7,4],dtype=int)
y = array([1,4,0,0,5,0,6,4],dtype=int)

What I want is an array of the same length that keeps a value only where x and y hold the same number at that position, and is 0 everywhere else:

array([1,0,0,0,5,0,0,4])

First I tried

x&y
array([1,0,0,0,5,0,6,4])

Then I realised that this is always true for two numbers if they are > 0.


result = numpy.where(x == y, x, 0)

See the numpy.where documentation for details. In short, numpy.where(a, b, c) takes a condition a and returns an array with the same shape as a, whose values come from b where the corresponding element of a is true and from c where it is false. b and c can be scalars.
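As a minimal sketch of the call above, using the arrays from the question (the `np` alias is just the usual import convention, not something the original post used):

```python
import numpy as np

x = np.array([1, 0, 3, 0, 5, 0, 7, 4])
y = np.array([1, 4, 0, 0, 5, 0, 6, 4])

# Keep the value from x wherever x and y agree at the same position,
# and write the scalar 0 everywhere else.
result = np.where(x == y, x, 0)
print(result)  # [1 0 0 0 5 0 0 4]
```

The scalar 0 is broadcast against the condition, so no array of zeros has to be built by hand.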

By the way, x & y is not necessarily "always true" for two positive numbers. It does bitwise-and for elements in x and y:

x = numpy.array([2**p for p in range(10)])  # range replaces Python 2's xrange
# x is [  1   2   4   8  16  32  64 128 256 512]
y = x - 1
# y is [  0   1   3   7  15  31  63 127 255 511]
x & y
# result: [0 0 0 0 0 0 0 0 0 0]

This is because the bitwise representation of each element in x is of the form 1 followed by n zeros, and the corresponding element in y is n 1s. In general, for two non-zero numbers a and b, a & b may equal zero, or non-zero but not necessarily equal to either a or b.
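To make the bit patterns visible, here is a small illustration with hand-picked values (8, 7, 6 and 5 are chosen only for this example):

```python
import numpy as np

# 8 is a 1 followed by three zeros; 7 is three 1s. No bit is shared.
print(np.binary_repr(8, width=4))  # 1000
print(np.binary_repr(7, width=4))  # 0111
zero_case = 8 & 7
print(zero_case)  # 0

# 6 is 110 and 5 is 101; only the top bit is shared, giving 100.
neither_case = 6 & 5
print(neither_case)  # 4, which equals neither 6 nor 5
```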


Using numpy.where is the most general solution, but in this particular case, and because it is a useful programming practice, you could use x == y as a mask:

mask = x==y  
# mask is  array([ True, False, False,  True,  True,  True, False,  True], dtype=bool)

xf = mask * x
# xf is array([1, 0, 0, 0, 5, 0, 0, 4])

or directly

xf = (x==y) * x

Now imagine some data X (e.g. 1D for sound, 2D for an image, 3D for a movie, etc.). The expression

(X<1) * -1. + (X>1) * 1.

returns -1. wherever the amplitude is below 1, 1. wherever it is above 1, and 0. wherever it equals 1 exactly.
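A short sketch of that mask arithmetic on made-up sample values (the contents of X are purely illustrative). The comparison with `np.sign` is only a cross-check, not something the original answer proposed:

```python
import numpy as np

X = np.array([0.2, 1.0, 3.5, -2.0])

# Each boolean mask is promoted to 0./1. when multiplied by a float.
out = (X < 1) * -1.0 + (X > 1) * 1.0
print(out)

# For a threshold of 1 this matches the sign of (X - 1),
# including the 0 produced where X is exactly 1.
check = np.sign(X - 1)
print(np.array_equal(out, check))  # True
```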


Try numpy.in1d. From the documentation:

Test whether each element of a 1D array is also present in a second array.

Returns a boolean array the same length as ar1 that is True where an element of ar1 is in ar2 and False otherwise.

Parameters

ar1 : array_like, shape (M,)
    Input array.
ar2 : array_like
    The values against which to test each value of ar1.
assume_unique : bool, optional
    If True, the input arrays are both assumed to be unique, which can speed up the calculation. Default is False.

Returns

mask : ndarray of bools, shape (M,)
    The values ar1[mask] are in ar2.

See Also

numpy.lib.arraysetops : Module with a number of other functions for performing set operations on arrays.

Notes

in1d can be considered as an element-wise function version of the python keyword in, for 1D sequences. in1d(a, b) is roughly equivalent to np.array([item in b for item in a]).

.. versionadded:: 1.4.0

Examples

test = np.array([0, 1, 2, 5, 0])
states = [0, 2]
mask = np.in1d(test, states)
mask
    array([ True, False,  True, False,  True], dtype=bool)
test[mask]
    array([0, 2, 0])
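One caveat worth checking before relying on this for the question above: a membership test asks whether a value occurs anywhere in the second array, not whether it sits at the same position. The sketch below uses np.isin, which performs the same membership test and which recent NumPy releases recommend over in1d; the small arrays are made up for illustration.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([3, 2, 1])

# Membership: every value of x occurs somewhere in y.
membership = np.isin(x, y)
print(membership)  # [ True  True  True]

# Position-wise equality: only the middle element matches.
positional = x == y
print(positional)  # [False  True False]

by_membership = x * membership
by_position = np.where(positional, x, 0)
print(by_membership)  # [1 2 3]
print(by_position)    # [0 2 0]
```

For the arrays in the question the two approaches happen to give the same result, but as the example shows they differ in general, so the position-wise comparison is the one that matches the stated goal.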