Looking for the fastest way to find the exact overlap between two arrays of equal length in numpy
I am looking for the fastest way to find the exact overlap between two arrays in numpy. Given two arrays x and y:
import numpy
x = numpy.array([1,0,3,0,5,0,7,4], dtype=int)
y = numpy.array([1,4,0,0,5,0,6,4], dtype=int)
What I want is an array of the same length that keeps only the values that are equal at the same position in both vectors, with zeros everywhere else:
array([1,0,0,0,5,0,0,4])
First I tried
x&y
array([1,0,0,0,5,0,6,4])
Then I realised that this is always true for two numbers if they are > 0.
result = numpy.where(x == y, x, 0)
Have a look at the numpy.where documentation for an explanation. Basically, numpy.where(a, b, c), for a condition a, returns an array with the shape of a, with values taken from b or c depending on whether the corresponding element of a is true or not. b or c can be scalars.
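As a minimal sketch of the call described above, applied to the arrays from the question (the name fallback is only introduced here for illustration):

```python
import numpy as np

x = np.array([1, 0, 3, 0, 5, 0, 7, 4])
y = np.array([1, 4, 0, 0, 5, 0, 6, 4])

# Condition: element-wise equality. Keep x where it holds, use the scalar 0 elsewhere.
result = np.where(x == y, x, 0)
print(result.tolist())  # [1, 0, 0, 0, 5, 0, 0, 4]

# The third argument can also be an array of the same shape instead of a scalar.
fallback = np.full_like(x, -1)  # hypothetical per-position fallback values
result_with_fallback = np.where(x == y, x, fallback)
print(result_with_fallback.tolist())  # [1, -1, -1, 0, 5, 0, -1, 4]
```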
By the way, x & y is not necessarily "always true" for two positive numbers. It does a bitwise AND on the elements of x and y:
x = numpy.array([2**p for p in range(10)])
# x is [ 1 2 4 8 16 32 64 128 256 512]
y = x - 1
# y is [ 0 1 3 7 15 31 63 127 255 511]
x & y
# result: [0 0 0 0 0 0 0 0 0 0]
This is because the bitwise representation of each element in x is of the form 1 followed by n zeros, and the corresponding element in y is n 1s. In general, for two non-zero numbers a and b, a & b may equal zero, or be non-zero but not necessarily equal to either a or b.
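To make the three possible outcomes concrete, here is a small sketch with values chosen purely for illustration (the last pair, 7 and 6, is the one that produced the stray 6 in the question):

```python
import numpy as np

a = np.array([8, 6, 7])
b = np.array([7, 3, 6])

# 8 & 7 -> 0b1000 & 0b0111 = 0b0000  (zero, although both operands are positive)
# 6 & 3 -> 0b0110 & 0b0011 = 0b0010  (non-zero, equal to neither operand)
# 7 & 6 -> 0b0111 & 0b0110 = 0b0110  (happens to equal the second operand)
bitwise = a & b
print(bitwise.tolist())  # [0, 2, 6]
```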
Using numpy.where is the most general solution, but in this particular case, and because it is a useful programming practice, you could use x==y as a mask:
mask = x==y
# mask is array([ True, False, False, True, True, True, False, True], dtype=bool)
xf = mask * x
# xf is array([1, 0, 0, 0, 5, 0, 0, 4])
or directly
xf = (x==y) * x
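Since the question asks for the fastest option, one rough way to compare the mask product with numpy.where on your own data is sketched below. The helper names overlap_where and overlap_mask, the array size, and the repeat count are all assumptions made for this example; the timings depend on the machine and the NumPy version, so no particular winner is claimed here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 5, size=100_000)
y = rng.integers(0, 5, size=100_000)

def overlap_where(x, y):
    # Hypothetical helper: select x where equal, 0 elsewhere.
    return np.where(x == y, x, 0)

def overlap_mask(x, y):
    # Hypothetical helper: multiply by the boolean mask (True -> 1, False -> 0).
    return (x == y) * x

# Both approaches should agree before their speed is compared.
same = np.array_equal(overlap_where(x, y), overlap_mask(x, y))

t_where = timeit.timeit(lambda: overlap_where(x, y), number=200)
t_mask = timeit.timeit(lambda: overlap_mask(x, y), number=200)
print(f"results equal: {same}  where: {t_where:.4f}s  mask: {t_mask:.4f}s")
```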
The same masking idea extends to other data. Imagine some data X (e.g. 1D for sound, 2D for an image, 3D for a movie, etc.):
(X<1) * -1. + (X>1) * 1.
returns data with the value -1. where the amplitude is below 1 and 1. where it is above 1 (and 0. where it is exactly 1).
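A small sketch of that expression on made-up sample values (the arrays X and X2 below are assumptions for illustration), showing that it works unchanged for 1D and 2D input:

```python
import numpy as np

X = np.array([0.5, 1.0, 2.0, -3.0])

# Each comparison yields a boolean mask; multiplying by a float turns
# True into that float and False into zero, and the two terms are summed.
signed = (X < 1) * -1. + (X > 1) * 1.
print(signed.tolist())  # [-1.0, 0.0, 1.0, -1.0]

# The same expression applies to any number of dimensions.
X2 = X.reshape(2, 2)
signed2 = (X2 < 1) * -1. + (X2 > 1) * 1.
print(signed2.tolist())  # [[-1.0, 0.0], [1.0, -1.0]]
```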
Try numpy.in1d. From the documentation:
Test whether each element of a 1D array is also present in a second array.
Returns a boolean array the same length as ar1 that is True where an element of ar1 is in ar2 and False otherwise.
Parameters
ar1 : array_like, shape (M,)
Input array.
ar2 : array_like
The values against which to test each value of ar1.
assume_unique : bool, optional
If True, the input arrays are both assumed to be unique, which
can speed up the calculation. Default is False.
Returns
mask : ndarray of bools, shape(M,)
The values ar1[mask] are in ar2.
See Also
numpy.lib.arraysetops : Module with a number of other functions for performing set operations on arrays.
Notes
in1d can be considered as an element-wise function version of the python keyword in, for 1D sequences. in1d(a, b) is roughly equivalent to np.array([item in b for item in a]).
.. versionadded:: 1.4.0
Examples
>>> test = np.array([0, 1, 2, 5, 0])
>>> states = [0, 2]
>>> mask = np.in1d(test, states)
>>> mask
array([ True, False, True, False, True], dtype=bool)
>>> test[mask]
array([0, 2, 0])
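Note that this is a membership test: it ignores position, so it answers a different question from x == y. The sketch below contrasts the two on made-up arrays. It uses np.isin, which the NumPy documentation recommends over in1d for new code; that choice is an assumption about the installed version (np.isin needs NumPy 1.13 or later).

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([3, 2, 1])

# Membership: is each element of a present anywhere in b?
membership = np.isin(a, b)
print(membership.tolist())  # [True, True, True]

# Positional equality: is a[i] equal to b[i]?
positional = a == b
print(positional.tolist())  # [False, True, False]

# Only the positional mask gives the "exact overlap" asked for in the question.
print(np.where(positional, a, 0).tolist())  # [0, 2, 0]
```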