Numpy equivalent of list.index

2023-02-13 14:11 Q&A author:

In a low-level function that is called many times, I need to do the equivalent of python's list.index, but with a numpy array. The function needs to return when it finds the first value, and raise ValueError otherwise. Something like:

>>> a = np.array([1, 2, 3])
>>> np_index(a, 1)
0
>>> np_index(a, 10)
Traceback (most recent call last):    
  File "<stdin>", line 1, in <module>
ValueError: 10 not in array

I want to avoid a Python loop if possible. np.where isn't an option as it always iterates through the entire array; I need something that stops once the first index is found.

EDIT: Some more specific information related to the problem.

About 90% of the time, the index I'm searching for is in the first 1/4 to 1/2 of the array. So there's potentially a factor of 2-4 speedup at stake here. The other 10% of the time the value is not in the array at all.
I've profiled things already, and the call to np.where is the bottleneck, taking up at least 50% of the total runtime.
It is not essential that it raise a ValueError; it just has to return something that obviously indicates that the value isn't in the array.

I will probably code up a solution in Cython, as suggested.

See my comment on the OP's question for caveats, but in general, I would do the following:

import numpy as np
a = np.array([1, 2, 3])
np.min(np.nonzero(a == 2)[0])

If the value you are looking for is not in the array, you'll get a ValueError such as:

ValueError: zero-size array to ufunc.reduce without identity

because you are trying to take the min value of an empty array.

I would profile this code and see if it is an actual bottleneck, because in general when numpy searches through an entire array using a built-in function rather than an explicit python loop, it is relatively fast. An insistence on halting the search when it finds the first value may be functionally irrelevant.
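As a rough sketch of how the nonzero approach above could be wrapped to match the interface in the question, the helper below raises a ValueError with a list.index-style message instead of the reduction error. The name np_index comes from the question; the body is an assumption and not the answerer's code. It still scans the whole array.

```python
import numpy as np

def np_index(a, value):
    """Return the index of the first element of 1-D array `a` equal to `value`.

    Sketch only: this compares every element, so it does not stop early.
    """
    hits = np.flatnonzero(a == value)  # indices of all matches, ascending
    if hits.size == 0:
        raise ValueError(f"{value} not in array")
    return int(hits[0])

a = np.array([1, 2, 3])
print(np_index(a, 1))  # 0
```

Calling np_index(a, 10) raises ValueError: 10 not in array.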

If your numpy array is 1-D, you could try converting it to a list:

import numpy as np
a = np.array([1, 2, 3])
print(a.tolist().index(2))  # prints 1

If it is not 1-D, you could search through the array row by row:

a = np.array([[1, 2, 3], [2, 5, 6], [0, 0, 2]])
print(a[0, :].tolist().index(2))  # prints 1

print(a[1, :].tolist().index(2))  # prints 0

print(a[2, :].tolist().index(2))  # prints 2
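The row-by-row search above can also be done for all rows at once without converting to lists. This is a minimal sketch; the function name first_index_per_row and the choice of -1 as the "not found" marker are assumptions made for illustration.

```python
import numpy as np

def first_index_per_row(a, value):
    """For each row of 2-D array `a`, return the column of the first match, or -1."""
    mask = (a == value)
    first = mask.argmax(axis=1)        # position of the first True in each row (0 if none)
    found = mask.any(axis=1)           # rows that contain the value at all
    return np.where(found, first, -1)  # replace the misleading 0 with -1 where absent

a = np.array([[1, 2, 3], [2, 5, 6], [0, 0, 2]])
print(first_index_per_row(a, 2))  # [1 0 2]
```

Like the list version, this examines every element, so it trades early exit for a single vectorized pass.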

The closest thing I could find to what you're asking for is nonzero. It may sound odd, but the documentation makes it look like it might have the desired result.

http://www.scipy.org/Numpy_Example_List_With_Doc#nonzero

Specifically this part:

a.nonzero()

Return the indices of the elements that are non-zero.

Refer to numpy.nonzero for full documentation.

See Also

numpy.nonzero : equivalent function

>>> from numpy import *
>>> y = array([1,3,5,7])
>>> indices = (y >= 5).nonzero()
>>> y[indices]
array([5, 7])
>>> nonzero(y)                                # function also exists
(array([0, 1, 2, 3]),)

Where (http://www.scipy.org/Numpy_Example_List_With_Doc#where) may also be of interest to you.

You can code it in Cython and just import from a Python script. There is no need to migrate your entire project into Cython.

# paste into: indexing.pyx
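# long[:] is a typed memoryview: it accepts only 1-D arrays whose dtype matches C long
def index(long[:] lst, long value):
    cdef Py_ssize_t i
    # plain C loop; returns as soon as the first match is found
    for i in range(len(lst)):
        if lst[i] == value:
            return i
    raise ValueError("%d not in array" % value)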

# import in your .py code
import pyximport
pyximport.install()
from indexing import index

# example
from numpy import zeros
a = zeros(10**6, int)
a[-1] = 1

index(a, 1)
Wall time: 6.07 ms
999999

index(a, 0)
Wall time: 38.1 µs
0
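If compiling Cython is not an option, a partial early exit is possible in plain NumPy by scanning the array in blocks and stopping at the first block that contains a match. This is only a sketch: the function name chunked_index and the chunk size of 4096 are assumptions, and the best chunk size would need to be measured on real data.

```python
import numpy as np

def chunked_index(a, value, chunk=4096):
    """Index of the first element of 1-D array `a` equal to `value`.

    Scans `chunk` elements at a time and stops at the first block with a match.
    """
    for start in range(0, a.size, chunk):
        block = a[start:start + chunk]         # a view, no copy
        hits = np.flatnonzero(block == value)  # matches inside this block only
        if hits.size:
            return start + int(hits[0])
    raise ValueError(f"{value} not in array")

a = np.zeros(10**6, dtype=np.int64)
a[-1] = 1
print(chunked_index(a, 1))  # 999999
print(chunked_index(a, 0))  # 0
```

The Python loop runs once per block rather than once per element, so its overhead stays small while the search can still stop early when the value is near the front.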

NumPy's searchsorted is very similar to list's index, except that it requires a sorted array and behaves more numerically. The big differences are that you don't need an exact match, and you can search from either the left or the right side. See the following examples to get an idea of how it works:

import numpy as np
a = np.array([10, 20, 30])

a.searchsorted(-99) == a.searchsorted(0) == a.searchsorted(10)
# returns index 0 for value 10

a.searchsorted(20.1) == a.searchsorted(29.9) == a.searchsorted(30)
# returns index 2 for value 30

a.searchsorted(30.1) == a.searchsorted(99) == a.searchsorted(np.nan)
# returns index 3 for undefined value

With the last case, where an index of 3 is returned (one past the end of the array), you can handle this as you like. searchsorted uses a binary search, so it never scans the whole array, but the array must already be sorted.
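Because searchsorted returns an insertion point rather than an exact match, getting list.index behaviour out of it takes one extra check. A minimal sketch, assuming the array is 1-D and sorted in ascending order; the name sorted_index is made up for this example.

```python
import numpy as np

def sorted_index(a, value):
    """Index of the first element equal to `value` in sorted 1-D array `a`."""
    i = int(np.searchsorted(a, value))  # default side='left': leftmost insertion point
    if i < a.size and a[i] == value:    # confirm an exact match at that position
        return i
    raise ValueError(f"{value} not in array")

a = np.array([10, 20, 30])
print(sorted_index(a, 20))  # 1
```

Values that fall between elements (such as 25) or past the end (such as 99) raise ValueError instead of returning an insertion point.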

The only time I've had this problem, it was sufficient to cast the numpy array as a list:

import numpy
a = numpy.arange(3)
print(list(a).index(2))  # prints 2
