Best way to handle list.index(might-not-exist) in python?
I have code which looks something like this:
thing_index = thing_list.index(thing)
otherfunction(thing_list, thing_index)
ok so that's simplified but you get the idea. Now thing
might not actually be in the list, in which case I want to pass -1 as thing_index
. In other languages this is what you'd expect index()
to return if it couldn't find the element. In fact it throws a ValueError
.
I could do this:
try:
thing_index = thing_list.index(thing)
except ValueError:
thing_index = -1
otherfunction(thing_list, thing_index)
But this feels dirty, plus I don't know if ValueError
could be raised for some other reason. I came up with the following solution based on generator functions, but it seems a little complex:
thing_index = ( [(i for i in xrange(len(thing_list)) if thing_list[i]==thing)] or [-1] )[0]
Is there a cleaner way to achieve the same thing? Let's assume the list 开发者_C百科isn't sorted.
There is nothing "dirty" about using try-except clause. This is the pythonic way. ValueError
will be raised by the .index
method only, because it's the only code you have there!
To answer the comment:
In Python, easier to ask forgiveness than to get permission philosophy is well established, and no index
will not raise this type of error for any other issues. Not that I can think of any.
thing_index = thing_list.index(elem) if elem in thing_list else -1
One line. Simple. No exceptions.
The dict
type has a get
function, where if the key doesn't exist in the dictionary, the 2nd argument to get
is the value that it should return. Similarly there is setdefault
, which returns the value in the dict
if the key exists, otherwise it sets the value according to your default parameter and then returns your default parameter.
You could extend the list
type to have a getindexdefault
method.
class SuperDuperList(list):
def getindexdefault(self, elem, default):
try:
thing_index = self.index(elem)
return thing_index
except ValueError:
return default
Which could then be used like:
mylist = SuperDuperList([0,1,2])
index = mylist.getindexdefault( 'asdf', -1 )
If you are doing this often then it is better to stow it away in a helper function:
def index_of(val, in_list):
try:
return in_list.index(val)
except ValueError:
return -1
There is nothing wrong with your code that uses ValueError
. Here's yet another one-liner if you'd like to avoid exceptions:
thing_index = next((i for i, x in enumerate(thing_list) if x == thing), -1)
What about this:
li = [1,2,3,4,5] # create list
li = dict(zip(li,range(len(li)))) # convert List To Dict
print( li ) # {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
li.get(20) # None
li.get(1) # 0
This issue is one of language philosophy. In Java for example there has always been a tradition that exceptions should really only be used in "exceptional circumstances" that is when errors have happened, rather than for flow control. In the beginning this was for performance reasons as Java exceptions were slow but now this has become the accepted style.
In contrast Python has always used exceptions to indicate normal program flow, like raising a ValueError
as we are discussing here. There is nothing "dirty" about this in Python style and there are many more where that came from. An even more common example is StopIteration
exception which is raised by an iterator‘s next()
method to signal that there are no further values.
What about this:
otherfunction(thing_collection, thing)
Rather than expose something so implementation-dependent like a list index in a function interface, pass the collection and the thing and let otherfunction deal with the "test for membership" issues. If otherfunction is written to be collection-type-agnostic, then it would probably start with:
if thing in thing_collection:
... proceed with operation on thing
which will work if thing_collection is a list, tuple, set, or dict.
This is possibly clearer than:
if thing_index != MAGIC_VALUE_INDICATING_NOT_A_MEMBER:
which is the code you already have in otherfunction.
I'd suggest:
if thing in thing_list:
list_index = -1
else:
list_index = thing_list.index(thing)
What about like this:
temp_inx = (L + [x]).index(x)
inx = temp_inx if temp_inx < len(L) else -1
It's been quite some time but it's a core part of the stdlib and has dozens of potential methods so I think it's useful to have some benchmarks for the different suggestions and include the numpy method which can be by far the fastest.
import random
from timeit import timeit
import numpy as np
l = [random.random() for i in range(10**4)]
l[10**4 - 100] = 5
# method 1
def fun1(l:list, x:int, e = -1) -> int:
return [[i for i,elem in enumerate(l) if elem == x] or [e]][0]
# method 2
def fun2(l:list, x:int, e = -1) -> int:
for i,elem in enumerate(l):
if elem == x:
return i
else:
return e
# method 3
def fun3(l:list, x:int, e = -1) -> int:
try:
idx = l.index(x)
except ValueError:
idx = e
return idx
# method 4
def fun4(l:list, x:int, e = -1) -> int:
return l.index(x) if x in l else e
l2 = np.array(l)
# method 5
def fun5(l:list or np.ndarray, x:int, e = -1) -> int:
res = np.where(np.equal(l, x))
if res[0].any():
return res[0][0]
else:
return e
if __name__ == "__main__":
print("Method 1:")
print(timeit(stmt = "fun1(l, 5)", number = 1000, globals = globals()))
print("")
print("Method 2:")
print(timeit(stmt = "fun2(l, 5)", number = 1000, globals = globals()))
print("")
print("Method 3:")
print(timeit(stmt = "fun3(l, 5)", number = 1000, globals = globals()))
print("")
print("Method 4:")
print(timeit(stmt = "fun4(l, 5)", number = 1000, globals = globals()))
print("")
print("Method 5, numpy given list:")
print(timeit(stmt = "fun5(l, 5)", number = 1000, globals = globals()))
print("")
print("Method 6, numpy given np.ndarray:")
print(timeit(stmt = "fun5(l2, 5)", number = 1000, globals = globals()))
print("")
When run as main, this results in the following printout on my machine indicating time in seconds to complete 1000 trials of each function:
Method 1: 0.7502102799990098
Method 2: 0.7291318440002215
Method 3: 0.24142152300009911
Method 4: 0.5253471979995084
Method 5, numpy given list: 0.5045417560013448
Method 6, numpy given np.ndarray: 0.011147511999297421
Of course the question asks specifically about lists so the best solution is to use the try-except method, however the speed improvements (at least 20x here compared to try-except) offered by using the numpy data structures and operators instead of python data structures is significant and if building something on many arrays of data that is performance critical then the author should try to use numpy throughout to take advantage of the superfast C bindings. (CPython interpreter, other interpreter performances may vary)
Btw, the reason Method 5 is much slower than Method 6 is because numpy first has to convert the given list to it's own numpy array, so giving it a list doesn't break it it just doesn't fully utilise the speed possible.
I'll be blunt: the answers here are very bad, and have insane time complexity.
Here's a simple way.
With dict().get('key', 'some_value')
, the value at 'key'
will be returned, and if the key is not in the dictionary, 'some_value'
will be returned.
You can create such a dictionary with your list and its indices.
mylist = ['cat' 'dog', 'bunny']
mapping = {value: index for index, value in enumerate(mylist)}
Then, mapping.get('key', 0)
will return the index if found, or None
.
mapping.get('penguin', 0) # returns 0
I have the same issue with the ".index()" method on lists. I have no issue with the fact that it throws an exception but I strongly disagree with the fact that it's a non-descriptive ValueError. I could understand if it would've been an IndexError, though.
I can see why returning "-1" would be an issue too because it's a valid index in Python. But realistically, I never expect a ".index()" method to return a negative number.
Here goes a one liner (ok, it's a rather long line ...), goes through the list exactly once and returns "None" if the item isn't found. It would be trivial to rewrite it to return -1, should you so desire.
indexOf = lambda list, thing: \
reduce(lambda acc, (idx, elem): \
idx if (acc is None) and elem == thing else acc, list, None)
How to use:
>>> indexOf([1,2,3], 4)
>>>
>>> indexOf([1,2,3], 1)
0
>>>
Comparison of implementations
Simple comparison on Python 3.8
TL;DR maybeidx2
is faster in general except for arrays (n<100) with lots of misses
def maybeidx1(l, v):
return l.index(v) if v in l else None
def maybeidx2(l, v):
try:
return l.index(v)
except ValueError:
return None
Test cases:
a = [*range(100_000)]
# Case 1: index in list
maybeidx1(a, 50_000)
Out[20]: 50000
maybeidx2(a, 50_000)
Out[21]: 50000
# Case 2: index not in list
maybeidx1(a, 100_000) is None
Out[23]: True
maybeidx2(a, 100_000) is None
Out[24]: True
Timing case 1
%timeit maybeidx1(a, 50_000)
1.06 ms ± 15.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit maybeidx2(a, 50_000)
530 µs ± 8.47 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Timing case 2
%timeit maybeidx1(a, 100_000)
1.07 ms ± 21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit maybeidx2(a, 100_000)
1.07 ms ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Result
Use the maybeidx2
method for larger arrays. This is faster due to maybeidx1
having two scans of the array in search of the value - this is still O(n) time but has a constant multiplier of 2 and thus slower in practice. This holds in the case of the value being present in the list. When the value is not present, these times will be roughly equal; they both have to scan the full array exactly once, then return None
. The overhead of the try-except
is negligible, even with an array size of 10 - unless case two occurs. Then the try-except
overhead is noticeable. Example:
a = [*range(10)]
%timeit maybeidx1(a, 10)
191 ns ± 2.61 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit maybeidx2(a, 10)
566 ns ± 5.93 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
This overhead becomes negligible (on my machine) when a
has above 100 elements.
Since Python 3.6, there are find
and rfind
methods for index
and rindex
which return -1
instead of throwing exception.
I don't know why you should think it is dirty... because of the exception? if you want a oneliner, here it is:
thing_index = thing_list.index(elem) if thing_list.count(elem) else -1
but i would advise against using it; I think Ross Rogers solution is the best, use an object to encapsulate your desiderd behaviour, don't try pushing the language to its limits at the cost of readability.
精彩评论