How do I get a list of indices of non-zero elements in a list?
I have a list that will always contain only ones and zeroes. I need to get a list of the non-zero indices of the list:
a = [0, 1, 0, 1, 0, 0, 0, 0]
b = []
for i in range(len(a)):
    if a[i] == 1:
        b.append(i)
print(b)
What would be the 'pythonic' way of achieving this?
[i for i, e in enumerate(a) if e != 0]
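For a quick check, the comprehension can be wrapped in a small helper. This is only a sketch: the function name nonzero_indices is made up for illustration and does not come from the answer above.

```python
def nonzero_indices(values):
    """Return the positions of all non-zero elements in values."""
    # enumerate pairs each element with its index; keep the index when the element is non-zero
    return [i for i, e in enumerate(values) if e != 0]


a = [0, 1, 0, 1, 0, 0, 0, 0]
print(nonzero_indices(a))  # [1, 3]
```

Since the list only ever holds ones and zeroes, the condition could also be written as `if e`, relying on 0 being falsy.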
Not really a "new" answer but numpy has this built in as well.
import numpy as np
a = [0, 1, 0, 1, 0, 0, 0, 0]
nonzeroind = np.nonzero(a)[0] # the return is a little funny so I use the [0]
print(nonzeroind)
[1 3]
Since THC4k mentioned compress (available in Python 2.7+), here is how it can be used:
>>> from itertools import compress, count
>>> x = [0, 1, 0, 1, 0, 0, 0, 0]
>>> compress(count(), x)
<itertools.compress object at 0x8c3666c>
>>> list(_)
[1, 3]
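To see why this works: count() yields 0, 1, 2, ... and compress keeps only the values whose matching selector in x is truthy. A rough sketch of the same idea spelled out with zip (the variable names here are illustrative only):

```python
from itertools import compress, count

x = [0, 1, 0, 1, 0, 0, 0, 0]

# compress pairs each counter value with the selector at the same position
# and stops as soon as x is exhausted, so the infinite count() is safe here
via_compress = list(compress(count(), x))

# Roughly the same thing written out by hand
via_zip = [index for index, flag in zip(count(), x) if flag]

print(via_compress)  # [1, 3]
print(via_zip)       # [1, 3]
```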
Just wanted to add an explanation for the 'funny' output from the previous answer. The result is a tuple containing one array of indices for each dimension of the input. Here the input is one-dimensional (a vector in numpy terms), so the output is a tuple with a single element.
import numpy as np
a = [0, 1, 0, 1, 0, 0, 0, 0]
nonzeroind = np.nonzero(a)
print(nonzeroind)
(array([1, 3]),)
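The tuple shape makes more sense with a two-dimensional input. A minimal sketch, using a made-up 2x3 matrix m that is not part of the original question:

```python
import numpy as np

# Hypothetical 2-D example: non-zero entries at (0, 1), (1, 0) and (1, 2)
m = np.array([[0, 1, 0],
              [1, 0, 1]])

# One index array per dimension: row indices first, then column indices
rows, cols = np.nonzero(m)
print(rows)  # [0 1 1]
print(cols)  # [1 0 2]

# Zip the two arrays to get (row, column) pairs
pairs = list(zip(rows.tolist(), cols.tolist()))
print(pairs)  # [(0, 1), (1, 0), (1, 2)]
```

For a one-dimensional input there is only one such array, which is why the [0] is needed to unwrap it.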
Timing comparison of the two approaches (list comprehension vs. np.nonzero) for different list lengths:
import random
import numpy as np
a = [int(random.random()>0.5) for i in range(10)]
%timeit [i for i, e in enumerate(a) if e != 0]
683 ns ± 14 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit np.nonzero(a)[0]
4.43 µs ± 102 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
a = [int(random.random()>0.5) for i in range(1000)]
%timeit [i for i, e in enumerate(a) if e != 0]
53.1 µs ± 2.02 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.nonzero(a)[0]
73.8 µs ± 2.71 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
a = [int(random.random()>0.5) for i in range(100000)]
%timeit [i for i, e in enumerate(a) if e != 0]
5.86 ms ± 79.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit np.nonzero(a)[0]
6.61 ms ± 14.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
With a list length of 100000, changing the proportion of ones in the list:
a = [int(random.random()>0.1) for i in range(100000)]
%timeit [i for i, e in enumerate(a) if e != 0]
6.45 ms ± 28.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit np.nonzero(a)[0]
5.74 ms ± 9.25 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
a = [int(random.random()>0.9) for i in range(100000)]
%timeit [i for i, e in enumerate(a) if e != 0]
4.69 ms ± 12.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit np.nonzero(a)[0]
5.74 ms ± 6.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The proportion of ones affects only the first option (the list comprehension); the np.nonzero() timing stays about the same. np.nonzero() is the better choice when most elements are non-zero. If the length is less than 10000, the first option is faster.
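The timings above pass a plain Python list to np.nonzero(), so numpy has to convert the list to an array on every call. A sketch for re-running the comparison outside IPython, with an extra case for data that is already a numpy array (the function names and the pre-converted arr are additions for illustration; actual numbers will vary by machine and are not claimed here):

```python
import random
import timeit

import numpy as np

random.seed(0)  # fixed seed so repeated runs use the same data
a = [int(random.random() > 0.5) for _ in range(100000)]
arr = np.array(a)  # assumption: the data may already live in a numpy array


def with_comprehension():
    return [i for i, e in enumerate(a) if e != 0]


def with_numpy_on_list():
    # includes the list-to-array conversion on every call
    return np.nonzero(a)[0]


def with_numpy_on_array():
    # no conversion needed, the input is already an ndarray
    return np.nonzero(arr)[0]


for func in (with_comprehension, with_numpy_on_list, with_numpy_on_array):
    runs = 20
    seconds = timeit.timeit(func, number=runs) / runs
    print(f"{func.__name__}: {seconds * 1e3:.3f} ms per call")
```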