Python's [<generator expression>] at least 3x faster than list(<generator expression>)?
It appears that using [] around a generator expression (test1) behaves substantially better than putting it inside of list() (test2). The slowdown isn't there when I simply pass a list into list() for shallow copy (test3). Why is this?
Evidence:
from timeit import Timer
t1 = Timer("test1()", "from __main__ import test1")
t2 = Timer("test2()", "from __main__ import test2")
t3 = Timer("test3()", "from __main__ import test3")
x = [34开发者_StackOverflow社区534534, 23423523, 77645645, 345346]
def test1():
[e for e in x]
print t1.timeit()
#0.552290201187
def test2():
list(e for e in x)
print t2.timeit()
#2.38739395142
def test3():
list(x)
print t3.timeit()
#0.515818119049
Machine: 64 bit AMD, Ubuntu 8.04, Python 2.7 (r27:82500)
Well, my first step was to set the two tests up independently to ensure that this is not a result of e.g. the order in which the functions are defined.
>python -mtimeit "x=[34534534, 23423523, 77645645, 345346]" "[e for e in x]"
1000000 loops, best of 3: 0.638 usec per loop
>python -mtimeit "x=[34534534, 23423523, 77645645, 345346]" "list(e for e in x)"
1000000 loops, best of 3: 1.72 usec per loop
Sure enough, I can replicate this. OK, next step is to have a look at the bytecode to see what's actually going on:
>>> import dis
>>> x=[34534534, 23423523, 77645645, 345346]
>>> dis.dis(lambda: [e for e in x])
1 0 LOAD_CONST 0 (<code object <listcomp> at 0x0000000001F8B330, file "<stdin>", line 1>)
3 MAKE_FUNCTION 0
6 LOAD_GLOBAL 0 (x)
9 GET_ITER
10 CALL_FUNCTION 1
13 RETURN_VALUE
>>> dis.dis(lambda: list(e for e in x))
1 0 LOAD_GLOBAL 0 (list)
3 LOAD_CONST 0 (<code object <genexpr> at 0x0000000001F8B9B0, file "<stdin>", line 1>)
6 MAKE_FUNCTION 0
9 LOAD_GLOBAL 1 (x)
12 GET_ITER
13 CALL_FUNCTION 1
16 CALL_FUNCTION 1
19 RETURN_VALUE
Notice that the first method creates the list directly, whereas the second method creates a genexpr
object and passes that to the global list
. This is probably where the overhead lies.
Note also that the difference is approximately a microsecond i.e. utterly trivial.
Other interesting data
This still holds for non-trivial lists
>python -mtimeit "x=range(100000)" "[e for e in x]"
100 loops, best of 3: 8.51 msec per loop
>python -mtimeit "x=range(100000)" "list(e for e in x)"
100 loops, best of 3: 11.8 msec per loop
and for less trivial map functions:
>python -mtimeit "x=range(100000)" "[2*e for e in x]"
100 loops, best of 3: 12.8 msec per loop
>python -mtimeit "x=range(100000)" "list(2*e for e in x)"
100 loops, best of 3: 16.8 msec per loop
and (though less strongly) if we filter the list:
>python -mtimeit "x=range(100000)" "[e for e in x if e%2]"
100 loops, best of 3: 14 msec per loop
>python -mtimeit "x=range(100000)" "list(e for e in x if e%2)"
100 loops, best of 3: 16.5 msec per loop
list(e for e in x)
isn't a list comprehension, it's a genexpr
object (e for e in x)
being created and passed to the list
factory function. Presumably the object creation and method calls create overhead.
In python list
name must be looked up in the module and then in builtins. While you cannot change what a list comprehension means a list call must just be a standard lookup + function call as it could be redefined to be something else.
Looking at the vm code generated for a comprehension it can be seen that it is inlined while a call to list is a normal call.
>>> import dis
>>> def foo():
... [x for x in xrange(4)]
...
>>> dis.dis(foo)
2 0 BUILD_LIST 0
3 DUP_TOP
4 STORE_FAST 0 (_[1])
7 LOAD_GLOBAL 0 (xrange)
10 LOAD_CONST 1 (4)
13 CALL_FUNCTION 1
16 GET_ITER
>> 17 FOR_ITER 13 (to 33)
20 STORE_FAST 1 (x)
23 LOAD_FAST 0 (_[1])
26 LOAD_FAST 1 (x)
29 LIST_APPEND
30 JUMP_ABSOLUTE 17
>> 33 DELETE_FAST 0 (_[1])
36 POP_TOP
37 LOAD_CONST 0 (None)
40 RETURN_VALUE
>>> def bar():
... list(x for x in xrange(4))
...
>>> dis.dis(bar)
2 0 LOAD_GLOBAL 0 (list)
3 LOAD_CONST 1 (<code object <genexpr> at 0x7fd1230cf468, file "<stdin>", line 2>)
6 MAKE_FUNCTION 0
9 LOAD_GLOBAL 1 (xrange)
12 LOAD_CONST 2 (4)
15 CALL_FUNCTION 1
18 GET_ITER
19 CALL_FUNCTION 1
22 CALL_FUNCTION 1
25 POP_TOP
26 LOAD_CONST 0 (None)
29 RETURN_VALUE
Your test2 is roughly equivalent to:
def test2():
def local():
for i in x:
yield i
return list(local())
The call overhead explains the increased processing time.
精彩评论