Python if statement efficiency
A friend (fellow low skill level recreational python scripter) asked me to look over some code. I noticed that he had 7 separate statements that basically said.
if ( a and b and c):
do something
the statements a,b,c all tested their equality or lack of to set values. As I looked at it I found that because of the nature of the tests, I could re-write the whole logic block into 2 branches that never went more than 3 deep and rarely got past the first level (making the most rare occurrence test out first).
if a:
if b:
if c:
else:
if c:
else:
if b:
if c:
else:
if c:
To me, logically it seems like it should be faster if you are making less, simpler tests that fail faster and move on. My real questions are
1) When I say if and else, should the if be true, does the else get completely ignored?
2) In theory would
if (a and b and c)
take as much time as the three separate if statement开发者_开发百科s would?
I would say the single test is as fast as the separate tests. Python also makes use of so called short-circuit evaluation.
That means for (a and b and c)
, that b
or c
would not be tested anymore if a
is false
.
Similar, if you have an OR
expression (a or b)
and a
is true
, b
is never evaluated.
So to sum up, the clauses don't fail faster with separation.
if
statements will skip everything in an else
bracket if it evaluates to true. It should be noted that worrying about this sort of problem, unless it's done millions of times per program execution, is called "premature optimization" and should be avoided. If your code is clearer with three if (a and b and c)
statements, they should be left in.
Code:
import dis
def foo():
if ( a and b and c):
pass
else:
pass
def bar():
if a:
if b:
if c:
pass
print 'foo():'
dis.dis(foo)
print 'bar():'
dis.dis(bar)
Output:
foo():
4 0 LOAD_GLOBAL 0 (a)
3 JUMP_IF_FALSE 18 (to 24)
6 POP_TOP
7 LOAD_GLOBAL 1 (b)
10 JUMP_IF_FALSE 11 (to 24)
13 POP_TOP
14 LOAD_GLOBAL 2 (c)
17 JUMP_IF_FALSE 4 (to 24)
20 POP_TOP
5 21 JUMP_FORWARD 1 (to 25)
>> 24 POP_TOP
7 >> 25 LOAD_CONST 0 (None)
28 RETURN_VALUE
bar():
10 0 LOAD_GLOBAL 0 (a)
3 JUMP_IF_FALSE 26 (to 32)
6 POP_TOP
11 7 LOAD_GLOBAL 1 (b)
10 JUMP_IF_FALSE 15 (to 28)
13 POP_TOP
12 14 LOAD_GLOBAL 2 (c)
17 JUMP_IF_FALSE 4 (to 24)
20 POP_TOP
13 21 JUMP_ABSOLUTE 29
>> 24 POP_TOP
25 JUMP_ABSOLUTE 33
>> 28 POP_TOP
>> 29 JUMP_FORWARD 1 (to 33)
>> 32 POP_TOP
>> 33 LOAD_CONST 0 (None)
36 RETURN_VALUE
So, although the setup is the same, the cleanup for the combined expression is faster since it leaves only a single value on the stack.
At least in python, efficiency is second to readability and "Flat is better than nested".
See The Zen of Python
If you are worried about b or c being functions that are called instead of just variables that are evaluated, then this code shows that short-circuiting is your friend:
a = False
def b():
print "b was called"
return True
if a and b():
print "this shouldn't happen"
else:
print "if b was not called, then short-circuiting works"
prints
if b was not called, then short-circuiting works
But if you have code that does this:
a = call_to_expensive_function_A()
b = call_to_expensive_function_B()
c = call_to_expensive_function_C()
if a and b and c:
do something...
then your code is still calling all 3 expensive functions. Better to let Python be Python:
if (call_to_expensive_function_A() and
call_to_expensive_function_B() and
call_to_expensive_function_C())
do something...
which will only call as many expensive functions as necessary to determine the overall condition.
Edit
You can generalize this using the all
built-in:
# note, this is a list of the functions themselves
# the functions are *not* called when creating this list
funcs = [function_A, function_B, function_C]
if all(fn() for fn in funcs):
do something
Now if you have to add other functions, or want to reorder them (maybe function_A
is very time-consuming, and you would benefit by filtering cases that fail function_B
or function_C
first), you just update the funcs
list. all
does short-circuiting just as if you had spelled out the if as if a and b and c
. (If functions are 'or'ed together, use any
builtin instead.)
I doubt you'd see a measurable difference so I'd recommend doing whatever makes the code most readable.
if (a and b and c)
will fail if a
is falsy, and not bother checking b
or c
.
That said, I personally feel that nested conditionals are easier to read than 2^n combinations of conditionals.
In general, if you want to determine which way of doing something is fastest, you can write a simple benchmark using timeit.
if (a and b and c)
is faster and better, for the sake of the "Real Programmer's code optimization" and code readability.
Only to say that the construct
if a:
if b:
if c:
else: #note A
if c:
else:
if b:
if c:
else: #note A
if c:
is not well constructed. in "(a and b and c)" construct the interpreter (and most lang compiled code) simply ignore the conditions following a false value, so the "#note A" else branch shouldn't exists. Writing (a and b and c) do it implicitly.
精彩评论