
Python if statement efficiency

A friend (a fellow low-skill, recreational Python scripter) asked me to look over some code. I noticed that he had 7 separate statements that basically said:

if ( a and b and c):
    do something

The conditions a, b, and c each tested equality (or inequality) against set values. As I looked at it, I found that because of the nature of the tests, I could rewrite the whole logic block into 2 branches that never went more than 3 levels deep and rarely got past the first level (by testing for the rarest occurrence first).

if a:
    if b:
        if c:
            pass  # do something
    else:
        if c:
            pass  # do something
else:
    if b:
        if c:
            pass  # do something
    else:
        if c:
            pass  # do something

To me, logically, it seems like it should be faster if you are making fewer, simpler tests that fail faster and move on. My real questions are:

1) When I write an if and an else, and the if condition is true, does the else get completely ignored?

2) In theory would

if (a and b and c)

take as much time as the three separate if statements would?


I would say the single test is as fast as the separate tests. Python also makes use of so-called short-circuit evaluation.

That means that for (a and b and c), b and c are not evaluated at all if a is false.

Similarly, if you have an OR expression (a or b) and a is true, b is never evaluated.

So to sum up, the clauses don't fail faster with separation.
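
The short-circuit behaviour described above can be observed directly. This is only a minimal sketch: check and the calls list are hypothetical helpers, introduced here purely to record which operands actually get evaluated.

```python
calls = []  # hypothetical log of which operands were actually evaluated

def check(name, value):
    """Record that the operand was evaluated, then return its value."""
    calls.append(name)
    return value

# `and` stops at the first falsy operand: b and c are never evaluated.
calls.clear()
and_result = bool(check("a", False) and check("b", True) and check("c", True))
and_calls = list(calls)

# `or` stops at the first truthy operand: b is never evaluated.
calls.clear()
or_result = bool(check("a", True) or check("b", False))
or_calls = list(calls)

print(and_result, and_calls)  # False ['a']
print(or_result, or_calls)    # True ['a']
```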


An if statement skips everything in its else block when its condition evaluates to true. It should be noted that worrying about this sort of problem, unless the code runs millions of times per program execution, is called "premature optimization" and should be avoided. If your code is clearer with three if (a and b and c) statements, they should be left in.


Code:

import dis

def foo():
  if ( a and b and c):
    pass
  else:
    pass

def bar():
  if a:
    if b:
      if c:
        pass

print('foo():')
dis.dis(foo)
print('bar():')
dis.dis(bar)

Output:

foo():
  4           0 LOAD_GLOBAL              0 (a)
              3 JUMP_IF_FALSE           18 (to 24)
              6 POP_TOP             
              7 LOAD_GLOBAL              1 (b)
             10 JUMP_IF_FALSE           11 (to 24)
             13 POP_TOP             
             14 LOAD_GLOBAL              2 (c)
             17 JUMP_IF_FALSE            4 (to 24)
             20 POP_TOP             

  5          21 JUMP_FORWARD             1 (to 25)
        >>   24 POP_TOP             

  7     >>   25 LOAD_CONST               0 (None)
             28 RETURN_VALUE        
bar():
 10           0 LOAD_GLOBAL              0 (a)
              3 JUMP_IF_FALSE           26 (to 32)
              6 POP_TOP             

 11           7 LOAD_GLOBAL              1 (b)
             10 JUMP_IF_FALSE           15 (to 28)
             13 POP_TOP             

 12          14 LOAD_GLOBAL              2 (c)
             17 JUMP_IF_FALSE            4 (to 24)
             20 POP_TOP             

 13          21 JUMP_ABSOLUTE           29
        >>   24 POP_TOP             
             25 JUMP_ABSOLUTE           33
        >>   28 POP_TOP             
        >>   29 JUMP_FORWARD             1 (to 33)
        >>   32 POP_TOP             
        >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE        

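So, although the setup is the same, the cleanup for the combined expression is cheaper: every failed test jumps to the same single POP_TOP, whereas the nested version needs a separate POP_TOP for each level and, in most cases, an extra jump.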
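
The listing above comes from a Python 2 interpreter, and opcode names and offsets change between versions. The following is a hedged sketch for repeating the comparison on Python 3. The functions return values instead of using pass so they can also be called, the globals a, b, c are placeholders, and count_instructions is a hypothetical helper built on dis.get_instructions. The counts it prints depend on the interpreter version, so none are shown here.

```python
import dis

a = b = c = True  # placeholder globals so the functions can also be called

def foo():
    # Combined condition.
    if a and b and c:
        return "yes"
    return "no"

def bar():
    # Nested conditions with the same meaning.
    if a:
        if b:
            if c:
                return "yes"
    return "no"

def count_instructions(func):
    """Hypothetical helper: number of bytecode instructions in func."""
    return sum(1 for _ in dis.get_instructions(func))

foo_count = count_instructions(foo)
bar_count = count_instructions(bar)
print("foo:", foo_count, "instructions")
print("bar:", bar_count, "instructions")

# For the full listing, call dis.dis(foo) and dis.dis(bar).
```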


At least in Python, efficiency comes second to readability, and "Flat is better than nested".

See The Zen of Python


If you are worried about b or c being functions that are called instead of just variables that are evaluated, then this code shows that short-circuiting is your friend:

a = False
def b():
    print("b was called")
    return True

if a and b():
    print("this shouldn't happen")
else:
    print("if b was not called, then short-circuiting works")

prints

if b was not called, then short-circuiting works

But if you have code that does this:

a = call_to_expensive_function_A()
b = call_to_expensive_function_B()
c = call_to_expensive_function_C()

if a and b and c:
    do something...

then your code is still calling all 3 expensive functions. Better to let Python be Python:

if (call_to_expensive_function_A() and
    call_to_expensive_function_B() and
    call_to_expensive_function_C()):
    do something...

which will only call as many expensive functions as necessary to determine the overall condition.
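
To make the difference concrete, here is a small sketch that counts the calls in both styles. The three expensive_* functions and call_log are hypothetical stand-ins, and expensive_a is assumed to return False so that the later calls are unnecessary.

```python
call_log = []  # records which of the stand-in functions actually ran

def expensive_a():
    call_log.append("A")
    return False  # assumed to fail, so nothing after it should be needed

def expensive_b():
    call_log.append("B")
    return True

def expensive_c():
    call_log.append("C")
    return True

# Eager: every function runs before the condition is even looked at.
call_log.clear()
a, b, c = expensive_a(), expensive_b(), expensive_c()
eager_result = bool(a and b and c)
eager_calls = list(call_log)

# Lazy: the calls sit inside the condition, so `and` can stop early.
call_log.clear()
lazy_result = bool(expensive_a() and expensive_b() and expensive_c())
lazy_calls = list(call_log)

print(eager_calls)  # ['A', 'B', 'C']
print(lazy_calls)   # ['A']
```

Both forms reach the same result. Only the number of calls differs.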

Edit

You can generalize this using the all built-in:

# note, this is a list of the functions themselves
# the functions are *not* called when creating this list
funcs = [function_A, function_B, function_C]

if all(fn() for fn in funcs):
    do something

Now if you have to add other functions, or want to reorder them (maybe function_A is very time-consuming, and you would benefit by first filtering out cases that fail function_B or function_C), you just update the funcs list. all short-circuits just as if you had spelled out the condition as if a and b and c. (If the functions are 'or'ed together, use the any built-in instead.)
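
One detail worth checking is that the short-circuiting relies on passing all a generator expression. If you build a list first, every function runs before all sees any result. The sketch below illustrates this, with make_check and the ran list as hypothetical helpers standing in for the real functions.

```python
ran = []  # hypothetical log of the functions that were actually called

def make_check(name, result):
    """Build a stand-in check that logs its name and returns `result`."""
    def check():
        ran.append(name)
        return result
    return check

funcs = [make_check("A", True), make_check("B", False), make_check("C", True)]

# Generator expression: all() stops at the first falsy result (B).
ran.clear()
lazy = all(fn() for fn in funcs)
lazy_ran = list(ran)

# List comprehension: every function runs before all() sees anything.
ran.clear()
eager = all([fn() for fn in funcs])
eager_ran = list(ran)

# any() is the 'or' counterpart and stops at the first truthy result (A).
ran.clear()
found = any(fn() for fn in funcs)
any_ran = list(ran)

print(lazy, lazy_ran)    # False ['A', 'B']
print(eager, eager_ran)  # False ['A', 'B', 'C']
print(found, any_ran)    # True ['A']
```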


I doubt you'd see a measurable difference so I'd recommend doing whatever makes the code most readable.


if (a and b and c) will fail if a is falsy, and not bother checking b or c.

That said, I personally feel that nested conditionals are easier to read than 2^n combinations of conditionals.

In general, if you want to determine which way of doing something is fastest, you can write a simple benchmark using timeit.
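
A benchmark along those lines might look like the sketch below. The functions flat and nested are hypothetical stand-ins for the two styles in the question. The absolute timings depend on the machine and interpreter, so only the comparison between the two is meaningful.

```python
import timeit

def flat(a, b, c):
    # Single combined condition.
    if a and b and c:
        return True
    return False

def nested(a, b, c):
    # Same logic written as nested ifs.
    if a:
        if b:
            if c:
                return True
    return False

# Time both forms on the same inputs: one where every test passes,
# and one where the first test already fails.
for args in [(True, True, True), (False, True, True)]:
    t_flat = timeit.timeit(lambda: flat(*args), number=100_000)
    t_nested = timeit.timeit(lambda: nested(*args), number=100_000)
    print(args, "flat: %.4fs" % t_flat, "nested: %.4fs" % t_nested)
```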


if (a and b and c) is faster and better, for the sake of the "Real Programmer's code optimization" and code readability.


I only want to point out that the construct

if a:
  if b:
    if c:
      pass
  else:    #note A
    if c:
      pass
else:
  if b:
    if c:
      pass
  else:    #note A
    if c:
      pass

is not well constructed. With "(a and b and c)", the interpreter (and the compiled code of most languages) simply ignores the conditions that follow a false value, so the "#note A" else branches shouldn't exist. Writing (a and b and c) does this implicitly.
