
Optimize a list comprehension

Here is a code snippet that shows the code I would like to optimize:

result = [(item, foo(item))
          for item in item_list
          if cond1(item) and cond2(foo(item))]

In the above snippet I call foo(item) twice. I can't think of a way to iterate over the list only once while keeping both item and foo(item) available for the conditional and for the result list.

That is, I would like to keep item and foo(item) without having to loop over the list twice and without having to call foo(item) twice.

I know I can do it with a second nested list comprehension:

result = [(item, foo_item)
          for item, foo_item in [(i, foo(i)) for i in item_list]
          if cond1(item) and cond2(foo_item)]

but that appears to loop through item_list twice which I would like to avoid.

So the first example calls foo twice per list item. The second example loops through the list twice (or appears to). I'd like to loop one time and call foo once for each item.


Like I've been repeatedly told here, the best thing in such cases is not to use a list comprehension at all:

result = []
for item in item_list:
    if cond1(item):
        value = foo(item)
        if cond2(value):
            result.append((item, value))

But I am stubborn, so let's see what I can come up with while keeping the comprehension. (Oh, wait -- I got your code all wrong. Still, unwrapping the comprehension and using an intermediate variable is the straightforward way to avoid repeating the call.)


It doesn't loop over item_list twice, but it does build a temporary list. Here is a version that avoids that:

result = [(item, foo_item)
    for item, foo_item in ((i, foo(i)) for i in item_list)
    if cond1(item) and cond2(foo_item)]

Turning the inner list comprehension into a generator expression makes sure that we don't use an unnecessary temporary list.
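To check that claim, here is a minimal sketch that records every call to foo. The bodies of foo, cond1, and cond2 are hypothetical stand-ins (the question does not define them); only the call count matters.

```python
# Hypothetical stand-ins for the question's foo, cond1 and cond2.
calls = []

def foo(x):
    calls.append(x)      # record every call so we can count them
    return x * x

def cond1(x):
    return x % 2 == 0

def cond2(y):
    return y > 10

item_list = [1, 2, 3, 4, 5, 6]

# The inner generator expression yields one (i, foo(i)) pair at a time,
# so no intermediate list is built.
result = [(item, foo_item)
          for item, foo_item in ((i, foo(i)) for i in item_list)
          if cond1(item) and cond2(foo_item)]

print(result)      # [(4, 16), (6, 36)]
print(len(calls))  # 6: foo ran once per item
```

Note that foo still runs for items that fail cond1, because the pair is built before either condition is tested.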


Use generator expressions.

result = [(item, foo_item)
          for item, foo_item in ((i, foo(i)) for i in item_list)
          if cond1(item) and cond2(foo_item)]

The interpreter will go through every element exactly once, because the generator expression computes (i, foo(i)) only when the outer loop asks for the next pair.

Assuming that foo is expensive and has no side effects, I'd even try to do this:

result = [(item, foo_item)
          for item, foo_item in ((i, foo(i)) for i in item_list if cond1(i))
          if cond2(foo_item)]

so that foo will not be called for elements which do not pass the first condition. Actually this looks better to me when written functionally:

# Python 3: the built-in map and filter are lazy (like Python 2's imap and ifilter)
result = list(filter(lambda pair: cond2(pair[1]),
                     map(lambda i: (i, foo(i)),
                         filter(cond1, item_list))))

...but I might be subjective.
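As a rough check of the "filter by cond1 first" idea, the sketch below records which items foo is actually called on. Again, the bodies of foo, cond1, and cond2 are made-up stand-ins, not the asker's functions.

```python
# Hypothetical stand-ins for the question's foo, cond1 and cond2.
calls = []

def foo(x):
    calls.append(x)      # remember which items reached foo
    return x * x

def cond1(x):
    return x % 2 == 0

def cond2(y):
    return y > 10

item_list = [1, 2, 3, 4, 5, 6]

# Creating the generator does not call foo yet; it runs lazily.
pairs = ((i, foo(i)) for i in item_list if cond1(i))
calls_before = list(calls)

result = [(item, foo_item) for item, foo_item in pairs if cond2(foo_item)]

print(calls_before)  # []
print(result)        # [(4, 16), (6, 36)]
print(calls)         # [2, 4, 6]: items failing cond1 never reach foo
```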


How does this look?

result = [ (i, fi) for i  in item_list if cond1(i)
                   for fi in (foo(i),) if cond2(fi) ]
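The second for clause iterates over the one-element tuple (foo(i),), which is simply a way to bind fi once per item. On Python 3.8 or later, an assignment expression (the := operator) can do the same binding. The sketch below compares the two forms; the function bodies are hypothetical stand-ins, and the := variant is an alternative not mentioned in the thread.

```python
# Hypothetical stand-ins for the question's foo, cond1 and cond2.
calls = []

def foo(x):
    calls.append(x)
    return x * x

def cond1(x):
    return x % 2 == 0

def cond2(y):
    return y > 10

item_list = [1, 2, 3, 4, 5, 6]

# Form from the answer: a one-element tuple binds fi for each item.
nested = [(i, fi) for i in item_list if cond1(i)
                  for fi in (foo(i),) if cond2(fi)]
calls_nested = len(calls)

calls.clear()

# Python 3.8+ only: bind fi inside the condition with :=.
# `and` short-circuits, so foo is skipped when cond1(i) is false.
walrus = [(i, fi) for i in item_list
          if cond1(i) and cond2(fi := foo(i))]
calls_walrus = len(calls)

print(nested == walrus)            # True
print(calls_nested, calls_walrus)  # 3 3
```

One difference worth knowing: with :=, fi is bound in the enclosing scope rather than being local to the comprehension.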


This is one of the many reasons that we have generators:

def generator( items ):
    for item in items:
        if cond1(item):
            food = foo(item)
            if cond2(food):
                yield item, food

result = list(generator(item_list))

List comprehensions are only good when they look good. If you have to spread one over three lines just to make it readable, it's a bad idea.
