What are the advantages of "yield item" vs return iter(items)?

In the examples below, resp.results is an iterator.

Version 1:

items = []
for result in resp.results:
    item = process(result)
    items.append(item)
return iter(items)

Version 2:

for result in resp.results:
    yield process(result)

Is returning iter(items) in Version 1 any better/worse in terms of performance/memory savings than simply returning items?

In the "Python Cookbook," Alex says the explicit iter() is "more flexible but less often used," but what are the pros/cons of returning iter(items) vs yield as in Version 2?

Also, what are the best ways to unit test an iterator and/or a function that uses yield? You can't call len(results) to check the size, as you could with a list.


It's easy to turn an iterator or generator back into a list if you need it:

results = [item for item in iterator]

Or as kindly pointed out in the comments, an even simpler method:

results = list(iterator)
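
For the unit-testing part of the question, one possible approach is to materialize the generator with list() inside the test and then assert on the length and contents as usual. The sketch below is only an illustration: process() is a hypothetical stand-in that doubles its input, and processed() is Version 2 from the question wrapped in a function so it can be called.

```python
import unittest


def process(result):
    # Hypothetical stand-in for the question's process(); assumed here to double its input.
    return result * 2


def processed(results):
    # Version 2 from the question, wrapped in a function.
    for result in results:
        yield process(result)


class ProcessedTest(unittest.TestCase):
    def test_yields_every_item(self):
        # Turn the generator into a list so len() and equality checks work.
        items = list(processed(iter([1, 2, 3])))
        self.assertEqual(len(items), 3)
        self.assertEqual(items, [2, 4, 6])

    def test_is_exhausted_after_one_pass(self):
        gen = processed(iter([1]))
        self.assertEqual(next(gen), 2)
        # A second next() on a finished generator raises StopIteration.
        with self.assertRaises(StopIteration):
            next(gen)
```

Running the file through python -m unittest picks up both tests. Note that list() consumes the generator, so build a fresh one for each assertion that needs the full sequence.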


Version 1 computes and stores every result up front, while Version 2 is lazy: each result is computed only when it is requested. That is, Version 1 creates and stores a list of N items, while Version 2 creates and stores nothing until you begin iterating.
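
The difference between Version 1 and Version 2 can be observed by counting how often process() runs. This is a minimal sketch under assumptions: process() is a hypothetical stand-in that records each call, and version1/version2 are the question's two snippets wrapped in functions.

```python
calls = []


def process(result):
    # Hypothetical stand-in: records each call so we can see when work happens.
    calls.append(result)
    return result * 10


def version1(results):
    # Eager: processes everything, then returns an iterator over the finished list.
    items = []
    for result in results:
        items.append(process(result))
    return iter(items)


def version2(results):
    # Lazy: nothing runs until the caller asks for an item.
    for result in results:
        yield process(result)


eager = version1(iter([1, 2, 3]))
calls_after_eager = len(calls)          # all three items already processed

calls.clear()
lazy = version2(iter([1, 2, 3]))
calls_after_lazy_creation = len(calls)  # no items processed yet
first = next(lazy)
calls_after_first_next = len(calls)     # exactly one item processed
```

Wrapping the finished list in iter() does not save any memory or time compared with returning the list itself, because the full list has already been built by then.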

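A better way of thinking about this is imap (from itertools in Python 2; the built-in map in Python 3), which does much the same as the yield loop but gives you a ready-made lazy iterator instead of requiring you to write a generator function. (ifilter would not work here: it keeps the original results for which process returns a true value, rather than returning the processed values.)

 imap(process, resp.results)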

In my experience, the itertools iterators generally run faster than hand-written generators in the 2.x series, but I have not been able to verify any such saving in the 3.x series.


When you are processing a very large list, yield item is better because only one processed item needs to be in memory at a time, rather than the whole list of results.

See this excellent presentation on generators: http://www.dabeaz.com/generators/Generators.pdf


You can create infinite iterators, but not infinite lists:

def fibGen():
    f0, f1 = 0, 1
    while True:
        yield f0
        f0, f1 = f1, f0+f1
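
To take a finite slice of an infinite generator like this one, itertools.islice is one option. The sketch below repeats fibGen so that it runs on its own; the name first_ten is only illustrative.

```python
from itertools import islice


def fibGen():
    f0, f1 = 0, 1
    while True:
        yield f0
        f0, f1 = f1, f0 + f1


# islice stops after 10 items, so the infinite loop is never run to completion.
first_ten = list(islice(fibGen(), 10))
# first_ten == [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Calling list(fibGen()) directly would never return, which is exactly why this sequence cannot be expressed as the eager Version 1.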


The pro and the con of Version 1 are the same thing: all the results are calculated up front. This is useful if the time between retrieving successive items is crucial, but it won't do if the iterable is infinite or if space is a concern.
