开发者

How does __iter__ work?

Despite reading up on it, I still dont quite understand how __iter__ works. What would be a simple explaination?

I开发者_运维知识库've seen def__iter__(self): return self. I don't see how this works or the steps on how this works.


As simply as I can put it:

__iter__ defines a method on a class which will return an iterator (an object that successively yields the next item contained by your object).

The iterator object that __iter__() returns can be pretty much any object, as long as it defines a next() method.

The next method will be called by statements like for ... in ... to yield the next item, and next() should raise the StopIteration exception when there are no more items.

What's great about this is it lets you define how your object is iterated, and __iter__ provides a common interface that every other python function knows how to work with.


An iterator needs to define two methods: __iter__() and __next__() (next() in python2). Usually, the object itself defines the __next__() or next() method, so it just returns itself as the iterator. This creates an iterable that is also itself an iterator. These methods are used by for and in statements.

  • Python 3 docs: docs.python.org/3/library/stdtypes.html#iterator-types

  • Python 2 docs: docs.python.org/2/library/stdtypes.html#iterator-types


The specs for def __iter__(self): are: it returns an iterator. So, if self is an iterator, return self is clearly appropriate.

"Being an iterator" means "having a __next__(self) method" (in Python 3; in Python 2, the name of the method in question is unfortunately plain next instead, clearly a name design glitch for a special method).

In Python 2.6 and higher, the best way to implement an iterator is generally to use the appropriate abstract base class from the collections standard library module -- in Python 2.6, the code might be (remember to call the method __next__ instead in Python 3):

import collections

class infinite23s(collections.Iterator):
  def next(self): return 23

an instance of this class will return infinitely many copies of 23 when iterated on (like itertools.repeat(23)) so the loop must be terminated otherwise. The point is that subclassing collections.Iterator adds the right __iter__ method on your behalf -- not a big deal here, but a good general principle (avoid repetitive, boilerplate code like iterators' standard one-line __iter__ -- in repetition, there's no added value and a lot of subtracted value!-).


A class supporting the __iter__ method will return an iterator object instance: an object supporting the next() method. This object will be usuable in the statements "for" and "in".


In Python, an iterator is any object that supports the iterator protocol. Part of that protocol is that the object must have an __iter__() method that returns the iterator object. I suppose this gives you some flexibility so that an object can pass on the iterator responsibilities to an internal class, or create some special object. In any case, the __iter__() method usually has only one line and that line is often simply return self

The other part of the protocol is the next() method, and this is where the real work is done. This method has to figure out or create or get the next thing, and return it. It may need to keep track of where it is so that the next time it is called, it really does return the next thing.

Once you have an object that returns the next thing in a sequence, you can collapse a for loop that looks like this:

myname = "Fredericus"
x = []
for i in [1,2,3,4,5,6,7,8,9,10]:
   x.append(myname[i-1])
   i = i + 1 # get the next i
print x

into this:

myname = "Fredericus"
x = [myname[i] for i in range(10)]
print x

Notice that there is nowhere where we have code that gets the next value of i because range(10) is an object that FOLLOWS the iterator protocol, and the list comprehension is a construct that USES the iterator protocol.

You can also USE the iterator protocol directly. For instance, when writing scripts to process CSV files, I often write this:

mydata = csv.reader(open('stuff.csv')
mydata.next()
for row in mydata:
    # do something with the row.

I am using the iterator directly by calling next() to skip the header row, then using it indirectly via the builtin in operator in the for statement.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜