User-Defined Iterators

In the __iter__ scheme, classes implement user-defined iterators by implementing the iteration protocol introduced in Chapters 14 and 20 (see those chapters for more background on iterators). For example, the following file, iters.py, defines an iterator class that generates squares:

class Squares:
    def __init__(self, start, stop):    # Save state when created
        self.value = start - 1
        self.stop  = stop
    def __iter__(self):                 # Get iterator object on iter
        return self
    def __next__(self):                 # Return a square on each iteration
        if self.value == self.stop:     # Also called by next built-in
            raise StopIteration
        self.value += 1
        return self.value ** 2

% python
>>> from iters import Squares
>>> for i in Squares(1, 5):             # for calls iter, which calls __iter__
...     print(i, end=' ')               # Each iteration calls __next__
...
1 4 9 16 25


Here, the iterator object is simply the instance self, because the __next__ method is part of this class. In more complex scenarios, the iterator object may be defined as a separate class and object with its own state information to support multiple active iterations over the same data (we’ll see an example of this in a moment). The end of the iteration is signaled with a Python raise statement (more on raising exceptions in the next part of this book). Manual iteration works here just as it does for built-in types:

>>> X = Squares(1, 5)                   # Iterate manually: what loops do
>>> I = iter(X)                         # iter calls __iter__
>>> next(I)                             # next calls __next__
1
>>> next(I)
4
...more omitted...
>>> next(I)
25
>>> next(I)                             # Can catch this in try statement
StopIteration
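
The steps a for loop takes can be sketched by hand with a try statement that catches StopIteration. This is only a rough sketch of the protocol, not the interpreter's actual implementation; the Squares class is repeated from iters.py so the block runs on its own, and manual_loop is a hypothetical helper name used just for illustration.

```python
class Squares:                          # Same class as in iters.py
    def __init__(self, start, stop):
        self.value = start - 1
        self.stop = stop
    def __iter__(self):
        return self
    def __next__(self):
        if self.value == self.stop:
            raise StopIteration
        self.value += 1
        return self.value ** 2

def manual_loop(iterable):              # Hypothetical helper: roughly what for does
    results = []
    I = iter(iterable)                  # iter calls __iter__
    while True:
        try:
            item = next(I)              # next calls __next__
        except StopIteration:           # Raised at the end of the iteration
            break
        results.append(item)
    return results

print(manual_loop(Squares(1, 5)))       # Prints [1, 4, 9, 16, 25]
```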

An equivalent coding of this iterator with __getitem__ might be less natural, because the for would then iterate through all offsets zero and higher; the offsets passed in would be only indirectly related to the range of values produced (0..N would need to map to start..stop). Because __iter__ objects retain explicitly managed state between next calls, they can be more general than __getitem__.
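
To make the offset mapping concrete, here is one way the same squares might be coded with __getitem__. It is a sketch under stated assumptions: the class name SquaresIndexed is hypothetical, and the loop relies on the fallback indexing protocol, in which __getitem__ is called with offsets 0, 1, 2, and so on until it raises IndexError.

```python
class SquaresIndexed:                   # Hypothetical __getitem__ variant
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
    def __getitem__(self, offset):      # Called with offsets 0, 1, 2, ...
        value = self.start + offset     # Map offsets 0..N to start..stop
        if offset < 0 or value > self.stop:
            raise IndexError(offset)    # Ends the for loop
        return value ** 2

X = SquaresIndexed(1, 5)
print([n for n in X])                   # Prints [1, 4, 9, 16, 25]
print(X[1])                             # Prints 4: indexing works too
```

Note that each result must be recomputed from the offset passed in, because this version keeps no position of its own between calls.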

On the other hand, using iterators based on __iter__ can sometimes be more complex and less convenient than using __getitem__. They are really designed for iteration, not random indexing—in fact, they don’t overload the indexing expression at all:

>>> X = Squares(1, 5)
>>> X[1]
TypeError: 'Squares' object is not subscriptable

The __iter__ scheme is also the implementation for all the other iteration contexts we saw in action for __getitem__ (membership tests, type constructors, sequence assignment, and so on). However, unlike our prior __getitem__ example, we also need to be aware that a class’s __iter__ may be designed for a single traversal, not many. For example, the Squares class is a one-shot iteration; once you’ve iterated over an instance of that class, it’s empty. You need to make a new iterator object for each new iteration:

>>> X = Squares(1, 5)
>>> [n for n in X]                      # Exhausts items
[1, 4, 9, 16, 25]
>>> [n for n in X]                      # Now it's empty
[]
>>> [n for n in Squares(1, 5)]          # Make a new iterator object
[1, 4, 9, 16, 25]
>>> list(Squares(1, 3))
[1, 4, 9]
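
The other iteration contexts mentioned above can be sketched the same way. Each of the first few lines below makes a fresh Squares object, because a used one has nothing left to produce; the class is repeated from iters.py only so that the block is self-contained.

```python
class Squares:                          # Same class as in iters.py
    def __init__(self, start, stop):
        self.value = start - 1
        self.stop = stop
    def __iter__(self):
        return self
    def __next__(self):
        if self.value == self.stop:
            raise StopIteration
        self.value += 1
        return self.value ** 2

print(36 in Squares(1, 10))             # Membership test: True
print(tuple(Squares(1, 3)))             # Type constructor: (1, 4, 9)
a, b, c = Squares(1, 3)                 # Sequence assignment
print(a, b, c)                          # 1 4 9
print(sum(Squares(1, 5)))               # Built-in that iterates: 55

X = Squares(1, 5)
print(iter(X) is X)                     # True: the instance is its own iterator
print(list(X), list(X))                 # [1, 4, 9, 16, 25] []
```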

Notice that this example would probably be simpler if it were coded with generator functions or expressions (topics introduced in Chapter 20 and related to iterators):

>>> def gsquares(start, stop):
...     for i in range(start, stop+1):
...         yield i ** 2
...
>>> for i in gsquares(1, 5):                       # or: (x ** 2 for x in range(1, 6))
...     print(i, end=' ')
...
1 4 9 16 25

Unlike the class, the function automatically saves its state between iterations. Of course, for this artificial example, you could in fact skip both techniques and simply use a for loop, map, or a list comprehension to build the list all at once. The best and fastest way to accomplish a task in Python is often also the simplest:

>>> [x ** 2 for x in range(1, 6)]
[1, 4, 9, 16, 25]
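
For comparison, the for loop and map alternatives mentioned above might be coded as follows. This is a simple sketch; the variable names looped and mapped are chosen here for illustration only.

```python
looped = []
for x in range(1, 6):                   # Explicit for loop
    looped.append(x ** 2)
print(looped)                           # Prints [1, 4, 9, 16, 25]

mapped = list(map(lambda x: x ** 2, range(1, 6)))   # map with a lambda
print(mapped)                           # Prints [1, 4, 9, 16, 25]
```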

However, classes may be better at modeling more complex iterations, especially when they can benefit from state information and inheritance hierarchies. The next section explores one such use case.