Other Built-in Type Iterators

Besides files and physical sequences like lists, other types have useful iterators as well. The classic way to step through the keys of a dictionary, for example, is to request its keys list explicitly:

>>> D = {'a':1, 'b':2, 'c':3}
>>> for key in D.keys():
...     print(key, D[key])
...
a 1
c 3
b 2


In recent versions of Python, though, dictionaries have an iterator that automatically returns one key at a time in an iteration context:

>>> I = iter(D)
>>> next(I)
'a'
>>> next(I)
'c'
>>> next(I)
'b'
>>> next(I)
Traceback (most recent call last):
...more omitted...
StopIteration

The net effect is that we no longer need to call the keys method to step through dictionary keys—the for loop will use the iteration protocol to grab one key each time through:

>>> for key in D:
...     print(key, D[key])
...
a 1
c 3
b 2
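
As a minimal sketch of the same idea, the following script iterates a dictionary directly and also through its values and items methods. The variable names are illustrative only. Key order varies between Python versions, so the keys are sorted here to keep the result predictable:

```python
D = {'a': 1, 'b': 2, 'c': 3}

# Iterating a dictionary directly yields its keys, just like D.keys()
keys_direct = sorted(key for key in D)
keys_method = sorted(D.keys())

# values() and items() are iterable too; items() yields (key, value) pairs
total = sum(D.values())
pairs = sorted(D.items())

print(keys_direct)
print(total, pairs)
```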

We can’t delve into their details here, but other Python object types also support the iterator protocol and thus may be used in for loops too. For instance, shelves (an access-by-key filesystem for Python objects) and the results from os.popen (a tool for reading the output of shell commands) are iterable as well:

>>> import os
>>> P = os.popen('dir')
>>> P.__next__()
' Volume in drive C is SQ004828V03\n'
>>> P.__next__()
' Volume Serial Number is 08BE-3CD4\n'
>>> next(P)
TypeError: _wrap_close object is not an iterator

Notice that popen objects support a P.next() method in Python 2.6. In 3.0, they support the P.__next__() method, but not the next(P) built-in; since the latter is defined to call the former, it’s not clear if this behavior will endure in future releases (as described in an earlier footnote, this appears to be an implementation issue). This is only an issue for manual iteration, though; if you iterate over these objects automatically with for loops and other iteration contexts (described in the next sections), they return successive lines in either Python version.
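
The difference between an explicit P.__next__() call and the next(P) built-in can be reproduced without running a shell command. The sketch below uses a hypothetical Wrapper class (not the real os.popen implementation) that delegates attribute fetches to an in-memory text stream through __getattr__. It assumes only that the wrapper defines __iter__ but no __next__ of its own:

```python
import io

class Wrapper:
    """Hypothetical delegating wrapper, for illustration only."""
    def __init__(self, stream):
        self._stream = stream

    def __getattr__(self, name):
        # Runs only for explicit, by-name attribute fetches that fail normally
        return getattr(self._stream, name)

    def __iter__(self):
        # for loops and other iteration contexts start here
        return iter(self._stream)

W = Wrapper(io.StringIO('line1\nline2\nline3\n'))

first = W.__next__()            # explicit fetch is routed through __getattr__

try:
    next(W)                     # the built-in looks for __next__ on the class
    builtin_works = True
except TypeError:
    builtin_works = False

rest = [line for line in W]     # automatic iteration still works

manual = Wrapper(io.StringIO('x\ny\n'))
I = iter(manual)                # calling iter first makes next(I) work
second_stream_first = next(I)
```

In this sketch, calling iter on the wrapper before stepping manually sidesteps the problem, because iter returns the underlying stream, which does support next.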

The iteration protocol is also the reason that we’ve had to wrap some results in a list call to see their values all at once. Objects that are iterable return results one at a time, not in a physical list:

>>> R = range(5)
>>> R                            # Ranges are iterables in 3.0
range(0, 5)
>>> I = iter(R)                  # Use iteration protocol to produce results
>>> next(I)
0
>>> next(I)
1
>>> list(range(5))               # Or use list to collect all results at once
[0, 1, 2, 3, 4]
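
The session above calls iter(R) before next for a reason: a range supports iteration, but it is not its own iterator. The following sketch, with illustrative variable names, shows what happens when that step is skipped and how list collects whatever an iterator has left:

```python
R = range(5)

# A range is iterable, but next() cannot be applied to it directly
try:
    next(R)
    direct_next_works = True
except TypeError:
    direct_next_works = False

I = iter(R)                     # ask the range for an iterator first
first_two = [next(I), next(I)]
remaining = list(I)             # list() drains what the iterator has left
everything = list(R)            # a fresh pass over R starts again from 0

print(direct_next_works, first_two, remaining, everything)
```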

Now that you have a better understanding of this protocol, you should be able to see how it explains why the enumerate tool introduced in the prior chapter works the way it does:

>>> E = enumerate('spam')        # enumerate is an iterable too
>>> E
<enumerate object at 0x0253F508>
>>> I = iter(E)
>>> next(I)                      # Generate results with iteration protocol
(0, 's')
>>> next(I)
(1, 'p')
>>> list(enumerate('spam'))      # Or use list to force generation to run
[(0, 's'), (1, 'p'), (2, 'a'), (3, 'm')]
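
Unlike a range, an enumerate object is its own iterator, so it supports only a single pass. The short sketch below, using illustrative names, shows that iter returns the same object and that a second list call finds nothing left:

```python
E = enumerate('spam')

same_object = iter(E) is E      # enumerate objects return themselves from iter
first = next(E)                 # so next() can be applied to E directly
rest = list(E)                  # list() collects the results not yet produced
leftover = list(E)              # E is now exhausted

print(same_object, first, rest, leftover)
```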

We don’t normally see this machinery because for loops run it for us automatically to step through results. In fact, everything that scans left-to-right in Python employs the iteration protocol in the same way—including the topic of the next section.
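
To make that machinery visible, here is a rough sketch of what a for loop does on our behalf. The function name manual_for is hypothetical, and the real loop runs an internal equivalent of these calls rather than this exact code:

```python
def manual_for(iterable):
    """Collect the items of any iterable by running the protocol by hand."""
    results = []
    I = iter(iterable)              # step 1: obtain an iterator
    while True:
        try:
            item = next(I)          # step 2: fetch the next result
        except StopIteration:       # step 3: stop when results run out
            break
        results.append(item)        # the loop body would run here
    return results

print(manual_for(range(3)))
print(manual_for(enumerate('hi')))
```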

 


[33] Terminology in this topic tends to be a bit loose. This text uses the terms “iterable” and “iterator” interchangeably to refer to an object that supports iteration in general. Sometimes the term “iterable” refers to an object that supports iter and “iterator” refers to an object returned by iter that supports next(I), but that convention is not universal in either the Python world or this book.

[34] Technically speaking, the for loop calls the internal equivalent of I.__next__, instead of the next(I) used here. There is rarely any difference between the two, but as we’ll see in the next section, there are some built-in objects in 3.0 (such as os.popen results) that support the former and not the latter, but may still be iterated across in for loops. Your manual iterations can generally use either call scheme. If you care for the full story, in 3.0 os.popen results have been reimplemented with the subprocess module and a wrapper class, whose __getattr__ method is no longer called in 3.0 for implicit __next__ fetches made by the next built-in, but is called for explicit fetches by name. This is a 3.0 change issue we’ll confront in Chapters 37 and 38, which apparently burns some standard library code too! Also in 3.0, the related 2.6 calls os.popen2/3/4 are no longer available; use subprocess.Popen with appropriate arguments instead (see the Python 3.0 library manual for the new required code).