Parallel Traversals: zip and map

As we’ve seen, the range built-in allows us to traverse sequences with for in a nonexhaustive fashion. In the same spirit, the built-in zip function allows us to use for loops to visit multiple sequences in parallel. In basic operation, zip takes one or more sequences as arguments and returns a series of tuples that pair up parallel items taken from those sequences. For example, suppose we’re working with two lists:

>>> L1 = [1,2,3,4]
>>> L2 = [5,6,7,8]

广告:个人专属 VPN,独立 IP,无限流量,多机房切换,还可以屏蔽广告和恶意软件,每月最低仅 5 美元

To combine the items in these lists, we can use zip to create a list of tuple pairs (like range, zip is an iterable object in 3.0, so we must wrap it in a list call to display all its results at once—more on iterators in the next chapter):

>>> zip(L1, L2)
<zip object at 0x026523C8>
>>> list(zip(L1, L2))                       # list() required in 3.0, not 2.6
[(1, 5), (2, 6), (3, 7), (4, 8)]

Such a result may be useful in other contexts as well, but when wedded with the for loop, it supports parallel iterations:

>>> for (x, y) in zip(L1, L2):
...     print(x, y, '--', x+y)
...
1 5 -- 6
2 6 -- 8
3 7 -- 10
4 8 -- 12

Here, we step over the result of the zip call—that is, the pairs of items pulled from the two lists. Notice that this for loop again uses the tuple assignment form we met earlier to unpack each tuple in the zip result. The first time through, it’s as though we ran the assignment statement (x, y) = (1, 5).

The net effect is that we scan both L1 and L2 in our loop. We could achieve a similar effect with a while loop that handles indexing manually, but it would require more typing and would likely run more slowly than the for/zip approach.

Strictly speaking, the zip function is more general than this example suggests. For instance, it accepts any type of sequence (really, any iterable object, including files), and it accepts more than two arguments. With three arguments, as in the following example, it builds a list of three-item tuples with items from each sequence, essentially projecting by columns (technically, we get an N-ary tuple for N arguments):

>>> T1, T2, T3 = (1,2,3), (4,5,6), (7,8,9)
>>> T3
(7, 8, 9)
>>> list(zip(T1, T2, T3))
[(1, 4, 7), (2, 5, 8), (3, 6, 9)]

Moreover, zip truncates result tuples at the length of the shortest sequence when the argument lengths differ. In the following, we zip together two strings to pick out characters in parallel, but the result has only as many tuples as the length of the shortest sequence:

>>> S1 = 'abc'
>>> S2 = 'xyz123'
>>>
>>> list(zip(S1, S2))
[('a', 'x'), ('b', 'y'), ('c', 'z')]

map equivalence in Python 2.6

In Python 2.X, the related built-in map function pairs items from sequences in a similar fashion, but it pads shorter sequences with None if the argument lengths differ instead of truncating to the shortest length:

>>> S1 = 'abc'
>>> S2 = 'xyz123'

>>> map(None, S1, S2)                        # 2.X only
[('a', 'x'), ('b', 'y'), ('c', 'z'), (None, '1'), (None, '2'), (None,'3')]

This example is using a degenerate form of the map built-in, which is no longer supported in 3.0. Normally, map takes a function and one or more sequence arguments and collects the results of calling the function with parallel items taken from the sequence(s). We’ll study map in detail in Chapters 19 and 20, but as a brief example, the following maps the built-in ord function across each item in a string and collects the results (like zip, map is a value generator in 3.0 and so must be passed to list to collect all its results at once):

>>> list(map(ord, 'spam'))
[115, 112, 97, 109]

This works the same as the following loop statement, but is often quicker:

>>> res = []
>>> for c in 'spam': res.append(ord(c))
>>> res
[115, 112, 97, 109]


Note

Version skew note: The degenerate form of map using a function argument of None is no longer supported in Python 3.0, because it largely overlaps with zip (and was, frankly, a bit at odds with map’s function-application purpose). In 3.0, either use zip or write loop code to pad results yourself. We’ll see how to do this in Chapter 20, after we’ve had a chance to study some additional iteration concepts.


Dictionary construction with zip

In Chapter 8, I suggested that the zip call used here can also be handy for generating dictionaries when the sets of keys and values must be computed at runtime. Now that we’re becoming proficient with zip, I’ll explain how it relates to dictionary construction. As you’ve learned, you can always create a dictionary by coding a dictionary literal, or by assigning to keys over time:

>>> D1 = {'spam':1, 'eggs':3, 'toast':5}
>>> D1
{'toast': 5, 'eggs': 3, 'spam': 1}

>>> D1 = {}
>>> D1['spam']  = 1
>>> D1['eggs']  = 3
>>> D1['toast'] = 5

What to do, though, if your program obtains dictionary keys and values in lists at runtime, after you’ve coded your script? For example, say you had the following keys and values lists:

>>> keys = ['spam', 'eggs', 'toast']
>>> vals = [1, 3, 5]

One solution for turning those lists into a dictionary would be to zip the lists and step through them in parallel with a for loop:

>>> list(zip(keys, vals))
[('spam', 1), ('eggs', 3), ('toast', 5)]

>>> D2 = {}
>>> for (k, v) in zip(keys, vals): D2[k] = v
...
>>> D2
{'toast': 5, 'eggs': 3, 'spam': 1}

It turns out, though, that in Python 2.2 and later you can skip the for loop altogether and simply pass the zipped keys/values lists to the built-in dict constructor call:

>>> keys = ['spam', 'eggs', 'toast']
>>> vals = [1, 3, 5]

>>> D3 = dict(zip(keys, vals))
>>> D3
{'toast': 5, 'eggs': 3, 'spam': 1}

The built-in name dict is really a type name in Python (you’ll learn more about type names, and subclassing them, in Chapter 31). Calling it achieves something like a list-to-dictionary conversion, but it’s really an object construction request. In the next chapter we’ll explore a related but richer concept, the list comprehension, which builds lists in a single expression; we’ll also revisit 3.0 dictionary comprehensions an alternative to the dict cal for zipped key/value pairs.