Membership: __contains__, __iter__, and __getitem__

The iteration story is even richer than we’ve seen thus far. Operator overloading is often layered: classes may provide specific methods, or more general alternatives used as fallback options. For example:

 

  • Comparisons in Python 2.6 use specific methods such as __lt__ for less than if present, or else the general __cmp__. Python 3.0 uses only specific methods, not __cmp__, as discussed later in this chapter.
  • Boolean tests similarly try a specific __bool__ first (to give an explicit True/False result), and if it’s absent fall back on the more general __len__ (a nonzero length means True). As we’ll also see later in this chapter, Python 2.6 works the same but uses the name __nonzero__ instead of __bool__.
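As a quick illustration of the Boolean layering just described, here is a minimal sketch for Python 3.X. The class names Truth, Sized, and Plain are made up for this example; they are not part of any library:

```python
class Truth:
    def __bool__(self):          # Specific method: tried first
        return False
    def __len__(self):           # General method: ignored when __bool__ is present
        return 5

class Sized:
    def __len__(self):           # Only the fallback: nonzero length means True
        return 0

class Plain:                     # Neither method: instances are simply true
    pass

results = [bool(Truth()), bool(Sized()), bool(Plain())]
print(results)                   # [False, False, True]
```

The first result shows that __bool__ wins even though __len__ returns a nonzero length; the second shows the __len__ fallback at work.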

In the iterations domain, classes normally implement the in membership operator as an iteration, using either the __iter__ method or the __getitem__ method. To support more specific membership, though, classes may code a __contains__ method—when present, this method is preferred over __iter__, which is preferred over __getitem__. The __contains__ method should define membership as applying to keys for a mapping (and can use quick lookups), and as a search for sequences.

Consider the following class, which codes all three methods and tests membership and various iteration contexts applied to an instance. Its methods print trace messages when called:

class Iters:
    def __init__(self, value):
        self.data = value
    def __getitem__(self, i):                 # Fallback for iteration
        print('get[%s]:' % i, end='')         # Also for index, slice
        return self.data[i]
    def __iter__(self):                       # Preferred for iteration
        print('iter=> ', end='')              # Allows only 1 active iterator
        self.ix = 0
        return self
    def __next__(self):
        print('next:', end='')
        if self.ix == len(self.data): raise StopIteration
        item = self.data[self.ix]
        self.ix += 1
        return item
    def __contains__(self, x):                # Preferred for 'in'
        print('contains: ', end='')
        return x in self.data


X = Iters([1, 2, 3, 4, 5])          # Make instance
print(3 in X)                       # Membership
for i in X:                         # For loops
    print(i, end=' | ')

print()
print([i ** 2 for i in X])          # Other iteration contexts
print( list(map(bin, X)) )

I = iter(X)                         # Manual iteration (what other contexts do)
while True:
    try:
        print(next(I), end=' @ ')
    except StopIteration:
        break

When run as it is, this script’s output is as follows—the specific __contains__ intercepts membership, the general __iter__ catches other iteration contexts such that __next__ is called repeatedly, and __getitem__ is never called:

contains: True
iter=> next:1 | next:2 | next:3 | next:4 | next:5 | next:
iter=> next:next:next:next:next:next:[1, 4, 9, 16, 25]
iter=> next:next:next:next:next:next:['0b1', '0b10', '0b11', '0b100', '0b101']
iter=> next:1 @ next:2 @ next:3 @ next:4 @ next:5 @ next:

Watch what happens to this code’s output if we comment out its __contains__ method, though—membership is now routed to the general __iter__ instead:

iter=> next:next:next:True
iter=> next:1 | next:2 | next:3 | next:4 | next:5 | next:
iter=> next:next:next:next:next:next:[1, 4, 9, 16, 25]
iter=> next:next:next:next:next:next:['0b1', '0b10', '0b11', '0b100', '0b101']
iter=> next:1 @ next:2 @ next:3 @ next:4 @ next:5 @ next:

And finally, here is the output if both __contains__ and __iter__ are commented out—the indexing __getitem__ fallback is called with successively higher indexes for membership and other iteration contexts:

get[0]:get[1]:get[2]:True
get[0]:1 | get[1]:2 | get[2]:3 | get[3]:4 | get[4]:5 | get[5]:
get[0]:get[1]:get[2]:get[3]:get[4]:get[5]:[1, 4, 9, 16, 25]
get[0]:get[1]:get[2]:get[3]:get[4]:get[5]:['0b1', '0b10', '0b11', '0b100', '0b101']
get[0]:1 @ get[1]:2 @ get[2]:3 @ get[3]:4 @ get[4]:5 @ get[5]:
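The trailing get[5]: in each line above reflects how the fallback ends: Python keeps indexing until __getitem__ raises IndexError, which our class inherits from the list it wraps. The following sketch makes that explicit with a hypothetical Squares class that has no underlying sequence and records each index it is asked for:

```python
class Squares:
    def __init__(self, n):
        self.n = n
        self.calls = []                   # Record every index requested
    def __getitem__(self, i):
        self.calls.append(i)
        if i >= self.n:
            raise IndexError(i)           # Signals the end of the iteration
        return i ** 2

S = Squares(3)
items = [x for x in S]                    # Indexes 0..3; the last one fails
print(items, S.calls)                     # [0, 1, 4] [0, 1, 2, 3]

S.calls = []
found = 4 in S                            # Membership stops at the first match
print(found, S.calls)                     # True [0, 1, 2]

S.calls = []
missing = 5 in S                          # No match: runs until IndexError
print(missing, S.calls)                   # False [0, 1, 2, 3]
```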

As we’ve seen, the __getitem__ method is even more general: besides iterations, it also intercepts explicit indexing as well as slicing. Slice expressions trigger __getitem__ with a slice object containing bounds, both for built-in types and user-defined classes, so slicing is automatic in our class:

>>> X = Iters('spam')               # Indexing
>>> X[0]                            # __getitem__(0)
get[0]:'s'

>>> 'spam'[1:]                      # Slice syntax
'pam'
>>> 'spam'[slice(1, None)]          # Slice object
'pam'

>>> X[1:]                           # __getitem__(slice(..))
get[slice(1, None, None)]:'pam'
>>> X[:-1]
get[slice(None, -1, None)]:'spa'
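Our class simply passes whatever it receives along to the embedded object, but a __getitem__ can also tell the two cases apart. The sketch below uses a hypothetical Probe class to show one way to do so, testing for the built-in slice type and using its indices method to normalize the bounds for a given length:

```python
class Probe:
    def __init__(self, data):
        self.data = data
    def __getitem__(self, index):
        if isinstance(index, slice):                  # Slice expression
            kind = 'slice'
            bounds = index.indices(len(self.data))    # (start, stop, step)
        else:                                         # Plain integer index
            kind = 'index'
            bounds = index
        return kind, bounds, self.data[index]

P = Probe('spam')
print(P[0])                      # ('index', 0, 's')
print(P[1:])                     # ('slice', (1, 4, 1), 'pam')
print(P[:-1])                    # ('slice', (0, 3, 1), 'spa')
```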

In more realistic iteration use cases that are not sequence-oriented, though, the __iter__ method may be easier to write since it need not manage an integer index, and __contains__ allows for membership optimization as a special case.
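To make this concrete, here is a minimal sketch under assumptions of our own: a hypothetical Tags class whose __iter__ is written as a generator function, so there is no index to manage, and whose __contains__ consults a set instead of scanning. Because each call to a generator-based __iter__ returns a new generator, this version also supports multiple active iterations, unlike the Iters class above:

```python
class Tags:
    def __init__(self, names):
        self.names = list(names)          # Keeps the original order
        self.lookup = set(self.names)     # Extra state for quick membership
    def __iter__(self):                   # Generator: no index, no __next__ to code
        for name in self.names:
            yield name
    def __contains__(self, name):         # Preferred for 'in': a hashed lookup
        return name in self.lookup

T = Tags(['spam', 'eggs', 'ham'])
print('eggs' in T, 'toast' in T)          # True False
print(list(T))                            # ['spam', 'eggs', 'ham']

pairs = [(a, b) for a in T for b in T]    # Nested loops: two active iterators
print(len(pairs))                         # 9
```

The set costs extra memory and must be kept in sync if the names ever change, so this is a trade-off rather than a universal improvement.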