Text and Binary Modes in 3.0

In Python 2.6, there is no major distinction between text and binary files: both accept and return content as str strings. The only significant difference is that text files automatically map \n end-of-line characters to and from \r\n on Windows, while binary files do not (I’m stringing operations together into one-liners here just for brevity):

C:\misc> c:\python26\python
>>> open('temp', 'w').write('abd\n')         # Write in text mode: adds \r
>>> open('temp', 'r').read()                 # Read in text mode: drops \r
'abd\n'
>>> open('temp', 'rb').read()                # Read in binary mode: verbatim
'abd\r\n'

>>> open('temp', 'wb').write('abc\n')        # Write in binary mode
>>> open('temp', 'r').read()                 # \n not expanded to \r\n
'abc\n'
>>> open('temp', 'rb').read()
'abc\n'

In Python 3.0, things are a bit more complex because of the distinction between str for text data and bytes for binary data. To demonstrate, let’s write a text file and read it back in both modes in 3.0. Notice that we are required to provide a str for writing, but reading gives us a str or a bytes, depending on the open mode:

C:\misc> c:\python30\python

# Write and read a text file

>>> open('temp', 'w').write('abc\n')         # Text mode output, provide a str
4

>>> open('temp', 'r').read()                 # Text mode input, returns a str
'abc\n'

>>> open('temp', 'rb').read()                # Binary mode input, returns a bytes
b'abc\r\n'

Notice how on Windows text-mode files translate the \n end-of-line character to \r\n on output; on input, text mode translates the \r\n back to \n, but binary mode does not. This is the same in 2.6, and it’s what we want for binary data (no translations should occur), although you can control this behavior with extra open arguments in 3.0 if desired.
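As a rough sketch of the "extra open arguments" just mentioned, the following uses the newline argument of open to control end-of-line translation explicitly. The scratch path built with tempfile is an assumption made for illustration and is not part of the session above; because the translation is requested explicitly, the results should not depend on the platform:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration
path = os.path.join(tempfile.mkdtemp(), 'temp')

# newline='\r\n' maps each \n to \r\n on output, on any platform
with open(path, 'w', newline='\r\n') as f:
    f.write('abc\n')

# Binary mode shows the bytes exactly as stored in the file
with open(path, 'rb') as f:
    raw = f.read()

# newline='' turns off translation on input, so the \r is retained
with open(path, 'r', newline='') as f:
    untranslated = f.read()

# Default text mode maps \r\n back to \n on input
with open(path, 'r') as f:
    translated = f.read()

print(raw, repr(untranslated), repr(translated))
```

In other words, the translation belongs to text mode rather than to the platform, and it can be switched off or forced per open call when needed.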

Now let’s repeat the write-and-read test, but with a binary file. We provide a bytes to write in this case, and we still get back a str or a bytes, depending on the input mode:

# Write and read a binary file

>>> open('temp', 'wb').write(b'abc\n')       # Binary mode output, provide a bytes
4

>>> open('temp', 'r').read()                 # Text mode input, returns a str
'abc\n'

>>> open('temp', 'rb').read()                # Binary mode input, returns a bytes
b'abc\n'

Note that the \n end-of-line character is not expanded to \r\n in binary-mode output—again, a desired result for binary data. Type requirements and file behavior are the same even if the data we’re writing to the binary file is truly binary in nature. In the following, for example, the "\x00" is a binary zero byte and not a printable character:

# Write and read truly binary data

>>> open('temp', 'wb').write(b'a\x00c')      # Provide a bytes
3

>>> open('temp', 'r').read()                 # Receive a str
'a\x00c'

>>> open('temp', 'rb').read()                # Receive a bytes
b'a\x00c'
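The type requirements in these sessions are enforced when write is called. The sketch below, which again assumes a throwaway file created with tempfile, shows what happens when the wrong string type is passed for the open mode; the errors list is just a hypothetical helper for collecting the results:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration
path = os.path.join(tempfile.mkdtemp(), 'temp')

errors = []

# A binary-mode file rejects a str: it expects bytes (or a bytearray)
with open(path, 'wb') as f:
    try:
        f.write('abc\n')
    except TypeError as exc:
        errors.append(('wb', type(exc).__name__))

# A text-mode file rejects a bytes: it expects a str
with open(path, 'w') as f:
    try:
        f.write(b'abc\n')
    except TypeError as exc:
        errors.append(('w', type(exc).__name__))

print(errors)
```

Both writes fail with a TypeError, so a mismatch between open mode and string type shows up immediately instead of silently storing the wrong data.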

Binary-mode files always return contents as a bytes object, but accept either a bytes or bytearray object for writing; this naturally follows, given that bytearray is basically just a mutable variant of bytes. In fact, most APIs in Python 3.0 that accept a bytes also allow a bytearray:

# bytearrays work too

>>> BA = bytearray(b'\x01\x02\x03')

>>> open('temp', 'wb').write(BA)
3

>>> open('temp', 'r').read()
'\x01\x02\x03'

>>> open('temp', 'rb').read()
b'\x01\x02\x03'
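To make the bytes and bytearray relationship a bit more concrete, here is a minimal sketch in script form. It assumes a scratch file created with tempfile, changes the bytearray in place before writing it, and then checks what a binary-mode read hands back:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration
path = os.path.join(tempfile.mkdtemp(), 'temp')

ba = bytearray(b'\x01\x02\x03')
ba.append(4)                       # In-place change: bytearray is mutable

with open(path, 'wb') as f:
    count = f.write(ba)            # Binary-mode output accepts a bytearray

with open(path, 'rb') as f:
    data = f.read()                # Binary-mode input returns a bytes

print(count, type(data).__name__, data == ba)
```

The read yields an immutable bytes even though a bytearray was written, and the two compare equal because they hold the same byte values.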