The struct Binary Data Module
The Python struct module, used to create and extract packed binary data from strings, also works the same in 3.0 as it does in 2.X, except that packed data is represented only as bytes and bytearray objects, never as str objects (which makes sense, given that the module is intended for processing binary data, not arbitrarily encoded text).
Here are both Pythons in action, packing three objects into a string according to a binary type specification (they create a four-byte integer, a four-byte string, and a two-byte integer):
C:\misc> c:\python30\python
>>> from struct import pack
>>> pack('>i4sh', 7, 'spam', 8) # bytes in 3.0 (8-bit string)
b'\x00\x00\x00\x07spam\x00\x08'
C:\misc> c:\python26\python
>>> from struct import pack
>>> pack('>i4sh', 7, 'spam', 8) # str in 2.6 (8-bit string)
'\x00\x00\x00\x07spam\x00\x08'
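The sessions above pass a str for the 4s field, which 3.0 accepts. As far as I know, later 3.X releases (3.2 and up) require a bytes object there instead, so the following is a minimal sketch of the same call written for those versions. The names fmt and packed are illustrative only, and struct.calcsize is used simply to confirm the record size implied by the format:

```python
import struct

# Big-endian layout: 4-byte int, 4-byte string, 2-byte int
fmt = '>i4sh'

# Assumption: a later 3.X release, where the 's' code expects bytes, not str
packed = struct.pack(fmt, 7, b'spam', 8)

print(packed)                  # the packed record, as a bytes object
print(struct.calcsize(fmt))    # total size in bytes implied by the format
```

Because the format starts with >, no alignment padding is added, and the size is just the sum of the field widths (4 + 4 + 2).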
Since bytes has an almost identical interface to that of str in 3.0 and 2.6, though, most programmers probably won’t need to care—the change is irrelevant to most existing code, especially since reading from a binary file creates a bytes automatically. Although the last test in the following example fails on a type mismatch, most scripts will read binary data from a file, not create it as a string:
C:\misc> c:\python30\python
>>> import struct
>>> B = struct.pack('>i4sh', 7, 'spam', 8)
>>> B
b'\x00\x00\x00\x07spam\x00\x08'
>>> vals = struct.unpack('>i4sh', B)
>>> vals
(7, b'spam', 8)
>>> vals = struct.unpack('>i4sh', B.decode())
TypeError: 'str' does not have the buffer interface
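If packed data does somehow arrive as a str, one way around the type mismatch is to encode it back to bytes before unpacking. The sketch below assumes a later 3.X release (hence b'spam' in the pack call) and assumes Latin-1 as the encoding, chosen here only because it maps each byte value 0..255 to one character and back. The exact wording of the error message varies by Python version, so it is printed rather than quoted:

```python
import struct

fmt = '>i4sh'
B = struct.pack(fmt, 7, b'spam', 8)

# Simulate packed data that was mistakenly decoded to text
text = B.decode('latin-1')

try:
    struct.unpack(fmt, text)              # str is rejected
except TypeError as exc:
    print('TypeError:', exc)

# Encode back to bytes with the same encoding, then unpack
vals = struct.unpack(fmt, text.encode('latin-1'))
print(vals)
```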
Apart from the new syntax for bytes, creating and reading binary files works almost the same in 3.0 as it does in 2.X. Code like this is one of the main places where programmers will notice the bytes object type:
C:\misc> c:\python30\python
# Write values to a packed binary file
>>> F = open('data.bin', 'wb') # Open binary output file
>>> import struct
>>> data = struct.pack('>i4sh', 7, 'spam', 8) # Create packed binary data
>>> data # bytes in 3.0, not str
b'\x00\x00\x00\x07spam\x00\x08'
>>> F.write(data) # Write to the file
10
>>> F.close()
# Read values from a packed binary file
>>> F = open('data.bin', 'rb') # Open binary input file
>>> data = F.read() # Read bytes
>>> data
b'\x00\x00\x00\x07spam\x00\x08'
>>> values = struct.unpack('>i4sh', data) # Extract packed binary data
>>> values # Back to Python objects
(7, b'spam', 8)
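The same write-then-read round trip can be sketched as a script. This version is an illustration under a few assumptions: it targets a later 3.X release (so the string field is passed as bytes), it uses with statements to close the files automatically, and it writes to a temporary directory instead of the current one. The names fmt, path, written, and values are made up for the example:

```python
import os
import struct
import tempfile

fmt = '>i4sh'                                   # same layout as above
path = os.path.join(tempfile.mkdtemp(), 'data.bin')

# Write one packed record to a binary file
with open(path, 'wb') as f:
    written = f.write(struct.pack(fmt, 7, b'spam', 8))

# Read back exactly one record's worth of bytes and unpack it
with open(path, 'rb') as f:
    values = struct.unpack(fmt, f.read(struct.calcsize(fmt)))

print(written, values)
os.remove(path)                                 # clean up the demo file
```

Reading struct.calcsize(fmt) bytes instead of the whole file is a design choice that also works when a file holds several records back to back.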
Once you’ve extracted packed binary data into Python objects like this, you can dig even further into the binary world if you have to—strings can be indexed and sliced to get individual bytes’ values, individual bits can be extracted from integers with bitwise operators, and so on (see earlier in this book for more on the operations applied here):
>>> values # Result of struct.unpack
(7, b'spam', 8)
# Accessing bits of parsed integers
>>> bin(values[0]) # Can get to bits in ints
'0b111'
>>> values[0] & 0x01 # Test first (lowest) bit in int
1
>>> values[0] | 0b1010 # Bitwise or: turn bits on
15
>>> bin(values[0] | 0b1010) # 15 decimal is 1111 binary
'0b1111'
>>> bin(values[0] ^ 0b1010) # Bitwise xor: off if both true
'0b1101'
>>> bool(values[0] & 0b100) # Test if bit 3 is on
True
>>> bool(values[0] & 0b1000) # Test if bit 4 is set
False
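If you find yourself repeating bit tests like these, they are easy to wrap in small functions. The helpers below are hypothetical names invented for this sketch, not part of any library; they count bit positions from 1 at the low end, to match the "bit 3" and "bit 4" comments in the session above:

```python
def bit_is_set(value, position):
    """Return True if the bit at a 1-based position (1 = lowest) is on."""
    return bool(value & (1 << (position - 1)))

def set_bits(value, mask):
    """Turn on every bit that is on in mask (bitwise or)."""
    return value | mask

def clear_bits(value, mask):
    """Turn off every bit that is on in mask (and with inverted mask)."""
    return value & ~mask

number = 7                                  # 0b111, as in values[0]
print(bit_is_set(number, 3))                # bit 3 of 0b111
print(bit_is_set(number, 4))                # bit 4 of 0b111
print(bin(set_bits(number, 0b1010)))
print(bin(clear_bits(number, 0b0010)))
```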
Since parsed bytes strings are sequences of small integers, we can do similar processing with their individual bytes:
# Accessing bytes of parsed strings and bits within them
>>> values[1]
b'spam'
>>> values[1][0] # bytes string: sequence of ints
115
>>> values[1][1:] # Prints as ASCII characters
b'pam'
>>> bin(values[1][0]) # Can get to bits of bytes in strings
'0b1110011'
>>> bin(values[1][0] | 0b1100) # Turn bits on
'0b1111111'
>>> values[1][0] | 0b1100
127
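Note that the expressions above compute new integers without changing the bytes string itself, because bytes objects are immutable. When a byte must be changed in place, one option is to copy the data into a bytearray first. The following is a small sketch of that idea; the variable names are illustrative only:

```python
text = b'spam'                  # same value as values[1] above

codes = list(text)              # each item is a small integer (0..255)
print(codes)

buffer = bytearray(text)        # mutable copy of the same bytes
buffer[0] |= 0b1100             # turn bits on in the first byte, in place
changed = bytes(buffer)         # convert back to an immutable bytes

print(buffer[0], changed)
print(text)                     # the original bytes object is unchanged
```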
Of course, most Python programmers don’t deal with binary bits; Python has higher-level object types, like lists and dictionaries, that are generally a better choice for representing information in Python scripts. However, if you must use or produce lower-level data used by C programs, networking libraries, or other interfaces, Python has tools to assist.