Literals and Basic Properties
Python 3.0 string objects originate when you call a built-in function such as str or bytes, process a file created by calling open (described in the next section), or code literal syntax in your script. For the latter, a new literal form, b'xxx' (and equivalently, B'xxx') is used to create bytes objects in 3.0, and bytearray objects may be created by calling the bytearray function, with a variety of possible arguments.
More formally, in 3.0 all the current string literal forms ('xxx', "xxx", and triple-quoted blocks) generate a str; adding a b or B just before any of them creates a bytes instead. This new b'...' bytes literal is similar in form to the r'...' raw string used to suppress backslash escapes. Consider the following, run in 3.0:
C:\misc> c:\python30\python
>>> B = b'spam' # Make a bytes object (8-bit bytes)
>>> S = 'eggs' # Make a str object (Unicode characters, 8-bit or wider)
>>> type(B), type(S)
(<class 'bytes'>, <class 'str'>)
>>> B # Prints as a character string, really sequence of ints
b'spam'
>>> S
'eggs'
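Besides the literal form, the built-in functions mentioned above can build the same objects. The following is a minimal sketch, assuming a Python 3.X interpreter; the variable names are illustrative only, and the 'ascii' encoding name is just one possible choice:

```python
# Several ways to build the same bytes object, plus a mutable bytearray.
from_literal = b'spam'                    # bytes literal
from_text = bytes('spam', 'ascii')        # str plus an explicit encoding name
from_ints = bytes([115, 112, 97, 109])    # iterable of ints in the range 0..255
from_encode = 'spam'.encode('ascii')      # str method that returns bytes

mutable = bytearray(b'spam')              # bytearray built from a bytes object

print(from_literal == from_text == from_ints == from_encode)   # True
print(type(mutable), mutable)
```

Note that calling bytes on a str without naming an encoding raises a TypeError, so the encoding argument is required in that form.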
The bytes object is actually a sequence of small integers, each in the range 0 to 255, though it prints its content as characters whenever possible:
>>> B[0], S[0] # Indexing returns an int for bytes, str for str
(115, 'e')
>>> B[1:], S[1:] # Slicing makes another bytes or str object
(b'pam', 'ggs')
>>> list(B), list(S) # bytes is really ints
([115, 112, 97, 109], ['e', 'g', 'g', 's'])
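The difference between indexing and slicing a bytes object is easy to trip over, so here is a small sketch of moving between the integer and character views. It assumes Python 3.X, and the names are illustrative rather than part of the session above:

```python
data = b'spam'

first_item = data[0]          # indexing yields an int
first_slice = data[0:1]       # a one-item slice yields a bytes object
as_char = chr(data[0])        # int converted to a one-character str
rebuilt = bytes([data[0]])    # int wrapped in a list converts back to bytes

print(first_item, first_slice, as_char, rebuilt)
print([hex(byte) for byte in data])   # iteration also yields ints
```

Slicing with data[0:1] is a common way to get a one-byte bytes object when an int is not what you want.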
The bytes object is immutable, just like str (though bytearray, described later, is not); you cannot assign a str, bytes, or integer to an offset of a bytes object. The bytes prefix also works for any string literal form:
>>> B[0] = 'x' # Both are immutable
TypeError: 'bytes' object does not support item assignment
>>> S[0] = 'x'
TypeError: 'str' object does not support item assignment
>>> B = B""" # bytes prefix works on single, double, triple quotes
... xxxx
... yyyy
... """
>>> B
b'\nxxxx\nyyyy\n'
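To contrast the immutability rule with the mutable bytearray type, the following sketch catches the error instead of letting it end the script. It assumes Python 3.X; the names frozen, changeable, and error_text are hypothetical:

```python
frozen = b'spam'
error_text = None
try:
    frozen[0] = 120               # even an in-range int is rejected for bytes
except TypeError as exc:
    error_text = str(exc)

changeable = bytearray(b'spam')
changeable[0] = ord('x')          # bytearray accepts an int in the range 0..255

print(error_text)
print(changeable)
```

Assigning a str such as 'x' to a bytearray offset fails too, because items must be integers; ord('x') supplies the integer value here.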
As mentioned earlier, in Python 2.6 the b'xxx' literal is present for compatibility but is the same as 'xxx' and makes a str, and bytes is just a synonym for str; as you’ve seen, in 3.0 both of these address the distinct bytes type. Also note that the u'xxx' and U'xxx' Unicode string literal forms in 2.6 are gone in 3.0; use 'xxx' instead, since all strings are Unicode, even if they contain all ASCII characters (more on writing non-ASCII Unicode text in the section Coding Non-ASCII Text).
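As a rough illustration of the 3.X model, the sketch below shows that a plain 'xxx' literal is a Unicode str even when its content is all ASCII, and that it is a different type from the b'xxx' form. It assumes Python 3.X, and the encoding names are just sample choices:

```python
text = 'spam'                     # Unicode str, even though all ASCII
data = b'spam'                    # distinct bytes type

print(type(text), type(data))
print(text.encode('ascii') == data)   # str -> bytes requires encoding
print(data.decode('ascii') == text)   # bytes -> str requires decoding

wide = 'sp\xe4m'                  # one non-ASCII character in the str
print(len(wide), len(wide.encode('utf-8')))   # characters versus encoded bytes
```

The last line prints different lengths because the single non-ASCII character occupies two bytes when encoded as UTF-8.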