Other Ways to Code Strings

So far, we’ve looked at the string object’s sequence operations and type-specific methods. Python also provides a variety of ways for us to code strings, which we’ll explore in greater depth later. For instance, special characters can be represented as backslash escape sequences:

>>> S = 'A\nB\tC'            # \n is end-of-line, \t is tab
>>> len(S)                   # Each stands for just one character
5

>>> ord('\n')                # \n is a byte with the binary value 10 in ASCII
10

>>> S = 'A\0B\0C'            # \0, a binary zero byte, does not terminate string
>>> len(S)
5

广告:个人专属 VPN,独立 IP,无限流量,多机房切换,还可以屏蔽广告和恶意软件,每月最低仅 5 美元

Python allows strings to be enclosed in single or double quote characters (they mean the same thing). It also allows multiline string literals enclosed in triple quotes (single or double)—when this form is used, all the lines are concatenated together, and end-of-line characters are added where line breaks appear. This is a minor syntactic convenience, but it’s useful for embedding things like HTML and XML code in a Python script:

>>> msg = """ aaaaaaaaaaaaa
bbb'''bbbbbbbbbb""bbbbbbb'bbbb
cccccccccccccc"""
>>> msg
'\naaaaaaaaaaaaa\nbbb\'\'\'bbbbbbbbbb""bbbbbbb\'bbbb\ncccccccccccccc'

Python also supports a raw string literal that turns off the backslash escape mechanism (such string literals start with the letter r), as well as Unicode string support that supports internationalization. In 3.0, the basic str string type handles Unicode too (which makes sense, given that ASCII text is a simple kind of Unicode), and a bytes type represents raw byte strings; in 2.6, Unicode is a separate type, and str handles both 8-bit strings and binary data. Files are also changed in 3.0 to return and accept str for text and bytes for binary data. We’ll meet all these special string forms in later chapters.