Coding Non-ASCII Text

To code non-ASCII characters, you may use hex or Unicode escapes in your strings. Hex escapes (\xNN) are limited to a single byte’s value, while Unicode escapes can name characters with code values two bytes (\uNNNN) or four bytes (\UNNNNNNNN) wide. The hex values 0xC4 and 0xE8, for instance, are codes for two accented characters outside the 7-bit range of ASCII, but we can embed them in 3.0 str objects because str supports Unicode today:

>>> chr(0xc4)            # 0xC4, 0xE8: characters outside ASCII's range
'Ä'
>>> chr(0xe8)
'è'

>>> S = '\xc4\xe8'       # Single byte 8-bit hex escapes
>>> S
'Äè'

>>> S = '\u00c4\u00e8'   # 16-bit Unicode escapes
>>> S
'Äè'
>>> len(S)               # 2 characters long (not number of bytes!)
2
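
The session above shows only the 8-bit and 16-bit escape forms. As a minimal sketch, assuming Python 3.X, the following script adds the 32-bit \U form mentioned earlier and contrasts the string's character count with its size in bytes once encoded. The variable names are illustrative only, and the two encodings used (Latin-1 and UTF-8) are chosen just for comparison:

```python
# The same two characters written with each escape form
s_hex = '\xc4\xe8'                   # 8-bit hex escapes
s_u16 = '\u00c4\u00e8'               # 16-bit Unicode escapes
s_u32 = '\U000000c4\U000000e8'       # 32-bit Unicode escapes

# All three literals produce the same str object value
all_equal = (s_hex == s_u16 == s_u32)
print(all_equal)                     # True

# len() counts characters (code points), not bytes
n_chars = len(s_hex)
print(n_chars)                       # 2

# Byte counts depend on the encoding chosen
latin1_bytes = s_hex.encode('latin-1')   # 1 byte per character here
utf8_bytes = s_hex.encode('utf-8')       # 2 bytes per character here
print(len(latin1_bytes), latin1_bytes)   # 2 b'\xc4\xe8'
print(len(utf8_bytes), utf8_bytes)       # 4 b'\xc3\x84\xc3\xa8'
```

The escape form affects only how the literal is typed in source code. The resulting string is identical in each case, and its size in bytes is decided only when it is encoded.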
