Converting Encodings
So far, we’ve been encoding and decoding strings to inspect their structure. More generally, we can always convert a string to an encoding other than the source character set default, as long as we provide an explicit encoding name both when encoding and when decoding:
>>> S = 'AÄBèC'
>>> S
'AÄBèC'
>>> S.encode() # Default utf-8 encoding
b'A\xc3\x84B\xc3\xa8C'
>>> T = S.encode('cp500') # Convert to EBCDIC
>>> T
b'\xc1c\xc2T\xc3'
>>> U = T.decode('cp500') # Convert back to Unicode
>>> U
'AÄBèC'
>>> U.encode() # Default utf-8 encoding again
b'A\xc3\x84B\xc3\xa8C'
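The same round trip can be wrapped in a small helper that converts bytes directly from one encoding to another. This is only a minimal sketch: the function name transcode and its parameter names are made up for illustration and are not part of Python's string API. It simply chains the decode and encode calls used above.

```python
def transcode(data, source, target):
    """Re-encode bytes: decode from `source`, then encode to `target`.

    Hypothetical helper for illustration; it only chains the standard
    bytes.decode and str.encode methods.
    """
    return data.decode(source).encode(target)

ebcdic = 'AÄBèC'.encode('cp500')                 # Same EBCDIC bytes as T above
utf8 = transcode(ebcdic, 'cp500', 'utf-8')       # EBCDIC bytes to UTF-8 bytes
latin1 = transcode(ebcdic, 'cp500', 'latin-1')   # EBCDIC bytes to Latin-1 bytes

print(utf8)
print(latin1)
```

Note that the conversion always passes through a decoded str in the middle; there is no direct bytes-to-bytes shortcut in this sketch, and an encoding that cannot represent a character would raise UnicodeEncodeError.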
Keep in mind that the special Unicode and hex character escapes are only necessary when you code non-ASCII Unicode strings manually. In practice, you’ll often load such text from files instead. As we’ll see later in this chapter, Python 3.0’s file object (created with the open built-in function) automatically decodes text as it is read and encodes it as it is written; because of this, your script can often deal with strings generically, without having to code special characters directly.
Later in this chapter we’ll also see that it’s possible to convert between encodings when transferring strings to and from files, using a technique very similar to that in the last example; although you’ll still need to provide explicit encoding names when opening a file, the file interface does most of the conversion work for you automatically.
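As a rough preview of that file-based technique, the sketch below writes the same text to a file as EBCDIC, then copies it to a second file as UTF-8 by naming a different encoding in each open call. The file names and the temporary directory are assumptions chosen only to keep the example self-contained; the text never contains a line break, so newline translation does not come into play.

```python
import os
import tempfile

text = 'AÄBèC'

# Hypothetical file names inside a throwaway directory
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, 'ebcdic.txt')
dst_path = os.path.join(workdir, 'utf8.txt')

# Writing in text mode encodes the str to cp500 bytes on the way out
with open(src_path, 'w', encoding='cp500') as f:
    f.write(text)

# Reading decodes from cp500; writing re-encodes to utf-8
with open(src_path, 'r', encoding='cp500') as src, \
     open(dst_path, 'w', encoding='utf-8') as dst:
    dst.write(src.read())

# Binary mode skips decoding and shows the raw bytes stored in each file
with open(src_path, 'rb') as f:
    raw_src = f.read()
with open(dst_path, 'rb') as f:
    raw_dst = f.read()

print(raw_src)
print(raw_dst)
```

The script itself only ever handles ordinary str objects; the encoding names passed to open are the only place where the two byte formats are mentioned.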
