The re Pattern Matching Module
Python’s re pattern-matching module supports text processing that is more general than simple string method calls such as find, split, and replace allow. With re, the strings that designate search and split targets can be described by general patterns instead of fixed text. In 3.0, this module has been generalized to work on objects of any string type (str, bytes, and bytearray) and returns result substrings of the same type as the subject string.
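As a minimal sketch of the difference between fixed text and a general pattern, the following compares the str split method with re.split. The sample line and the variable names are made up for illustration only:

```python
import re

line = 'spam, eggs;ham  toast'          # hypothetical sample text

# str.split divides only on one fixed delimiter string
fixed = line.split(', ')

# re.split divides on a pattern: here, one or more characters
# drawn from the set of comma, semicolon, and whitespace
general = re.split(r'[,;\s]+', line)

print(fixed)       # ['spam', 'eggs;ham  toast']
print(general)     # ['spam', 'eggs', 'ham', 'toast']
```

The fixed delimiter finds just one split point, while the pattern treats every run of separators the same way.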
Here it is at work in 3.0, extracting substrings from a line of text. Within pattern strings, (.*) means any character (.), zero or more times (*), saved away as a matched substring (()). Parts of the string matched by the parts of a pattern enclosed in parentheses are available after a successful match, via the group or groups method:
C:\misc> c:\python30\python
>>> import re
>>> S = 'Bugger all down here on earth!' # Line of text
>>> B = b'Bugger all down here on earth!' # Usually from a file
>>> re.match('(.*) down (.*) on (.*)', S).groups() # Match line to pattern
('Bugger all', 'here', 'earth!') # Matched substrings
>>> re.match(b'(.*) down (.*) on (.*)', B).groups() # bytes substrings
(b'Bugger all', b'here', b'earth!')
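To make the difference between group and groups concrete, here is a short sketch that reuses the same line of text. The names match, whole, first, parts, and failed are illustrative only. Because re.match returns None when the pattern does not match, it is usually safer to test the result before fetching substrings:

```python
import re

S = 'Bugger all down here on earth!'
match = re.match('(.*) down (.*) on (.*)', S)

if match:                        # None (false) if the match failed
    whole = match.group(0)       # Entire matched text
    first = match.group(1)       # First parenthesized part only
    parts = match.groups()       # All parenthesized parts, as a tuple
    print(whole)                 # Bugger all down here on earth!
    print(first)                 # Bugger all
    print(parts)                 # ('Bugger all', 'here', 'earth!')

failed = re.match('(.*) up (.*)', S)    # No ' up ' in S
print(failed)                           # None
```

Calling groups directly on the result of a failed match raises an AttributeError, because None has no such method.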
In Python 2.6, results are similar, but the unicode type is used for non-ASCII text, and str handles both 8-bit text and binary data:
C:\misc> c:\python26\python
>>> import re
>>> S = 'Bugger all down here on earth!' # Simple text and binary
>>> U = u'Bugger all down here on earth!' # Unicode text
>>> re.match('(.*) down (.*) on (.*)', S).groups()
('Bugger all', 'here', 'earth!')
>>> re.match('(.*) down (.*) on (.*)', U).groups()
(u'Bugger all', u'here', u'earth!')
Since bytes and str support essentially the same operation sets, this type distinction is largely transparent. Note, however, that as in other APIs, you can’t mix str and bytes types in the arguments of re calls in 3.0 (although if you don’t plan to do pattern matching on binary data, you probably don’t need to care):
C:\misc> c:\python30\python
>>> import re
>>> S = 'Bugger all down here on earth!'
>>> B = b'Bugger all down here on earth!'
>>> re.match('(.*) down (.*) on (.*)', B).groups()
TypeError: can't use a string pattern on a bytes-like object
>>> re.match(b'(.*) down (.*) on (.*)', S).groups()
TypeError: can't use a bytes pattern on a string-like object
>>> re.match(b'(.*) down (.*) on (.*)', bytearray(B)).groups()
(bytearray(b'Bugger all'), bytearray(b'here'), bytearray(b'earth!'))
>>> re.match('(.*) down (.*) on (.*)', bytearray(B)).groups()
TypeError: can't use a string pattern on a bytes-like object
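When the pattern and the subject are of different types, one way around the error is to convert one of them first. The sketch below assumes the bytes hold UTF-8 encoded text, which is not something the re module can verify for you. The names pattern, as_text, and as_bytes are illustrative:

```python
import re

B = b'Bugger all down here on earth!'
pattern = '(.*) down (.*) on (.*)'

# Mixing a str pattern with a bytes subject is rejected
try:
    re.match(pattern, B)
    mixed_ok = True
except TypeError as exc:
    mixed_ok = False
    print('mixed types rejected:', exc)

# Option 1: decode the bytes to str, then match with the str pattern
# (assumes the data is UTF-8 encoded text)
as_text = re.match(pattern, B.decode('utf-8')).groups()

# Option 2: encode the pattern to bytes, then match the bytes directly
as_bytes = re.match(pattern.encode('utf-8'), B).groups()

print(as_text)     # ('Bugger all', 'here', 'earth!')
print(as_bytes)    # (b'Bugger all', b'here', b'earth!')
```

Decoding is probably the better choice when the data is really text, while encoding the pattern keeps the results as bytes for truly binary data.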
