re.match function
Syntax:
re.match(pattern, string, flags=0)
re.match attempts to match a pattern at the starting position of the string. If the pattern matches there, it returns a match object; if it does not match at the starting position, match() returns None.
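As a minimal sketch of this behavior (the sample string and variable names are made up for illustration), re.match succeeds only when the pattern matches at position 0:

```python
import re

text = "www.example.com"  # hypothetical sample string

# The pattern matches at position 0, so a match object is returned.
at_start = re.match("www", text)
print(at_start.span())   # (0, 3)

# "com" occurs in the string, but not at the start, so the result is None.
not_at_start = re.match("com", text)
print(not_at_start)      # None
```

Because the result may be None, it is usually safest to check it before calling methods such as group() or span().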
re.search function
Syntax:
re.search(pattern, string, flags=0)
re.search scans the entire string and returns the first successful match. If no position in the string matches, it returns None.
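A short sketch of re.search, assuming a made-up sample string: the match can start anywhere, but only the first one is reported.

```python
import re

text = "order 66, aisle 7"  # hypothetical sample string

# The first run of digits is found even though it is not at the start.
first = re.search(r"\d+", text)
print(first.group(0))  # 66
print(first.span())    # (6, 8)

# Only the first match is returned; the later "7" is not reported.
# When nothing matches, re.search returns None.
nothing = re.search(r"\d+", "no digits here")
print(nothing)         # None
```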
re.match and re.search take the same parameters, described below:
Parameter | Description |
pattern | The regular expression to match |
string | The string to match against |
flags | Optional flags that control how the regular expression is matched, such as whether matching is case-sensitive |
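To illustrate the flags parameter, here is a small sketch using re.IGNORECASE and re.DOTALL from the standard re module. The sample strings are assumptions chosen for the example.

```python
import re

text = "Hello World"  # hypothetical sample string

# Without flags the match is case-sensitive, so "hello" does not match "Hello".
strict = re.match("hello", text)
print(strict)            # None

# re.IGNORECASE (short form re.I) makes the match case-insensitive.
relaxed = re.match("hello", text, flags=re.IGNORECASE)
print(relaxed.group(0))  # Hello

# Flags can be combined with the bitwise OR operator.
# re.S (re.DOTALL) lets "." match a newline as well.
combined = re.search("hello.world", "HELLO\nWORLD", flags=re.I | re.S)
print(repr(combined.group(0)))  # 'HELLO\nWORLD'
```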
So what is the difference between them?
re.match only tries to match at the beginning of the string. If the beginning of the string does not match the regular expression, the match fails and the function returns None. re.search, by contrast, scans through the string until it finds a match, and returns None only if no position matches.
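One place where this difference shows up clearly is multi-line input. The sketch below uses an invented two-line string: even with re.MULTILINE, re.match only tries position 0, while re.search can find a match at the start of a later line.

```python
import re

text = "first line\nsecond line"  # hypothetical sample string

# With re.MULTILINE, "^" in re.search can match at the start of any line.
found = re.search("^second", text, flags=re.MULTILINE)
print(found.group(0))  # second

# re.match still only tries the very beginning of the string.
missed = re.match("second", text, flags=re.MULTILINE)
print(missed)          # None
```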
re.match and re.search are covered in detail in many places online, but in my own work I still prefer re.findall.
Compare the following examples. re.search and re.findall also differ in how they handle multiple groups. Read the comments and compare them with the output:
Example:
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

# Extract the address of an image
import re

a = '<img src="https://s-media-cache-ak0.pinimg.com/originals/a8/c4/9e/a8c49ef606e0e1f3ee39a7b219b5c05e.jpg">'

# Using re.search
search = re.search('<img src="(.*)">', a)
# group(0) is the entire match
print(search.group(0))
# group(1) is the first captured group (the image address)
print(search.group(1))

# Using re.findall
findall = re.findall('<img src="(.*)">', a)
print(findall)

# Using multiple groups (for example, to extract both the img tag name and the image address)
re_search = re.search('<(.*) src="(.*)">', a)
# Print img
print(re_search.group(1))
# Print the image address
print(re_search.group(2))
# Print img and the image address as a tuple
print(re_search.group(1, 2))
# Or use groups()
print(re_search.groups())
Output results:
<img src="https://s-media-cache-ak0.pinimg.com/originals/a8/c4/9e/a8c49ef606e0e1f3ee39a7b219b5c05e.jpg">
https://s-media-cache-ak0.pinimg.com/originals/a8/c4/9e/a8c49ef606e0e1f3ee39a7b219b5c05e.jpg
['https://s-media-cache-ak0.pinimg.com/originals/a8/c4/9e/a8c49ef606e0e1f3ee39a7b219b5c05e.jpg']
img
https://s-media-cache-ak0.pinimg.com/originals/a8/c4/9e/a8c49ef606e0e1f3ee39a7b219b5c05e.jpg
('img', 'https://s-media-cache-ak0.pinimg.com/originals/a8/c4/9e/a8c49ef606e0e1f3ee39a7b219b5c05e.jpg')
('img', 'https://s-media-cache-ak0.pinimg.com/originals/a8/c4/9e/a8c49ef606e0e1f3ee39a7b219b5c05e.jpg')
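The example above contains a single tag. As a hedged sketch of how the two functions differ when there are several matches and several groups, the snippet below uses a made-up HTML string with placeholder URLs. It also swaps the greedy (.*) for a non-greedy (.*?), an assumption made here so that each match stays inside its own tag.

```python
import re

# Hypothetical HTML snippet with two tags; the URLs are placeholders.
html = '<img src="https://example.com/a.jpg"><img src="https://example.com/b.jpg">'

# One group: re.findall returns a list of strings, one per match.
urls = re.findall(r'<img src="(.*?)">', html)
print(urls)
# ['https://example.com/a.jpg', 'https://example.com/b.jpg']

# Two groups: re.findall returns a list of tuples, one tuple per match.
pairs = re.findall(r'<(\w+) src="(.*?)">', html)
print(pairs)
# [('img', 'https://example.com/a.jpg'), ('img', 'https://example.com/b.jpg')]

# re.search with the same pattern reports only the first match.
first_pair = re.search(r'<(\w+) src="(.*?)">', html).groups()
print(first_pair)
# ('img', 'https://example.com/a.jpg')
```

With the greedy (.*) from the original example, the first pattern would run from the first opening quote to the last closing quote and return one long string instead of two addresses.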
Finally, regular expressions are a very powerful tool. They are typically used to solve problems that the built-in string methods cannot handle, and they are available in most programming languages. Python has many uses, but regular expressions are indispensable for both web crawling and data analysis, so they are well worth learning as part of Python. To close, here are some common regular expressions, along with documentation on the regular expression metacharacters and syntax that Python supports.