Regular expressions are expressions used to express a set of strings concisely. This article mainly shares with you the detailed knowledge of regular expressions in Python. I hope it can help you.
Operator |
Description |
Example |
. |
represents any single character |
|
[ ] |
character set, the single character value range is |
[abc] means a or b or c; [a-z] means a to z single character |
[^ ] |
Not Character set, single character exclusion range |
[^abc] means non-a or non-b or non-c |
* |
0 or unlimited expansions of the previous character |
abc* means ab, abc, abcc, abccc...
|
+ |
One or unlimited expansion of the previous character |
abc+ means abc, abcc, abccc...
|
? |
0 or 1 expansion of the previous character |
abc? means ab, abc
|
| |
represents any |
abc|def represents abc or def
|
{m} |
M times expansion of the previous character |
ab{2} means abcc
|
{m,n} |
m to n expansions of the previous character (including n) |
ab{1, 2} means abc, abcc
|
##^ | matches the beginning of the string |
^abc represents abc and is at the beginning of a string
|
$ | matches the end of the string |
abc$ represents abc And at the end of a string
|
( ) | grouping mark, only the | operator can be used internally |
(abc|def ) represents abc or def
|
\d | number, equivalent to [0-9] |
|
\w | word characters, equivalent to [A-Za-z0-9_] |
|
#If you are familiar with the above operators, the following example is not difficult.
1. Only numbers can be entered: ^[0-9]*$
2. Only n-digit numbers can be entered: ^\d{n}$
3. Only numbers with at least n digits can be entered: ^\d{n,}$
4. Only numbers with m~n digits can be entered: ^\d{m,n}$
5. Only numbers starting with zero and non-zero can be entered: ^(0|[1-9][0-9]*)$
6. Only positive real numbers with two decimal places can be entered :^[0-9]+(.[0-9]{2})?$
7. Only positive real numbers with 1 to 3 decimal places can be entered: ^[0-9]+( .[0-9]{1,3})?$
8. Only non-zero positive integers can be entered: ^+?[1-9][0-9]*$
【Python3 Regular Expression】
Function | Description |
re.match() | Matches a pattern from the starting position of the string. If the starting position is not matched successfully, match() returns none. |
re.search() | Scan the entire string and return the first successful match. |
re.sub() | is used to replace all substrings matching the regular expression in the string and return the replaced string |
re.findall() | Search for a string and return all matching substrings in list form |
re.split() | Cut the string according to the regular expression matching results and return a list |
re.finditer() | Search for the string and return a matching result The iteration type, each iteration element is a match object |
>>> match= re.findall(r'[1-9]\d{5}','100081BIT BIT10008676')>>> print(match)
['100081', '100086']>>> match = re.split(r'[1-9]\d{5}','100081BIT BIT10008676')>>> match
['', 'BIT BIT', '76']>>> match = re.split(r'[1-9]\d{5}','100081BIT BIT10008676',maxsplit=1)>>> match
['', 'BIT BIT10008676']
>>>for m in re.finditer(r'[1-9]\d{5}','100081BIT BIT10008676'): if m:
print(m.group(0))
100081100086
The difference between re.match and re.search
re.match only matches the beginning of the string. If the beginning of the string does not match the regular expression, the match fails and the function returns None; while re. search matches the entire string until a match is found.
Operator |
Description |
Example |
. |
represents any single character |
|
[ ] |
Character set, single character value range |
[abc] represents a or b or c; [a-z] represents a to z single character |
[^ ] |
Non-character set, single character exclusion range |
[^abc] means non-a or non-b or non-c |
* |
0 or unlimited expansion of the previous character |
abc* means ab, abc, abcc, abccc...
|
+ |
One or unlimited expansion of the previous character |
abc+ means abc , abcc, abccc...
|
##? | 0 or 1 expansion of the previous character |
abc? represents ab, abc
|
##|
represents any |
| abc|def represents abc or def
|
{m}
m times expansion of the previous character |
| ab{ 2} means abcc
|
{m,n}
m to n expansions of the previous character (including n) |
| ab{1,2} means abc, abcc
| ##^
matches the beginning of the string |
^abc | represents abc and is at the beginning of a string
| $
matches the end of the string |
abc$ | means abc and is at the end of a string
| ( )
Grouping mark, only the | operator# can be used internally | ##(abc|def) represents | abc or def
\d |
number, which is equivalent to [0-9]
|
|
##\w
| word character, equivalent to [A-Za-z0-9_]
|
|
If you are familiar with the above operators, the following example is not difficult. |
1. Only numbers can be entered: ^[0-9]*$
2. Only n-digit numbers can be entered: ^\d{n}$ 3. Only numbers with at least n digits can be entered: ^\d{n,}$ 4. Only numbers with m~n digits can be entered: ^\d{m,n}$5. Only numbers starting with zero and non-zero can be entered: ^(0|[1-9][0-9]*)$6. Only positive real numbers with two decimal places can be entered :^[0-9]+(.[0-9]{2})?$7. Only positive real numbers with 1 to 3 decimal places can be entered: ^[0-9]+( .[0-9]{1,3})?$8. Only non-zero positive integers can be entered: ^+?[1-9][0-9]*$
【Python3 Regular Expression】
Function
Description
|
| re.match()
Matches a pattern from the starting position of the string. If the starting position is not matched successfully, match() returns none.
| re.search() | Scan the entire string and return the first successful match.
| re.sub() | is used to replace all substrings matching the regular expression in the string and return the replaced string
| re.findall() | Search for a string and return all matching substrings in list form
| re.split() | Cut the string according to the regular expression matching results and return a list
| re.finditer() | Search for the string and return a matching result Iteration type, each iteration element is a match object
| >>> match= re.findall(r'[1-9]\d{5}','100081BIT BIT10008676')>>> print(match)
['100081', '100086']>>> match = re.split(r'[1-9]\d{5}','100081BIT BIT10008676')>>> match
['', 'BIT BIT', '76']>>> match = re.split(r'[1-9]\d{5}','100081BIT BIT10008676',maxsplit=1)>>> match
['', 'BIT BIT10008676']
>>>for m in re.finditer(r'[1-9]\d{5}','100081BIT BIT10008676'): if m:
print(m.group(0))
100081100086
The difference between re.match and re.search |
re.match only matches strings At the beginning, if the string does not match the regular expression at the beginning, the match fails and the function returns None; while re.search matches the entire string until a match is found.
Related recommendations:
Detailed explanation of js regular expressions
php regular expressions Detailed explanation of expressions_PHP tutorial
Very important detailed explanation of php regular expressions, detailed explanation of php regular expressions
The above is the detailed content of Detailed explanation of regular expressions in Python. For more information, please follow other related articles on the PHP Chinese website!