Python regular expression syntax covers matching specific characters, repetition, alternation, grouping and backreferences, predefined character classes, boundary conditions, and greedy versus non-greedy matching. In brief, for matching specific characters: . matches any character except a newline; ^ matches the beginning of the input string; $ matches the end of the input string; \d matches any digit, equivalent to [0-9] for ASCII text; \D matches any non-digit character, equivalent to [^0-9]; \s matches any whitespace character (spaces, tabs, form feeds, and so on).
Environment for this tutorial: Windows 10, Dell G3 computer.
Regular expressions in Python, available through the built-in re module, are a powerful text-processing tool for matching, searching, replacing, or splitting strings by pattern. Here are the most common regular expression constructs in Python:
Matching specific characters:
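- .: Matches any single character except a newline (unless the re.DOTALL flag is set).
- ^: Matches the beginning of the input string (and, with re.MULTILINE, the beginning of each line).
- $: Matches the end of the input string, or just before a newline at the very end of the string.
- \d: Matches any single digit. For ASCII text this is equivalent to [0-9]; with str patterns in Python 3 it also matches other Unicode decimal digits unless re.ASCII is set.
- \D: Matches any single non-digit character, equivalent to [^0-9] for ASCII text.
- \s: Matches any single whitespace character (space, tab, newline, form feed, and so on).
- \S: Matches any single non-whitespace character.
- \w: Matches any single letter, digit, or underscore. For ASCII text this is equivalent to [a-zA-Z0-9_]; with str patterns in Python 3 it also matches Unicode letters and digits unless re.ASCII is set.
- \W: Matches any single character that is not a letter, digit, or underscore, equivalent to [^a-zA-Z0-9_] for ASCII text.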
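The character classes above can be tried out with re.findall. The following is a minimal sketch; the sample string and variable names are made up for illustration.

```python
import re

# Hypothetical sample text: letters, digits, a space, an underscore, a newline.
sample = "a1 b_2\n"

digits = re.findall(r"\d", sample)        # every single digit
non_digits = re.findall(r"\D", sample)    # everything that is not a digit
whitespace = re.findall(r"\s", sample)    # the space and the newline
word_chars = re.findall(r"\w", sample)    # letters, digits, underscore
any_char = re.findall(r".", sample)       # every character except the newline

# ^ anchors at the start; $ also matches just before a trailing newline.
starts_with_a = re.search(r"^a", sample) is not None
ends_with_2 = re.search(r"2$", sample) is not None

print(digits, word_chars, any_char)
```

Note that each class matches exactly one character per match; repetition is controlled separately by quantifiers.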
Repeating characters:
- *: Matches the preceding subexpression zero or more times.
- +: Matches the preceding subexpression one or more times.
- ?: Matches the preceding subexpression zero or one time.
- {n}: n is a non-negative integer. Matches exactly n times.
- {n,}: n is a non-negative integer. Matches at least n times.
- {n,m}: m and n are both non-negative integers with n <= m. Matches at least n times and at most m times.
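As a quick illustration of the quantifiers listed above, the sketch below checks whole strings with re.fullmatch. The helper name matches and the test strings are assumptions chosen for this example.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text matches pattern (illustrative helper)."""
    return re.fullmatch(pattern, text) is not None

# * allows zero occurrences, + requires at least one, ? allows at most one.
star_zero = matches(r"ab*c", "ac")        # True: zero "b" is allowed
plus_zero = matches(r"ab+c", "ac")        # False: at least one "b" is required
plus_many = matches(r"ab+c", "abbc")      # True
optional_two = matches(r"ab?c", "abbc")   # False: at most one "b"

# Counted repetition with braces.
exactly_four = matches(r"\d{4}", "2024")      # True
too_short = matches(r"\d{4}", "202")          # False
at_least_two = matches(r"\d{2,}", "12345")    # True
two_to_four = matches(r"\d{2,4}", "12345")    # False: five digits is too many

print(star_zero, plus_zero, exactly_four, two_to_four)
```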
Selection, grouping and references:
- |: Means "or". For example, a|b matches 'a' or 'b'.
- ( ): Combines several items into one unit and captures the text they match. For example, (abc) and abc match the same content, but (abc) also stores it as a group. Captured groups can be referenced later in the pattern with \1, \2, \3, and so on.
- \: Escapes special characters. For example, \( matches a literal "(" character instead of opening a group.
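The sketch below shows alternation, capturing groups, a backreference, and escaping in turn. All sample strings and variable names are invented for illustration.

```python
import re

# Alternation: match either word.
animals = re.findall(r"cat|dog", "cat, dog, bird")

# Capturing groups: each pair of parentheses stores what it matched.
m = re.search(r"(\d{4})-(\d{2})", "Released 2024-01")
year, month = m.group(1), m.group(2)

# Backreference: \1 must match the same text that group 1 captured,
# which makes it possible to find a word that is repeated twice in a row.
repeated = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# Escaping: \( and \) match literal parentheses instead of forming a group.
calls = re.findall(r"\(\d+\)", "f(1) and g(22)")

# re.escape builds an escaped pattern from an arbitrary literal string.
found_literal = re.search(re.escape("(a+b)"), "x = (a+b) * 2") is not None

print(animals, year, month, repeated, calls, found_literal)
```

Writing patterns as raw strings (the r prefix) keeps Python from interpreting the backslashes before the re module sees them.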
Predefined patterns:
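- \d or \D: Matches exactly one digit or one non-digit character. Add a quantifier to match more, for example \d+ for one or more digits.
- \s or \S: Matches exactly one whitespace or one non-whitespace character. Add a quantifier to match more, for example \s+ for one or more whitespace characters.
- .: In the re module, . is a special character that matches any character except a newline, so it cannot be used as-is to match a literal period; escape it as \. for that. To match any character including newlines, pass the re.DOTALL flag or use a character class such as [\s\S].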
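To make the points about single-character classes and the dot concrete, here is a small sketch. The strings and variable names are illustrative assumptions.

```python
import re

# \d matches one digit at a time; \d+ matches a whole run of digits.
single_digits = re.findall(r"\d", "2024")
digit_runs = re.findall(r"\d+", "2024")

text = "line1\nline2"

# By default . does not match the newline between the two lines.
default_dot = re.search(r"line1.line2", text)

# Two ways to let the pattern cross a newline.
with_dotall = re.search(r"line1.line2", text, flags=re.DOTALL)
with_class = re.search(r"line1[\s\S]line2", text)

# An unescaped . matches any character; \. matches only a literal period.
unescaped = re.findall(r"\d+.\d+", "v3.11 and 3x11")
escaped = re.findall(r"\d+\.\d+", "v3.11 and 3x11")

print(single_digits, digit_runs, unescaped, escaped)
```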
Boundary conditions:
- ^: Outside square brackets, matches the beginning of the string. As the first character inside square brackets, it negates the character class. For example, ^[0-9] matches a digit at the start of the string, while [^0-9] matches any single character that is not a digit.
- $: Matches the end of the string. To match a literal dollar sign, escape it as \$ or place it inside square brackets, as in [$]. For example, [0-9]$ matches a digit at the end of the string.
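The two roles of ^ and $ can be checked with a short sketch like the one below, where the sample strings and variable names are assumptions made for illustration.

```python
import re

# ^ outside brackets anchors the match at the start of the string.
starts_with_digit = re.search(r"^[0-9]", "7 days") is not None
digit_not_first = re.search(r"^[0-9]", "in 7 days") is not None

# ^ as the first character inside brackets negates the class.
non_digits = re.findall(r"[^0-9]", "a1b2")

# $ anchors the match at the end of the string.
ends_with_digit = re.search(r"[0-9]$", "room 101") is not None

# A literal dollar sign has to be escaped.
prices = re.findall(r"\$\d+", "cost: $15 or $20")

# With re.MULTILINE, ^ also matches at the start of every line.
first_words = re.findall(r"^\w+", "one\ntwo")
line_starts = re.findall(r"^\w+", "one\ntwo", flags=re.MULTILINE)

print(starts_with_digit, non_digits, prices, line_starts)
```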
Greedy and non-greedy matching:
By default, quantifiers are greedy: they match as much text as possible while still allowing the rest of the pattern to succeed. Appending ? to a quantifier (*?, +?, ??, {n,m}?) makes it non-greedy, so it matches as little text as possible. For example, against the string "aaa", a* matches all three "a" characters, whereas a*? matches the empty string, because zero repetitions already satisfy the pattern.
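The difference is easiest to see on text with several possible end points. The following sketch uses a made-up HTML-like string; the variable names are illustrative only.

```python
import re

# Greedy versus non-greedy on the simplest possible input.
greedy_a = re.match(r"a*", "aaa").group()      # takes all three characters
lazy_a = re.match(r"a*?", "aaa").group()       # zero repetitions is enough
lazy_plus = re.match(r"a+?", "aaa").group()    # one repetition is the minimum

html = "<b>bold</b> and <i>italic</i>"

# Greedy .* runs to the last ">" in the string, giving one long match.
greedy_tags = re.findall(r"<.*>", html)

# Non-greedy .*? stops at the first ">" it can, giving each tag separately.
lazy_tags = re.findall(r"<.*?>", html)

print(greedy_tags)
print(lazy_tags)
```

A non-greedy quantifier still has to let the overall pattern succeed, so it expands as far as needed to reach the next required character.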