Seven Python regular expression usage examples
As a concept, regular expressions are not unique to Python. However, there are still some minor differences in the actual use of regular expressions in Python.
This article is part of a series of articles about Python regular expressions. In this first article in this series, we will focus on how to use regular expressions in Python and highlight some of the unique features of Python.
We will introduce some methods of searching and finding strings in Python. Then we'll discuss how to use grouping to process the sub-items of the matching objects we find.
The module for regular expressions in Python is called 're'.
1. Raw strings in Python
The Python compiler uses '\' (backslash) as the escape character in string constants.
If the backslash is followed by a character sequence that the compiler recognizes, the entire escape sequence is replaced with the corresponding special character (for example, '\n' is replaced by the compiler with a newline character).
But this poses a problem for using regular expressions in Python, because the 're' module also uses backslashes to escape special characters in regular expressions (such as * and +).
The mixture of the two means that sometimes you have to escape the escape character itself (when the special character is recognized by both the Python compiler and the regular expression compiler), but at other times this is not necessary (when the special character is recognized only by the Python compiler).
Instead of focusing on figuring out how many backslashes are needed, we can use raw strings instead.
Raw strings can be created simply by adding an 'r' character before the opening quote of an ordinary string. When a string is raw, the Python compiler does not attempt any substitutions in it. Essentially, you are telling the compiler not to interfere with your string at all.
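As a minimal sketch of the difference, the snippet below compares an ordinary string with a raw string, and then matches a literal backslash both ways. The variable names and the sample path are made up for illustration.

```python
import re

# Ordinary string: Python replaces "\n" with a single newline character.
normal = "a\nb"
# Raw string: the backslash and the "n" stay as two separate characters.
raw = r"a\nb"

print(len(normal))  # 3
print(len(raw))     # 4

# To match one literal backslash, the regular expression itself needs "\\".
# An ordinary string literal therefore needs four backslashes,
# while a raw string needs only two.
path = r"C:\temp"  # hypothetical sample text containing one backslash
with_escapes = re.search("\\\\", path)
with_raw = re.search(r"\\", path)

print(with_escapes.span() == with_raw.span())  # True
```

Both searches find the same backslash; the raw string version is simply easier to read.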
Using regular expressions to search in Python
The 're' module provides several methods for performing precise queries on input strings. The methods we will discuss are re.match(), re.search(), and re.findall().
Each method accepts a regular expression and a string to search for a match. Let's look at each of these methods in more detail to understand how they work and how they differ.
2. Use re.match to search – matching at the start
Let’s take a look at the match() method first. The way the match() method works is that it only finds a match if the beginning of the string being searched matches the pattern.
For example, if you call the match() method on the string 'dog cat dog', the search pattern 'dog' will match:
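A short sketch of that call, using the example string from the text (the variable names are illustrative):

```python
import re

text = 'dog cat dog'

# 'dog' appears at the very beginning of the string, so match() succeeds.
match = re.match(r'dog', text)
print(match.group(0))  # dog
```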
We will discuss the group() method in more detail later. For now, we just need to know that we called it with 0 as its argument, and that the group() method returns the matched text.
I have skipped the returned match object for now (older Python versions display it as SRE_Match, newer ones as re.Match); we will discuss it soon.
However, if we call the match() method on the same string, looking for the pattern 'cat', no match will be found.
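The failing case can be sketched as follows. When nothing matches, match() returns None rather than a match object, so it is worth checking the result before calling group() on it.

```python
import re

text = 'dog cat dog'

# 'cat' is in the string, but not at the beginning, so match() finds nothing.
result = re.match(r'cat', text)
print(result)  # None

if result is None:
    print('no match at the start of the string')
```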
3. Use re.search to find – match at any position
The search() method is similar to match(), but the search() method does not restrict us to looking for matches only at the beginning of the string, so looking for 'cat' in our example string will find a match:
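For illustration, the same pattern that failed with match() succeeds with search():

```python
import re

text = 'dog cat dog'

# search() scans the whole string, so 'cat' in the middle is found.
match = re.search(r'cat', text)
print(match.group(0))  # cat
```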
However, the search() method stops as soon as it finds a match, so searching for 'dog' with the search() method in our example string will find only its first occurrence.
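A small sketch of this behavior. It uses the start() method of the match object, covered in section 5, only to show which occurrence was found.

```python
import re

text = 'dog cat dog'

# There are two occurrences of 'dog', but search() returns only the first.
match = re.search(r'dog', text)
print(match.group(0))  # dog
print(match.start())   # 0
```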
4. Use re.findall - all matches
The search method I have used most in Python so far is findall(). When we call the findall() method, we simply get a list of all the matched substrings instead of a match object (we will discuss the match object more next). For me, that is simpler. Calling the findall() method on the example string, we get:
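A minimal sketch of that call on the example string:

```python
import re

text = 'dog cat dog'

# findall() returns every non-overlapping match as a list of strings.
matches = re.findall(r'dog', text)
print(matches)  # ['dog', 'dog']
```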
5. Use the match.start and match.end methods
So, what is the "match object" previously returned to us by the search() and match() methods?
Rather than simply returning the matched part of the string, search() and match() return a "match object", which is a wrapper around the matched substring.
Earlier you saw that I can get the matched substring by calling the group() method (we will see in the next section that the match object is very useful when dealing with grouping), but the match object also contains more information about the matched substring.
For example, the match object can tell us where the matched content begins and ends in the original string:
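As a sketch, start() and end() return the indexes of the match in the original string, which can be used directly as slice boundaries:

```python
import re

text = 'dog cat dog'
match = re.search(r'cat', text)

print(match.start())  # 4
print(match.end())    # 7

# The two positions can be used to slice the original string.
print(text[match.start():match.end()])  # cat
```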
Knowing this information is sometimes very useful.
6. Group by numbers using match.group
As I mentioned before, the match object is very handy for handling grouping.
Grouping is the ability to address specific sub-patterns of a whole regular expression. We can define a group as part of the regular expression and then retrieve the text matched by that part separately.
Let’s see how it works:
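The original sample string is not available here, so the sketch below assumes a hypothetical address book entry of the form "last name, first name: phone number". The name contact_info and its value are illustrative only.

```python
# Hypothetical address book entry used for illustration.
contact_info = 'Doe, John: 555-1212'
print(contact_info)  # Doe, John: 555-1212
```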
The string I just created looks like a fragment taken from someone's address book. We can match this line with a regular expression like this:
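One possible pattern for such a line, shown as a sketch. The sample entry is hypothetical, and the pattern is just one reasonable choice, not necessarily the one originally used.

```python
import re

# Hypothetical address book entry used for illustration.
contact_info = 'Doe, John: 555-1212'

# \w+ matches a run of word characters; \S+ matches a run of non-whitespace.
match = re.search(r'\w+, \w+: \S+', contact_info)
print(match.group(0))  # Doe, John: 555-1212
```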
By surrounding parts of the regular expression with parentheses (the characters '(' and ')'), we can group content into specific sections and then process these subgroups individually.
These groups can be obtained by using the group() method of the match object. They are addressed by the numerical order in which they appear from left to right in the regular expression (starting from 1):
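A sketch of numbered groups, again assuming a hypothetical address book entry and an illustrative pattern with three parenthesized groups:

```python
import re

# Hypothetical address book entry used for illustration.
contact_info = 'Doe, John: 555-1212'

# Three groups: last name, first name, phone number.
match = re.search(r'(\w+), (\w+): (\S+)', contact_info)

print(match.group(1))  # Doe
print(match.group(2))  # John
print(match.group(3))  # 555-1212

# Group 0 is the whole match.
print(match.group(0))  # Doe, John: 555-1212
```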
The ordinal numbers of groups start from 1 because group 0 is reserved for the entire match (we saw it when we studied the match() and search() methods earlier).
7. Use match.group to group by name
Sometimes, especially when a regular expression has many groups, addressing groups by their order of appearance becomes impractical. Python also allows you to give a group a name, using the following syntax:
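In the 're' module a group is named with the (?P<name>...) syntax. The sketch below assumes a pattern for a hypothetical "last name, first name: phone number" entry. The group names last, first, and phone are illustrative choices.

```python
import re

# Each group gets a name via (?P<name>...).
pattern = re.compile(r'(?P<last>\w+), (?P<first>\w+): (?P<phone>\S+)')

# groupindex maps each group name to its group number.
print(dict(pattern.groupindex))  # {'last': 1, 'first': 2, 'phone': 3}
```

Named groups keep their numbers, so they can still be addressed by position as well.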
We can still use the group() method to get the contents of a group, but this time we pass the group name we specified instead of the group number used previously.
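A sketch of looking up groups by name. The sample entry and the group names are hypothetical.

```python
import re

# Hypothetical address book entry used for illustration.
contact_info = 'Doe, John: 555-1212'

match = re.search(r'(?P<last>\w+), (?P<first>\w+): (?P<phone>\S+)',
                  contact_info)

print(match.group('last'))   # Doe
print(match.group('first'))  # John
print(match.group('phone'))  # 555-1212
```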
This greatly enhances the clarity and readability of the code. You can imagine that as regular expressions become more complex, it becomes more and more difficult to understand what a group captures. Naming your groups will clearly tell you and your readers your intentions.
Although the findall() method does not return match objects, it can also use grouping. In that case, the findall() method returns a list of tuples, where the Nth element in each tuple corresponds to the Nth group in the regular expression.
However, group names have no effect on findall(): it still returns plain tuples ordered by group position.
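To illustrate both points, the sketch below runs findall() over two hypothetical address book entries, once with numbered groups and once with named groups. The sample text and group names are made up for this example.

```python
import re

# Two hypothetical address book entries, one per line.
text = 'Doe, John: 555-1212\nRoe, Jane: 555-3434'

# With groups, findall() returns one tuple per match.
numbered = re.findall(r'(\w+), (\w+): (\S+)', text)
print(numbered)
# [('Doe', 'John', '555-1212'), ('Roe', 'Jane', '555-3434')]

# Naming the groups does not change the result: still plain tuples.
named = re.findall(r'(?P<last>\w+), (?P<first>\w+): (?P<phone>\S+)', text)
print(named == numbered)  # True
```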
In this article we introduced some basics of using regular expressions in Python. We learned about raw strings (and how they can help you avoid some of the headaches of using regular expressions). We also learned how to use the match(), search(), and findall() methods to perform basic queries, and how to use grouping to handle subcomponents of match objects.
As always, if you want to see more on this topic, the official Python documentation for the re module is a great resource.
In future articles, we will discuss the application of regular expressions in Python in more depth. We'll take a more comprehensive look at match objects, learn how to use them to perform substitutions within strings, and even use them to parse Python data structures from text files.