How do we find the exact position of each match in Python's regular expression?

王林
2023-08-31 12:13:34

Introduction

The re module provides regular expression support in Python, including functions for matching patterns against text. Regular expressions are used for text searches and more complex text manipulation. Tools like grep and sed, text editors like vi and emacs, and programming languages like Tcl, Perl, and Python all support regular expressions.

A regular expression that defines the text we want to find or modify is called a pattern. A pattern is a string made up of literal text and metacharacters. The re.compile() function turns a pattern string into a pattern object. It is recommended to write patterns as raw strings (prefixed with the letter r), because regular expressions often contain backslashes. In a raw string, Python leaves the backslashes untouched, so they reach the regular expression engine exactly as written.
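To make the raw-string advice concrete, here is a minimal sketch. The sample pattern and text are made up for illustration; the point is only that "\b" means a backspace character in a normal string literal, while r"\b" reaches the re module as a word boundary.

```python
import re

# In a normal string literal, "\b" is a single backspace character.
plain = "\bcat\b"
# In a raw string literal, r"\b" stays as a backslash followed by "b",
# which the regular expression engine reads as a word boundary.
raw = r"\bcat\b"

print(len(plain), len(raw))                  # 5 7
print(re.search(plain, "a cat sat"))         # None (looks for literal backspaces)
print(re.search(raw, "a cat sat").span())    # (2, 5)
```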

Once compiled, a pattern can be applied to a text string using one of the pattern object's methods. The available methods include match(), search(), findall(), and finditer().

Syntax used

The following functions and methods of the re module are used here to find matches and their positions.

re.match(): Determines whether the RE matches at the beginning of the string. If zero or more characters at the beginning of the string match the pattern, it returns a match object; otherwise it returns None.

p.finditer(): Finds all non-overlapping substrings where the RE matches and returns an iterator that yields one match object per match, in the order they occur in the string.

re.compile(): Compiles a regular expression pattern into a regular expression object, which can be used for matching through its match(), search(), finditer(), and other methods. The expression's behavior can be modified by passing flags, which can be combined using bitwise OR (the | operator).

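m.start(): Returns the offset in the string at which the match starts. The related m.end() returns the offset just past the end of the match, and m.span() returns both as a (start, end) tuple.

m.group(): Returns the text that was matched. The related m.groups() method returns a tuple containing the text of each capture group, so the values can be unpacked with multiple assignment, as in areaCode, mainNumber = mo.groups().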

re.search(): Comparable to re.match(), but not restricted to the beginning of the text. The search() function can locate the pattern at any position in the string, but it returns only the first occurrence.
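The differences between these calls are easier to see side by side. The following is a small sketch on a made-up sentence; the variable names strict and relaxed are hypothetical, and re.IGNORECASE is used only as one example of a flag.

```python
import re

text = "Order 66 shipped, order 67 pending"

strict = re.compile(r"order \d+")                   # case-sensitive
relaxed = re.compile(r"order \d+", re.IGNORECASE)   # a flag changes the behavior

# match() only looks at the beginning of the string.
print(strict.match(text))             # None ("Order" is capitalized)
# search() scans the whole string but stops at the first match.
print(strict.search(text).start())    # 18
print(relaxed.match(text).span())     # (0, 8)
# finditer() yields every non-overlapping match.
found = [(m.start(), m.group()) for m in relaxed.finditer(text)]
print(found)                          # [(0, 'Order 66'), (18, 'order 67')]
```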

Algorithm

  • Use import re to import the regular expression module.

  • Use the re.compile() function to create a regular expression object. (Remember to use a raw string.)

  • Pass the string to be searched to the finditer() method of the regex object. This returns an iterator that yields a Match object for each match.

  • Call the start() method of each Match object to get the position where the match begins, and the group() method to get the matched text.

  • We can also use the span() method to get the starting and ending indexes in a tuple.
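The steps above can be sketched as a small helper. The function name find_positions and the sample input are hypothetical, chosen only for illustration; the re calls themselves are the ones listed in the steps.

```python
import re

def find_positions(pattern, text):
    """Return (start, end, matched_text) for every non-overlapping match."""
    regex = re.compile(pattern)
    results = []
    for m in regex.finditer(text):
        start, end = m.span()   # end is the index just past the match
        results.append((start, end, m.group()))
    return results

for start, end, matched in find_positions(r"\d+", "a12b345c6"):
    print(start, end, matched)
# 1 3 12
# 4 7 345
# 8 9 6
```

Because the end index is exclusive, text[start:end] always equals the matched text.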

Example

# import the re module
import re
# compile the character class [A-Z0-9] and store the pattern object in p
p = re.compile(r"[A-Z0-9]")
# iterate over each match object m yielded by p.finditer
for m in p.finditer('A5B6C7D8'):
    # print the start offset and the matched text
    print(m.start(), m.group())

Output

This will produce the output −

0 A
1 5
2 B
3 6
4 C
5 7
6 D
7 8

Code explanation

Import the regular expression module with import re. Use the re.compile() function to create a regular expression object from the pattern [A-Z0-9] and assign it to the variable p. Pass the string to be searched to the finditer() method of the regular expression object; it returns an iterator, and the for loop binds each Match object it yields to m. For every match, m.start() returns the index at which the match begins and m.group() returns the text that matched.

Example

# Python program to illustrate
# Matching regex objects
# with groups
import re
phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex.search('My number is 415-555-4242.')
print(mo.groups())

Output

This will produce the output −

('415', '555-4242')

Code explanation

Use import re to import the regular expression module. Use the re.compile() function to create a regular expression object from the pattern r'(\d\d\d)-(\d\d\d-\d\d\d\d)' and assign it to the variable phoneNumRegex. Pass the string to be searched to the search() method of the regex object; it returns a Match object for the first match, which is stored in the variable mo. Call mo.groups() to get a tuple containing the text captured by each parenthesized group.
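Since the topic is positions, it may help to see where each group sits in the string. The sketch below reuses the phone-number pattern from the example; the snake_case variable names are just a stylistic choice here. It relies on the fact that span() accepts a group number.

```python
import re

phone_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
mo = phone_regex.search('My number is 415-555-4242.')

# Multiple assignment unpacks the tuple returned by groups().
area_code, main_number = mo.groups()

print(mo.span(), mo.group())      # (13, 25) 415-555-4242
print(mo.span(1), area_code)      # (13, 16) 415
print(mo.span(2), main_number)    # (17, 25) 555-4242
```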

Conclusion

The search(), match(), and finditer() functions provided by Python's re module let us match regular expression patterns; a successful match yields a Match object. Use the Match object's start(), end(), and span() methods to obtain the exact position of the matched text.

When there are many matches, using findall() to load them all at once may use a lot of memory, and findall() returns only the matched text, not the positions. The finditer() method instead returns an iterator over the matches, which is more memory-efficient.

This means that finditer() produces Match objects one at a time as the iterator is consumed, rather than building the full list of results in memory up front.
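A short sketch of the difference, reusing the pattern and input from the first example. The variable names are illustrative. Here findall() returns a list of strings with no position information, while finditer() returns an iterator that can be advanced one match at a time with next().

```python
import re

p = re.compile(r"[A-Z0-9]")
text = "A5B6C7D8"

all_at_once = p.findall(text)   # a full list of matched strings
lazy = p.finditer(text)         # an iterator; no match consumed yet

first = next(lazy)              # pull just the first Match object
remaining = sum(1 for _ in lazy)

print(all_at_once)                    # ['A', '5', 'B', '6', 'C', '7', 'D', '8']
print(first.start(), first.group())   # 0 A
print(remaining)                      # 7
```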

Statement:
This article is reproduced from tutorialspoint.com.