Home >Backend Development >Python Tutorial >How to Find All Overlapping Matches Using Python Regex?
Finding Overlapping Matches Using Regex in Python
In search operations, it's often necessary to identify and retrieve multiple occurrences of a specific pattern within a larger text. When the matches overlap, standard regex matching techniques may miss some instances. This question explores how to find all overlapping matches using Python's regular expressions.
The goal is to extract every 10-digit set of numbers within a given number sequence. For instance, in the string "123456789123456789," we aim to obtain:
[1234567891,2345678912,3456789123,4567891234,5678912345,6789123456,7891234567,8912345678,9123456789]
Capturing Groups and Lookaheads
To achieve this, we employ a capturing group within a lookahead. The lookahead identifies the text of interest (here, the 10-digit numbers), but the actual match is the zero-width substring before the lookahead. This results in non-overlapping matches.
Implementation
Using the finditer method, we can obtain the matches as follows:
import re s = "123456789123456789" matches = re.finditer(r'(?=(\d{10}))', s) results = [int(match.group(1)) for match in matches]
The output results returns the desired list of overlapping matches:
[1234567891, 2345678912, 3456789123, 4567891234, 5678912345, 6789123456, 7891234567, 8912345678, 9123456789]
This approach efficiently extracts all overlapping occurrences of the specified pattern, providing a valuable technique for comprehensive text processing.
The above is the detailed content of How to Find All Overlapping Matches Using Python Regex?. For more information, please follow other related articles on the PHP Chinese website!