
How do you find the exact position of each match in a Python regular expression?

王林 · Reposted: 2023-08-31 12:13:34 · 739 views

Introduction

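The re module provides regular expression support in Python. Regular expressions are used for searching text and for more complex text manipulation. Tools such as grep and sed, text editors such as vi and emacs, and programming languages such as Tcl, Perl, and Python all have built-in regular expression support.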

Python's re module provides functions for regular expression matching.

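A regular expression that defines the text you want to find or modify is called a pattern. It is a string made up of text literals and metacharacters. The compile function is used to create the pattern object. Because regular expressions often contain special characters, it is best to write them as raw strings (marked with the r prefix). In a raw string, Python does not interpret backslash escapes, so those characters reach the regex engine unchanged.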
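To make the raw-string advice concrete, here is a minimal sketch. The pattern and sample text are illustrative choices, not taken from the article; the point is only to compare how the same characters behave with and without the r prefix.

```python
import re

# In a normal string literal, "\b" is the backspace character (one character).
plain = "\bcat\b"
# In a raw string literal the backslash is kept, so the regex engine
# receives \b, which it reads as a word boundary.
raw = r"\bcat\b"

text = "cat concat cat"

plain_matches = re.findall(plain, text)
raw_matches = re.findall(raw, text)

print(len(plain), len(raw))  # 5 7
print(plain_matches)         # [] (the text contains no backspace characters)
print(raw_matches)           # ['cat', 'cat']
```

The "cat" inside "concat" is skipped by the raw pattern because there is no word boundary before it.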

Once the pattern has been compiled, you can apply it to a text string with one of several functions. The available functions include match, search, findall, and finditer.

Syntax Used

The regular expression functions used here to find matches are as follows.

re.match(): Determines if the RE matches at the beginning of the string. If zero or more characters at the beginning of the string match the regular expression pattern, the match method returns a match object.

p.finditer(): Finds all substrings where the RE matches and returns an iterator that yields a match object for each non-overlapping match of the pattern in the string.

re.compile(): Compiles a regular expression pattern into a regular expression object, which can be used for matching through its match(), search(), and other methods. The expression's behavior can be modified by specifying a flags value. The value can be any of the re module's flag constants, combined using bitwise OR (the | operator).

m.start(): Returns the offset in the string at which the match starts.

m.group(): Returns the text that the pattern matched (or the text of one capture group when given a group number). The related groups() method returns a tuple of all captured groups, so you can use multiple assignment to store each value in its own variable, as in areaCode, mainNumber = mo.groups().

re.search(): Comparable to re.match(), but it is not limited to matches at the beginning of the text. search() can locate the pattern anywhere in the string, but it returns only the first occurrence.
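The differences between these functions are easiest to see side by side. The following is a small sketch with a made-up input string and pattern (both are assumptions for illustration); it also shows two flags combined with the | operator.

```python
import re

text = "id 42, id 7"
pattern = re.compile(r"\d+")

# match() only succeeds if the pattern matches at the very start of the string.
at_start = pattern.match(text)

# search() scans forward and returns the first match anywhere in the string.
first = pattern.search(text)

# finditer() yields one Match object per non-overlapping match.
starts = [m.start() for m in pattern.finditer(text)]

print(at_start)                      # None, because the text starts with "id"
print(first.group(), first.start())  # 42 3
print(starts)                        # [3, 10]

# Flags are combined with bitwise OR when compiling.
flagged = re.compile(r"^id", re.IGNORECASE | re.MULTILINE)
flag_hits = [m.start() for m in flagged.finditer("ID 1\nid 2")]
print(flag_hits)                     # [0, 5]
```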

Algorithm

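Import the regular expression module with import re.
Create a regex object with the re.compile() function. (Using a raw string is recommended.)
Pass the string you want to search to the regex object's finditer() method. This returns an iterator of Match objects.
Call each Match object's group() method to get the text that actually matched.
You can also use the span() method to get the start and end indexes as a tuple.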
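The steps above can be sketched as follows. The pattern and the sample sentence are placeholders chosen for illustration; note that the end index returned by span() is exclusive, so text[start:end] gives the matched text.

```python
import re

# Step 2: compile the pattern (written as a raw string) into a regex object.
word = re.compile(r"[a-z]+")

# Step 3: finditer() returns an iterator of Match objects.
positions = []
for m in word.finditer("find each word"):
    # Steps 4-5: group() is the matched text, span() is the (start, end) tuple.
    positions.append((m.group(), m.span()))

for matched_text, (start, end) in positions:
    print(matched_text, start, end)
# find 0 4
# each 5 9
# word 10 14
```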

Example

# Import the re module
import re
# Compile the pattern [A-Z0-9] and store the compiled object in p
p = re.compile(r"[A-Z0-9]")
# Iterate over every Match object yielded by p.finditer()
for m in p.finditer('A5B6C7D8'):
    # Print the start offset and the matched text
    print(m.start(), m.group())

Output

This produces the following output:

0 A
1 5
2 B
3 6
4 C
5 7
6 D
7 8

Code Explanation

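Import the regular expression module with import re. Use the re.compile() function to create a regex object from the pattern "[A-Z0-9]" and assign it to the variable p. Pass the string you want to search to the regex object's finditer() method, which returns an iterator of Match objects, and loop over it with m. For each Match object, m.start() returns the offset at which the match begins and m.group() returns the text that actually matched.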

Example

# Python program to illustrate
# matching regex objects
# with groups
import re
phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
mo = phoneNumRegex.search('My number is 415-555-4242.')
print(mo.groups())

Output

This produces the following output:

('415', '555-4242')

Code Explanation

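Import the regular expression module with import re. Use the re.compile() function to create a regex object from the pattern r'(\d\d\d)-(\d\d\d-\d\d\d\d)' and assign it to the variable phoneNumRegex. Pass the string to search to the regex object's search() method, which returns a Match object, and store it in the variable mo. Call the Match object's mo.groups() method to get a tuple containing the text captured by each group.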
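Since the topic is match positions, it may help to see that start(), end(), and span() also accept a group number. The sketch below reuses the phone-number pattern; the snake_case variable names are illustrative and not part of the original example.

```python
import re

phone_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
text = 'My number is 415-555-4242.'
mo = phone_regex.search(text)

# groups() returns a tuple, so it can be unpacked into separate variables.
area_code, main_number = mo.groups()

# span() takes an optional group number; 0 (the default) is the whole match.
whole_span = mo.span()
area_span = mo.span(1)
main_span = mo.span(2)

print(area_code, area_span)                # 415 (13, 16)
print(main_number, main_span)              # 555-4242 (17, 25)
print(text[whole_span[0]:whole_span[1]])   # 415-555-4242
```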

Conclusion

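The search(), match(), and finditer() methods provided by Python's re module let you match a regular expression pattern, and they give you Match object instances when the match succeeds. To get detailed position information about the matched text, use the Match object's start(), end(), and span() methods.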

When there are many matches, loading them all at once with findall() risks using a large amount of memory. The finditer() method instead gives you an iterator over all the matches, which is more memory-efficient.

This means that finditer() returns an iterator that produces each Match object only when it is requested, instead of loading every result into memory up front.
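As a rough illustration of this difference, the sketch below runs findall() and finditer() on the same input used in the first example. The variable names are illustrative. Note also that findall() returns plain strings, so only finditer() gives access to positions.

```python
import re

p = re.compile(r"[A-Z0-9]")
text = "A5B6C7D8"

# findall() builds the complete list of matched strings up front.
all_matches = p.findall(text)

# finditer() returns an iterator; each Match is produced only when requested.
it = p.finditer(text)
first = next(it)
second = next(it)

print(all_matches)                     # ['A', '5', 'B', '6', 'C', '7', 'D', '8']
print(first.start(), first.group())    # 0 A
print(second.start(), second.group())  # 1 5

# The remaining matches can still be consumed, one at a time.
remaining = [(m.start(), m.group()) for m in it]
print(remaining)
```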


Statement:

This article is reproduced from tutorialspoint.com. In case of infringement, please contact admin@php.cn.
