Introduction to Python regular expressions and re library (code examples)

不言

Feb 11, 2019 am 10:33 AM

This article introduces Python regular expressions and the re library, with code examples.

A regular expression is a sequence of characters that defines a search pattern. Typically this pattern is used by string search algorithms for "find" or "find and replace" operations on strings, or for input validation.

1. Regular expression syntax

. Matches any single character (by default, any character except a newline)
[] Character set: gives the range of values for a single character
[^] Negated character set: gives the excluded range for a single character
* Repeats the previous character zero or more times
+ Repeats the previous character one or more times
? Repeats the previous character zero or one time
| Matches either the left or the right expression
{m} Repeats the previous character exactly m times
{m,n} Repeats the previous character m to n times (inclusive)
^ Matches the beginning of the string
$ Matches the end of the string
() Grouping mark: treats the enclosed pattern as a single unit
\d A digit; equivalent to [0-9] for ASCII text
\w A word character; equivalent to [A-Za-z0-9_] for ASCII text
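
As a rough illustration of the operators listed above, the following sketch applies each one to a small sample string with re.findall or re.search. The sample strings and variable names are made up for this example.

```python
import re

# One short pattern per operator; the input strings are illustrative only.
any_char = re.findall(r'b.t', 'bat bit bt')        # '.' needs exactly one character
char_set = re.findall(r'[abc]', 'a1b2c3')          # any one of a, b, c
negated = re.findall(r'[^abc]', 'a1b2c3')          # anything except a, b, c
star = re.findall(r'ab*', 'a ab abbb')             # 'b' zero or more times
plus = re.findall(r'ab+', 'a ab abbb')             # 'b' one or more times
optional = re.findall(r'ab?', 'a ab abbb')         # 'b' zero or one time
either = re.findall(r'cat|dog', 'cat dog bird')    # left or right alternative
exact = re.findall(r'\d{3}', '12 345')             # exactly three digits
ranged = re.findall(r'\d{2,3}', '1 22 333')        # two to three digits
anchored = re.findall(r'^\d+$', '2019')            # whole string must be digits
grouped = re.search(r'(ab){2}', 'xababx').group(0) # the group 'ab' repeated twice
words = re.findall(r'\w+', 'a_b c-d')              # runs of word characters

print(star, plus, optional)
```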

2. Using the re library in Python

The re library is part of the Python standard library and is mainly used for string matching. To use it: import re

2.1. Type of regular expression string

Regular expressions for the re library are usually written as raw strings, in the form
r'text'
A raw string is a string literal in which the backslash is not treated as an escape character. In an ordinary string literal, sequences such as \n are escaped, but in a raw string they are not. Because backslashes appear frequently in regular expressions, raw strings avoid having to double every backslash.
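
A minimal sketch of the difference, assuming nothing beyond the standard library: the same pattern written as an ordinary literal needs a doubled backslash, while the raw literal does not. The variable names are illustrative.

```python
import re

plain_pattern = '\\d+'   # ordinary literal: the backslash must be doubled
raw_pattern = r'\d+'     # raw literal: the backslash is kept as written

# Both literals produce the same two-character-prefixed pattern text.
same_pattern = plain_pattern == raw_pattern

# '\n' is one newline character; r'\n' is a backslash followed by 'n'.
newline_length = len('\n')
raw_length = len(r'\n')

digits = re.findall(raw_pattern, 'BIT 100081')
print(same_pattern, newline_length, raw_length, digits)
```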

2.2. Main functions of the re library

re.search() Searches a string for the first location that matches the regular expression and returns a match object
re.match() Matches the regular expression from the beginning of a string and returns a match object
re.findall() Searches a string and returns all matching substrings as a list
re.split() Splits a string at each match of the regular expression and returns a list
re.finditer() Searches a string and returns an iterator over the matches; each element is a match object
re.sub() Replaces all substrings that match the regular expression and returns the resulting string

2.2.1. re.search(pattern, string, flags=0)

Searches a string for the first location that matches the regular expression and returns a match object

pattern: The regular expression, as a string or raw string
string: The string to be searched
flags: Control flags that modify how the regular expression is applied
re.I re.IGNORECASE Ignores case, so [A-Z] also matches lowercase characters
re.M re.MULTILINE The ^ operator also matches at the start of each line of the string
re.S re.DOTALL The . operator matches every character, including newlines; by default it matches every character except a newline

Example:

import re
match = re.search(r'[1-9]\d{5}', 'BIT 100081')
if match:
    print(match.group(0))

The result is 100081
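
The effect of the three flags can be sketched as follows. The sample text and variable names are assumptions made for this illustration, not part of the original article.

```python
import re

text = 'First line\nsecond LINE'

# re.I: 'line' also matches the uppercase 'LINE'.
ignore_case = re.findall(r'line', text, flags=re.I)

# Without re.M, ^ only matches at the start of the whole string.
first_only = re.findall(r'^\w+', text)
# With re.M, ^ also matches at the start of the second line.
each_line = re.findall(r'^\w+', text, flags=re.M)

# Without re.S, '.' stops at the newline, so nothing matches.
no_dotall = re.search(r'First.*LINE', text)
# With re.S, '.' crosses the newline and the whole text matches.
with_dotall = re.search(r'First.*LINE', text, flags=re.S)

print(ignore_case, first_only, each_line, no_dotall, with_dotall.group(0))
```

Flags can be combined with the | operator, for example flags=re.I | re.M.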

2.2.2. re.match(pattern, string, flags=0)

Match the regular expression from the beginning of a string and return the match object

The parameters are the same as the search function
Example:

import re
match = re.match(r'[1-9]\d{5}', 'BIT 100081')
print(match.group(0))

This raises an error (AttributeError) because match is None: re.match only matches at the beginning of the string, and nothing matches there.
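
Because re.match returns None when nothing matches at the start of the string, it is usually safer to check the result before calling group(). A minimal sketch, where leading_zipcode is a hypothetical helper name introduced only for this example:

```python
import re

def leading_zipcode(text):
    """Return the six-digit code at the start of text, or None if absent."""
    match = re.match(r'[1-9]\d{5}', text)
    # Guard against None before touching the match object.
    return match.group(0) if match else None

no_match = leading_zipcode('BIT 100081')   # the code is not at the start
at_start = leading_zipcode('100081 BIT')   # the code is at the start
print(no_match, at_start)
```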

2.2.3. re.findall(pattern, string, flags=0)

Search for string and return all matching substrings in list type

The parameters are the same as search
Example:

import re
ls=re.findall(r'[1-9]\d{5}', 'BIT100081 TSU100084')
print(ls)

The result is ['100081', '100084']

2.2.4. re.split(pattern, string, maxsplit=0, flags=0)

Split a string according to the regular expression matching result and return the list type

maxsplit: The maximum number of splits; the remainder of the string is returned as the last element

Example:

import re
print(re.split(r'[1-9]\d{5}', 'BIT100081 TSU100084'))
The result is ['BIT', ' TSU', '']
print(re.split(r'[1-9]\d{5}', 'BIT100081 TSU100084', maxsplit=1))
The result is ['BIT', ' TSU100084']

2.2.5. re.finditer(pattern, string, flags=0)

Search for a string and return an iteration type of matching results. Each iteration element is a match object

The parameters are the same as search
Example:

import re
# finditer only yields successful matches, so no None check is needed
for m in re.finditer(r'[1-9]\d{5}', 'BIT100081 TSU100084'):
    print(m.group(0))
The result is
100081
100084

2.2.6. re.sub(pattern, repl, string, count=0, flags=0)

Replace all substrings matching the regular expression in a string and return the replaced string

repl: The replacement string that is substituted for each match
count: The maximum number of replacements (0, the default, replaces every match)

Example:

import re
print(re.sub(r'[1-9]\d{5}', ':zipcode', 'BIT100081 TSU100084'))
The result is
BIT:zipcode TSU:zipcode
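
The count parameter can be sketched like this. The second call also shows that repl may be a function that receives each match object, which re.sub supports in addition to a plain string; the masking rule itself is made up for this example.

```python
import re

text = 'BIT100081 TSU100084'
pattern = r'[1-9]\d{5}'

# count=1 replaces only the first match and leaves the rest untouched.
first_only = re.sub(pattern, ':zipcode', text, count=1)

# repl as a function: keep the first two digits and mask the other four.
masked = re.sub(pattern, lambda m: m.group(0)[:2] + '****', text)

print(first_only)
print(masked)
```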

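2.3 Another equivalent usage of the re library (object-oriented)

rst = re.search(r'[1-9]\d{5}', 'BIT 100081')
Functional usage: a one-off operation

pat = re.compile(r'[1-9]\d{5}')
rst = pat.search('BIT 100081')
Object-oriented usage: compile once, then reuse many times

regex = re.compile(pattern, flags=0)

The compiled regex object provides the same six operations described above as methods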
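
As a sketch of the compiled form, the example below compiles the pattern once and calls all six methods on the resulting object. Note that the pattern argument is no longer passed, since the object already holds it. Variable names are illustrative.

```python
import re

pat = re.compile(r'[1-9]\d{5}')   # compile once
text = 'BIT100081 TSU100084'

searched = pat.search(text).group(0)                  # first match anywhere
matched = pat.match(text)                             # None: text starts with 'BIT'
found = pat.findall(text)                             # all matches as a list
pieces = pat.split(text, maxsplit=1)                  # split at the first match only
iterated = [m.group(0) for m in pat.finditer(text)]   # match objects, one by one
replaced = pat.sub(':zipcode', text)                  # replace every match

print(searched, matched, found, pieces, iterated, replaced)
```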

2.4 Match object of the re library

A Match object is the result of one match and carries detailed information about it

The following are attributes of the Match object

.string The text that was searched
.re The Pattern object (compiled regular expression) used for matching
.pos The position in the text at which the search started
.endpos The position in the text at which the search ended

The following are methods of the Match object

.group(0) Returns the matched string
.start() Returns the start position of the match in the original string
.end() Returns the end position of the match in the original string
.span() Returns (.start(), .end())
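
A short sketch that reads these attributes and methods from a single match; the sample text is the one used in the earlier examples.

```python
import re

text = 'BIT100081 TSU100084'
m = re.search(r'[1-9]\d{5}', text)

searched_text = m.string       # the full text that was searched
pattern_text = m.re.pattern    # the pattern string held by the Pattern object
search_start = m.pos           # 0: the search began at the start of the text
search_end = m.endpos          # 19: the search could run to the end of the text
matched = m.group(0)           # the matched substring
span = m.span()                # (start, end) of the match

# Slicing the original text with the span gives the matched string back.
sliced = text[m.start():m.end()]

print(matched, span, search_start, search_end)
```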

2.5 Greedy matching and minimal matching in the re library

When a regular expression can match substrings of different lengths, which one is returned? By default the re library uses greedy matching: each repetition operator matches as much as it can, so the match is as long as possible

Minimal matching operators

*? Repeats the previous character zero or more times, minimal match
+? Repeats the previous character one or more times, minimal match
?? Repeats the previous character zero or one time, minimal match
{m,n}? Repeats the previous character m to n times (inclusive), minimal match

Whenever an operator can produce matches of different lengths, add ? after it to make it a minimal match
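
The difference can be sketched by running the greedy and minimal forms of each operator on the same text. The sample strings are made up for this illustration.

```python
import re

text = 'PYANBNCNDN'

# Greedy: .* runs to the last N in the text.
greedy_star = re.search(r'PY.*N', text).group(0)
# Minimal: .*? stops at the first N.
minimal_star = re.search(r'PY.*?N', text).group(0)

greedy_plus = re.search(r'a+', 'aaa').group(0)
minimal_plus = re.search(r'a+?', 'aaa').group(0)

greedy_optional = re.search(r'ab?', 'ab').group(0)
minimal_optional = re.search(r'ab??', 'ab').group(0)

greedy_range = re.search(r'\d{2,4}', '12345').group(0)
minimal_range = re.search(r'\d{2,4}?', '12345').group(0)

print(greedy_star, minimal_star)
print(greedy_plus, minimal_plus)
print(greedy_optional, minimal_optional)
print(greedy_range, minimal_range)
```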

Statement

This article is reproduced from segmentfault. If there is any infringement, please contact admin@php.cn to have it removed.
