


Introduction to Python regular expressions and re library (code examples)
This article introduces Python regular expressions and the re library, with code examples.
A regular expression is a sequence of characters that defines a search pattern. Typically this pattern is used by string search algorithms for "find" or "find and replace" operations on strings, or for input validation.
1. Regular expression syntax
. Matches any single character
[] Character set; gives the allowed range for a single character, e.g. [abc] or [a-z]
[^] Negated character set; gives the excluded range for a single character, e.g. [^abc]
* Repeats the previous character 0 or more times
+ Repeats the previous character 1 or more times
? Repeats the previous character 0 or 1 times
| Matches either the left or the right expression
{m} Repeats the previous character exactly m times
{m,n} Repeats the previous character m to n times (inclusive)
^ Matches the beginning of the string
$ Matches the end of the string
() Grouping mark; a | inside a group applies only within that group, e.g. (abc|def)
\d Digit, equivalent to [0-9] for ASCII text
\w Word character, equivalent to [A-Za-z0-9_] for ASCII text
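As a rough illustration of the operators listed above, the following sketch runs a few of them through re.findall and re.search. The sample strings are invented for this example and are not part of the original article.

```python
import re

# All sample strings below are made up for illustration.
dot = re.findall(r'b.t', 'bat bit bt')                   # . needs exactly one character
in_set = re.findall(r'[abc]', 'cabbage')                 # characters inside the set
not_in_set = re.findall(r'[^abc]', 'cabbage')            # characters outside the set
either = re.findall(r'cat|dog', 'cat dog bird')          # left or right expression
digits = re.findall(r'\d{3}', '12 345 6789')             # exactly three digits in a row
ranged = re.findall(r'ab{1,2}c', 'ac abc abbc abbbc')    # one or two b characters
anchored = bool(re.search(r'^py.*on$', 'python'))        # ^ and $ pin both ends

print(dot)         # ['bat', 'bit']
print(in_set)      # ['c', 'a', 'b', 'b', 'a']
print(not_in_set)  # ['g', 'e']
print(either)      # ['cat', 'dog']
print(digits)      # ['345', '678']
print(ranged)      # ['abc', 'abbc']
print(anchored)    # True
```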
2. Using the re library in Python
re is a module in Python's standard library, mainly used for string matching. To use it: import re
2.1. Type of regular expression string
Regular expressions for the re library are usually written as raw strings, expressed as
r'text'
A raw string does not process backslash escapes. In an ordinary string, a sequence such as \n is interpreted as an escape, but in a raw string it is kept exactly as written. Because backslashes appear often in regular expressions, raw strings avoid tedious double escaping.
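The difference between an ordinary string and a raw string can be sketched as follows. The variable names and the sample text are chosen only for this illustration.

```python
import re

normal = '\\d+'   # ordinary string: the backslash itself has to be escaped
raw = r'\d+'      # raw string: the backslash is kept as written

same = (normal == raw)                       # both spell the same two-character escape
found = re.findall(raw, 'room 101, floor 3') # sample text is made up
newline_len = (len('\n'), len(r'\n'))        # one newline character vs. backslash plus n

print(same)         # True
print(found)        # ['101', '3']
print(newline_len)  # (1, 2)
```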
2.2. Main functions of the re library
re.search() Searches a string for the first position that matches a regular expression and returns a match object
re.match() Matches a regular expression from the beginning of a string and returns a match object
re.findall() Searches a string and returns all matching substrings as a list
re.split() Splits a string at each regular expression match and returns a list
re.finditer() Searches a string and returns an iterator of matches; each element is a match object
re.sub() Replaces all substrings that match a regular expression and returns the resulting string
2.2.1. re.search(pattern, string, flags=0)
Search a string for the first position that matches the regular expression and return the match object
- pattern: The regular expression, as a string or raw string
- string: The string to be matched
- flags: Control flags that apply while the regular expression is used
  - re.I (re.IGNORECASE): Ignore case, so [A-Z] can also match lowercase characters
  - re.M (re.MULTILINE): Let the ^ operator match at the start of each line of the given string
  - re.S (re.DOTALL): Let the . operator match every character; by default it matches every character except a newline
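The effect of the three flags above can be checked with a short sketch like the one below. The two-line sample text is an assumption made for this example.

```python
import re

text = 'first line\nSecond line'   # made-up two-line sample

# re.I: case is ignored, so 'second' also matches 'Second'
no_flag = re.search(r'second', text)
with_i = re.search(r'second', text, flags=re.I)

# re.M: ^ matches at the start of every line, not only the start of the string
without_m = re.findall(r'^\w+', text)
with_m = re.findall(r'^\w+', text, flags=re.M)

# re.S: . also matches the newline between the two lines
dot_default = re.search(r'line.Second', text)
dot_all = re.search(r'line.Second', text, flags=re.S)

print(no_flag, with_i.group(0))   # None Second
print(without_m, with_m)          # ['first'] ['first', 'Second']
print(dot_default, dot_all is not None)  # None True
```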
Example:
    import re
    match = re.search(r'[1-9]\d{5}', 'BIT 100081')
    if match:
        print(match.group(0))
The result is 100081
2.2.2. re.match(pattern, string, flags=0)
Match the regular expression from the beginning of a string and return the match object
The parameters are the same as the search function
Example:
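    import re
    match = re.match(r'[1-9]\d{5}', 'BIT 100081')
    print(match)  # None: nothing matches at the start of the string
    # match.group(0) would raise AttributeError here, because match is None
The result is None. re.match only matches from the beginning of the string, and the string starts with 'BIT', so there is no match. Calling match.group(0) without checking would raise an error.
2.2.3. re.findall(pattern, string, flags=0)
Search a string and return all matching substrings as a list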
The parameters are the same as search
Example:
    import re
    ls = re.findall(r'[1-9]\d{5}', 'BIT100081 TSU100084')
    print(ls)
The result is ['100081', '100084']
2.2.4. re.split(pattern, string, maxsplit=0, flags=0)
Split a string at each regular expression match and return a list
- maxsplit: The maximum number of splits, the remaining part is output as the last element
    import re
    re.split(r'[1-9]\d{5}', 'BIT100081 TSU100084')
The result is ['BIT', ' TSU', '']
    re.split(r'[1-9]\d{5}', 'BIT100081 TSU100084', maxsplit=1)
The result is ['BIT', ' TSU100084']
2.2.5. re.finditer(pattern, string, flags=0)
Search a string and return an iterator of matches. Each element is a match object
The parameters are the same as search
Example:
    import re
    for m in re.finditer(r'[1-9]\d{5}', 'BIT100081 TSU100084'):
        print(m.group(0))  # every element yielded is a match object
The result is
    100081
    100084
2.2.6. re.sub(pattern, repl, string, count=0, flags=0)
Replace all substrings that match the regular expression in a string and return the resulting string
- repl: The string that replaces each matched substring
- count: The maximum number of replacements; 0 (the default) replaces every match
    import re
    re.sub(r'[1-9]\d{5}', ':zipcode', 'BIT100081 TSU100084')
The result is 'BIT:zipcode TSU:zipcode'
2.3 Another equivalent usage of the re library (object-oriented)
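    rst = re.search(r'[1-9]\d{5}', 'BIT 100081')
Functional usage: a one-off operation
    pat = re.compile(r'[1-9]\d{5}')
    rst = pat.search('BIT 100081')
Object-oriented usage: compile once, then reuse the pattern for multiple operations
regex = re.compile(pattern, flags=0)
The compiled regex object also provides the six functions above as methods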
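A minimal sketch of the compiled form, reusing the postal-code pattern and sample string from the earlier examples, might look like this. The variable names are chosen only for illustration.

```python
import re

pat = re.compile(r'[1-9]\d{5}')   # compile once
text = 'BIT100081 TSU100084'

# The same six operations, called as methods of the compiled pattern
first = pat.search(text).group(0)
at_start = pat.match(text)                      # the text starts with 'BIT', so no match
all_codes = pat.findall(text)
pieces = pat.split(text)
spans = [m.span() for m in pat.finditer(text)]
masked = pat.sub(':zipcode', text)

print(first)      # 100081
print(at_start)   # None
print(all_codes)  # ['100081', '100084']
print(pieces)     # ['BIT', ' TSU', '']
print(spans)      # [(3, 9), (13, 19)]
print(masked)     # BIT:zipcode TSU:zipcode
```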
The following are the attributes of the Match object
.string The text that was searched
.re The Pattern object (regular expression) used for matching
.pos The position in the text where the search started
.endpos The position in the text where the search ended
The following are the methods of the Match object
.group(0) Returns the matched string
.start() Returns the start position of the matched string in the original string
.end() Returns the end position of the matched string in the original string
.span() Returns (.start(), .end())
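The attributes and methods above can be inspected on a single match. This sketch reuses the sample string from the earlier examples; the variable name m is arbitrary.

```python
import re

m = re.search(r'[1-9]\d{5}', 'BIT100081 TSU100084')

# Attributes
print(m.string)       # BIT100081 TSU100084
print(m.re.pattern)   # [1-9]\d{5}
print(m.pos)          # 0
print(m.endpos)       # 19 (the length of the searched text)

# Methods
print(m.group(0))     # 100081
print(m.start())      # 3
print(m.end())        # 9
print(m.span())       # (3, 9)
```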
2.5 Greedy matching and minimal matching in the re library
When a regular expression can match several substrings of different lengths, which one is returned? The re library uses greedy matching by default: each repetition operator matches as many characters as possible, which generally yields the longest matching substring.
Minimal-match operators
*? Repeats the previous character 0 or more times, minimal match
+? Repeats the previous character 1 or more times, minimal match
?? Repeats the previous character 0 or 1 times, minimal match
{m,n}? Repeats the previous character m to n times (inclusive), minimal match
Whenever an operator can produce matches of different lengths, you can add ? after it to get the minimal match.
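The contrast between greedy and minimal matching can be sketched like this. The sample strings are invented for this example.

```python
import re

text = 'PYANBNCNDN'   # made-up sample with several possible end points

greedy = re.search(r'PY.*N', text).group(0)    # .* takes as much as it can
minimal = re.search(r'PY.*?N', text).group(0)  # .*? stops at the first N

greedy_digits = re.findall(r'\d{2,4}', '12345')    # takes four digits, the lone 5 is left over
minimal_digits = re.findall(r'\d{2,4}?', '12345')  # takes two digits at a time

print(greedy)          # PYANBNCNDN
print(minimal)         # PYAN
print(greedy_digits)   # ['1234']
print(minimal_digits)  # ['12', '34']
```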