search
HomeBackend DevelopmentPython TutorialPython regular expressions: greedy/non-greedy properties

Python regular expressions: greedy/non-greedy properties

Mar 03, 2017 am 11:46 AM
python regular expression

之前已经简单介绍了Python正则表达式的基础与捕获,那么在这一篇文章里,我将总结一下正则表达式的贪婪/非贪婪特性。 

贪婪

默认情况下,正则表达式将进行贪婪匹配。所谓“贪婪”,其实就是在多种长度的匹配字符串中,选择较长的那一个。例如,如下正则表达式本意是选出人物所说的话,但是却由于“贪婪”特性,出现了匹配不当:

>>> sentence = """You said "why?" and I say "I don't know"."""
>>> re.findall(r'"(.*)"', sentence)
['why?" and I say "I don\'t know']

再比如,如下的几个例子都说明了正则表达式“贪婪”的特性:

>>> re.findall('hi*', 'hiiiii')
['hiiiii']
>>> re.findall('hi{2,}', 'hiiiii')
['hiiiii']
>>> re.findall('hi{1,3}', 'hiiiii')
['hiii']

非贪婪

当我们期望正则表达式“非贪婪”地进行匹配时,需要通过语法明确说明: 

      {2,5}?    捕获2-5次,但是优先次数少的匹配

在这里,问号?可能会有些让人犯晕,因为之前他已经有了自己的含义:前面的匹配出现0次或1次。其实,只要记住,当问号出现在表现不定次数的正则表达式部分之后时,就表示非贪婪匹配。 

还是上面的那几个例子,用非贪婪匹配,则结果如下:

>>> re.findall('hi*?', 'hiiiii')
['h']
>>> re.findall('hi{2,}?', 'hiiiii')
['hii']
>>> re.findall('hi{1,3}?', 'hiiiii')
['hi']

另外一个例子中,使用非贪婪匹配,结果如下:

>>> sentence = """You said "why?" and I say "I don't know"."""
>>> re.findall(r'"(.*?)"', sentence)
['why?', "I don't know"]

捕获与非贪婪

严格来说,这一部分并不是非贪婪特性。但是由于其行为与非贪婪类似,所以为了方便记忆,就将其放在一起了。 

      (?=abc) 捕获,但不消耗字符,且匹配abc

      (?!abc) 捕获,不消耗,且不匹配abc

在正则表达式匹配的过程中,其实存在“消耗字符”的过程,也就是说,一旦一个字符在匹配过程中被检索(消耗)过,后面的匹配就不会再检索这一字符了。 

知道这个特性有什么用呢?还是用例子说明。比如,我们想找出字符串中出现过1次以上的单词:

>>> sentence = "Oh what a day, what a lovely day!"
>>> re.findall(r'\b(\w+)\b.*\b\1\b', sentence)
['what']

这样的正则表达式显然无法完成任务。为什么呢?原因就是,在第一个(\w+)匹配到what,并且其后的\1也匹配到第二个what的时候,“Oh what a day, what”这一段子串都已经被正则表达式消耗了,所以之后的匹配,将直接从第二个what之后开始。自然地,这里只能找出一个出现了两次的单词。 

那么解决方案,就和上面提到的(?=abc)语法相关了。这样的语法可以在分组匹配的同时,不消耗字符串!所以,正确的书写方式应该是:

>>> re.findall(r'\b(\w+)\b(?=.*\b\1\b)', sentence)
['what', 'a', 'day']

如果我们需要匹配一个至少包含两个不同字母的单词,则可以使用(?!abc)的语法:

>>> re.search(r'([a-z]).*(?!\1)[a-z]', 'aa', re.IGNORECASE)
>>> re.search(r'([a-z]).*(?!\1)[a-z]', 'ab', re.IGNORECASE)
<_sre.SRE_Match object; span=(0, 2), match=&#39;ab&#39;>

更多Python正则表达式:贪婪/非贪婪特性相关文章请关注PHP中文网!

Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
What data types can be stored in a Python array?What data types can be stored in a Python array?Apr 27, 2025 am 12:11 AM

Pythonlistscanstoreanydatatype,arraymodulearraysstoreonetype,andNumPyarraysarefornumericalcomputations.1)Listsareversatilebutlessmemory-efficient.2)Arraymodulearraysarememory-efficientforhomogeneousdata.3)NumPyarraysareoptimizedforperformanceinscient

What happens if you try to store a value of the wrong data type in a Python array?What happens if you try to store a value of the wrong data type in a Python array?Apr 27, 2025 am 12:10 AM

WhenyouattempttostoreavalueofthewrongdatatypeinaPythonarray,you'llencounteraTypeError.Thisisduetothearraymodule'sstricttypeenforcement,whichrequiresallelementstobeofthesametypeasspecifiedbythetypecode.Forperformancereasons,arraysaremoreefficientthanl

Which is part of the Python standard library: lists or arrays?Which is part of the Python standard library: lists or arrays?Apr 27, 2025 am 12:03 AM

Pythonlistsarepartofthestandardlibrary,whilearraysarenot.Listsarebuilt-in,versatile,andusedforstoringcollections,whereasarraysareprovidedbythearraymoduleandlesscommonlyusedduetolimitedfunctionality.

What should you check if the script executes with the wrong Python version?What should you check if the script executes with the wrong Python version?Apr 27, 2025 am 12:01 AM

ThescriptisrunningwiththewrongPythonversionduetoincorrectdefaultinterpretersettings.Tofixthis:1)CheckthedefaultPythonversionusingpython--versionorpython3--version.2)Usevirtualenvironmentsbycreatingonewithpython3.9-mvenvmyenv,activatingit,andverifying

What are some common operations that can be performed on Python arrays?What are some common operations that can be performed on Python arrays?Apr 26, 2025 am 12:22 AM

Pythonarrayssupportvariousoperations:1)Slicingextractssubsets,2)Appending/Extendingaddselements,3)Insertingplaceselementsatspecificpositions,4)Removingdeleteselements,5)Sorting/Reversingchangesorder,and6)Listcomprehensionscreatenewlistsbasedonexistin

In what types of applications are NumPy arrays commonly used?In what types of applications are NumPy arrays commonly used?Apr 26, 2025 am 12:13 AM

NumPyarraysareessentialforapplicationsrequiringefficientnumericalcomputationsanddatamanipulation.Theyarecrucialindatascience,machinelearning,physics,engineering,andfinanceduetotheirabilitytohandlelarge-scaledataefficiently.Forexample,infinancialanaly

When would you choose to use an array over a list in Python?When would you choose to use an array over a list in Python?Apr 26, 2025 am 12:12 AM

Useanarray.arrayoveralistinPythonwhendealingwithhomogeneousdata,performance-criticalcode,orinterfacingwithCcode.1)HomogeneousData:Arrayssavememorywithtypedelements.2)Performance-CriticalCode:Arraysofferbetterperformancefornumericaloperations.3)Interf

Are all list operations supported by arrays, and vice versa? Why or why not?Are all list operations supported by arrays, and vice versa? Why or why not?Apr 26, 2025 am 12:05 AM

No,notalllistoperationsaresupportedbyarrays,andviceversa.1)Arraysdonotsupportdynamicoperationslikeappendorinsertwithoutresizing,whichimpactsperformance.2)Listsdonotguaranteeconstanttimecomplexityfordirectaccesslikearraysdo.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Safe Exam Browser

Safe Exam Browser

Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

Dreamweaver Mac version

Dreamweaver Mac version

Visual web development tools

WebStorm Mac version

WebStorm Mac version

Useful JavaScript development tools

EditPlus Chinese cracked version

EditPlus Chinese cracked version

Small size, syntax highlighting, does not support code prompt function

mPDF

mPDF

mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),