Understanding Python multi-line matching patterns

Guanhui
2020-07-24 17:22:33

Question

You are trying to use a regular expression to match a large block of text, and you need the match to span multiple lines.

Solution

This problem typically occurs when you use a dot (.) to match any character but forget that a dot does not match newlines. For example, suppose you want to match C-style comments:

>>> import re
>>> comment = re.compile(r'/\*(.*?)\*/')
>>> text1 = '/* this is a comment */'
>>> text2 = '''/* this is a
... multiline comment */
... '''
>>>
>>> comment.findall(text1)
[' this is a comment ']
>>> comment.findall(text2)
[]

To fix this problem, you can modify the pattern string to add support for newlines. For example:

>>> comment = re.compile(r'/\*((?:.|\n)*?)\*/')
>>> comment.findall(text2)
[' this is a\nmultiline comment ']

In this pattern, (?:.|\n) specifies a non-capturing group (that is, it defines a group that is used only for matching and is not captured or numbered separately).
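To make the effect of the non-capturing group concrete, here is a minimal sketch that compares the pattern above with a variant whose inner group is capturing. The variable names are illustrative and not part of the original recipe.

```python
import re

text = '/* this is a\nmultiline comment */\n'

# Non-capturing inner group: only the outer group is numbered, so
# findall() returns one string per match.
non_capturing = re.compile(r'/\*((?:.|\n)*?)\*/')

# Capturing inner group: the inner group becomes group 2, so
# findall() returns a tuple of both groups per match.
capturing = re.compile(r'/\*((.|\n)*?)\*/')

print(non_capturing.groups)         # number of capturing groups
print(capturing.groups)
print(non_capturing.findall(text))  # list of strings
print(capturing.findall(text))      # list of 2-tuples
```

With the capturing variant, the extra group adds an unwanted second element to every result, which is why the recipe uses (?:...) here.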

Discussion

The re.compile() function accepts a flag called re.DOTALL, which is very useful here. It makes . in a regular expression match any character, including newlines. For example:

>>> comment = re.compile(r'/\*(.*?)\*/', re.DOTALL)
>>> comment.findall(text2)
[' this is a\nmultiline comment ']
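The same flag can be supplied in a few equivalent ways. The sketch below is an illustration rather than part of the original recipe: it shows the re.S alias, the inline (?s) flag, and the flags argument of a module-level function, all of which should produce the same matches as the re.DOTALL example.

```python
import re

text2 = '/* this is a\nmultiline comment */\n'

# re.S is the short alias for re.DOTALL.
with_alias = re.compile(r'/\*(.*?)\*/', re.S)

# The inline flag (?s) at the start of the pattern turns on DOTALL
# without passing a separate argument.
with_inline = re.compile(r'(?s)/\*(.*?)\*/')

# Module-level functions accept the same flag.
direct = re.findall(r'/\*(.*?)\*/', text2, flags=re.DOTALL)

print(with_alias.findall(text2))
print(with_inline.findall(text2))
print(direct)
```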

For simple cases, the re.DOTALL flag works well. However, if the pattern is very complex, or if several separate patterns are combined to tokenize a string (described in detail in Section 2.18), using this flag can cause problems. If you have a choice, it is better to define your regular expression pattern so that it works correctly without any extra flags.
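One way such a problem can arise is sketched below. The sub-pattern names and the sample text are hypothetical and chosen only for illustration. A line-comment pattern that relies on . stopping at a newline is combined with a block-comment pattern, and re.DOTALL is applied to the whole combination.

```python
import re

source = 'x = 1 // set x\n/* block\ncomment */\ny = 2\n'

# Hypothetical sub-patterns for this illustration only.
LINE_COMMENT = r'//.*'                # relies on . stopping at a newline
BLOCK_DOTALL = r'/\*.*?\*/'           # needs re.DOTALL to span lines
BLOCK_PORTABLE = r'/\*(?:.|\n)*?\*/'  # spans lines with no flag

# With re.DOTALL applied to the combined pattern, the line comment
# also crosses newlines and swallows the rest of the text.
greedy = re.compile(LINE_COMMENT + '|' + BLOCK_DOTALL, re.DOTALL)
flagged = greedy.findall(source)

# Without the flag, each sub-pattern keeps its intended meaning.
portable = re.compile(LINE_COMMENT + '|' + BLOCK_PORTABLE)
unflagged = portable.findall(source)

print(flagged)
print(unflagged)
```

Because the flag changes the meaning of every . in the combined pattern, the block-comment pattern that handles newlines itself is the safer building block here.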


Statement:
This article is reproduced from jb51.net. If there is any infringement, please contact admin@php.cn to request removal.