
Python regular expression【1】

黄舟
2017-02-07 16:28:53

This article talks about Python’s regular expressions.

Without further ado, let's start with the simplest building blocks:

'.': matches any single character except a line break.

'*': matches the preceding subexpression zero or more times.

So the combination of the two, '.*' (dot star), matches any run of characters up to a line break.

'+': matches the preceding subexpression one or more times.

'?': matches the preceding subexpression zero or one time.

'\d': matches a digit character. Equivalent to '[0-9]' for ASCII text.

'\w': matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]' for ASCII text.

'\s': matches any whitespace character, including spaces, tabs, form feeds, etc. Equivalent to '[ \f\n\r\t\v]' for ASCII text.

'^': matches the beginning of the input string.

'$': matches the end of the input string.
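As a quick illustration of the quantifiers and anchors listed above, here is a minimal sketch. The sample string and variable names are made up for this example and are not from the original article. It also shows that, for ordinary (str) patterns in Python 3, '\d' accepts non-ASCII digits unless the re.ASCII flag is given.

```python
import re

text = 'vlan 612'  # hypothetical sample string

# '+' needs at least one digit; '*' is satisfied by zero digits,
# so at position 0 it matches the empty string.
plus_match = re.search(r'\d+', text).group()   # '612'
star_match = re.search(r'\d*', text).group()   # ''

# '?' makes the preceding item optional.
optional = [bool(re.fullmatch(r'vlans?', s)) for s in ('vlan', 'vlans', 'vlanss')]
# [True, True, False]

# '^' and '$' anchor the match to the start and end of the string.
anchored = re.search(r'^vlan \d+$', text) is not None   # True
no_leading_digit = re.search(r'^\d+', text) is None     # True

# '\d' also matches Unicode digits unless re.ASCII is used.
unicode_digit = re.search(r'\d', '\u0663') is not None             # True
ascii_only_miss = re.search(r'\d', '\u0663', re.ASCII) is None     # True
```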

These are the most commonly used ones. There are many more; check the manual when you need them.

A description alone is not very intuitive, so let's experiment directly. Using regular expressions in Python is simple: just import the re module:

>>> import re

First, try matching everything:

>>> vlan = 'switchport access vlan 612'
>>> ljds = re.search('.*',vlan).group()
>>> ljds
'switchport access vlan 612'

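Next, try matching a digit. The pattern is written as a raw string (r'...') so that Python passes the backslash through to the regex engine unchanged:

>>> ljds = re.search(r'\d',vlan).group()
>>> ljds
'6'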

Because '\d' matches a single digit, to match the three digits '612' you can add the quantifier '{3}':

>>> ljds = re.search(r'\d{3}',vlan).group()
>>> ljds
'612'

Similarly, if you want to match 13 characters (including spaces):

>>> ljds = re.search(r'[\w\s]{13}',vlan).group()
>>> ljds
'switchport ac'

One more thing worth mentioning: regular-expression quantifiers can be greedy or non-greedy. A greedy quantifier matches as much as possible; a non-greedy one matches as little as possible. Greedy is the default. For example:

The pattern above matches exactly 13 characters. To match between 2 and 10 characters, write '[\w\s]{2,10}'. Does it then match 2 characters or 10? Because the default mode is greedy, it matches the maximum:

>>> ljds = re.search(r'[\w\s]{2,10}',vlan).group()
>>> ljds
'switchport'

Add a question mark '?' after the quantifier to switch to non-greedy mode, that is, the minimum match:

>>> ljds = re.search(r'[\w\s]{2,10}?',vlan).group()
>>> ljds
'sw'
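The same greedy versus non-greedy difference applies to '*' and '+', not only to '{m,n}'. The patterns below are illustrative choices, not from the original article; they are a minimal sketch using the same vlan string.

```python
import re

vlan = 'switchport access vlan 612'

# Greedy: '.*' runs to the end of the string, then backs off to the LAST 'a'.
greedy = re.search(r's.*a', vlan).group()   # 'switchport access vla'

# Non-greedy: '.*?' stops as soon as the FIRST 'a' can be matched.
lazy = re.search(r's.*?a', vlan).group()    # 'switchport a'
```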

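Next, let's introduce capturing and lookaround:

(exp): matches exp and captures it as a group.

(?=exp): matches the position before exp (lookahead); exp itself is not part of the match.

(?<=exp): matches the position after exp (lookbehind); exp itself is not part of the match.

>>> vlan = 'switchport access vlan 612'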

The most basic:

>>> ljds = re.search('(access)',vlan).group()
>>> ljds
'access'
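Calling .group() with no argument returns the whole match, so the parentheses above make no visible difference. Captured groups are read back by index. The following is a minimal sketch; the pattern with two groups is an illustrative assumption, not from the original article.

```python
import re

vlan = 'switchport access vlan 612'

# Two capture groups: the port mode and the VLAN number.
m = re.search(r'(access) vlan (\d+)', vlan)

whole = m.group()      # 'access vlan 612' (entire match, same as group(0))
mode = m.group(1)      # 'access'
vlan_id = m.group(2)   # '612'
both = m.groups()      # ('access', '612')
```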

Match everything before 'access':

>>> ljds = re.search('.*(?=access)',vlan).group()
>>> ljds
'switchport '

Match everything after 'vlan':

>>> ljds = re.search('(?<=vlan).*',vlan).group()
>>> ljds
' 612'
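The result above keeps the leading space because the lookbehind ends right after 'vlan'. One possible variation, sketched here as an assumption about what is usually wanted, moves the space into the lookbehind and matches only digits. Note that Python's re module requires the lookbehind pattern to have a fixed width, which 'vlan ' does.

```python
import re

vlan = 'switchport access vlan 612'

# The lookbehind now includes the trailing space, so the match starts at the digits.
vlan_id = re.search(r'(?<=vlan )\d+', vlan).group()   # '612'
```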

OK, with this in hand, look again at the regular expression used earlier to capture the router name:

DeviceName = re.search('.*(?=#show run)',telreply).group()
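To see that expression run on its own, here is a minimal sketch. The telreply content and the device_name helper are hypothetical, invented for illustration; the real value would be whatever the device sends back. Because re.search returns None when nothing matches, the helper checks the result before calling .group().

```python
import re

# Hypothetical reply text; real device output will differ.
telreply = 'Router1#show run\nBuilding configuration...\n'

def device_name(reply):
    """Return the text before '#show run' on its line, or None if absent."""
    m = re.search(r'.*(?=#show run)', reply)
    return m.group() if m else None

DeviceName = device_name(telreply)        # 'Router1'
missing = device_name('no prompt here')   # None
```

Since '.' does not match a line break, the match stays on the line that contains the prompt, even when other lines precede it.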
