Personal understanding of regular expressions - lazy matching

WBOY, Original, 2016-07-13 10:12:25

Problem description

Link to this article: http://www.hcoding.com/?p=130

When I first learned regular expressions, I had a question. For example: I need to match the characters between the first pair of "_" in the string "_abc_123_". At first I would write "/_\w*_/", but then the text between the matched underscores is "abc_123" instead of "abc" (note that \w also matches "_", so the pattern runs through to the last underscore). An expert told me to add a question mark: with "/_\w*?_/", the result is "abc".
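The difference can be checked with a short sketch. Python's `re` module is used here purely for illustration (the article itself shows no host-language code); the patterns are written with `\w`, without the surrounding `/` delimiters, and with a capture group added (an assumption for this example) to extract the text between the underscores.

```python
import re

text = "_abc_123_"

# Greedy: \w* takes as much as it can. Because \w also matches "_",
# it runs to the end of the string and then gives back one character
# so that the final "_" in the pattern can match.
greedy = re.search(r"_(\w*)_", text)

# Lazy: \w*? takes as little as it can, so the match ends at the
# first "_" that can close it.
lazy = re.search(r"_(\w*?)_", text)

print(greedy.group(1))  # abc_123
print(lazy.group(1))    # abc
```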

We know that '?' used on its own means "repeat zero times or once". When '?' appears after another repetition quantifier, it makes that quantifier lazy, that is, it matches as few characters as possible. The lazy quantifiers are:

  • *?: Repeat any number of times, but repeat as little as possible
  • +?: Repeat 1 or more times, but repeat as little as possible
  • ??: Repeat 0 or 1 times, but repeat as little as possible
  • {n,m}?: Repeat n to m times, but repeat as little as possible
  • {n,}?: Repeat n times or more, but repeat as little as possible
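As a rough illustration of the list above, the following sketch compares each greedy quantifier with its lazy counterpart. The input "aaaa", the single-letter patterns, and the bounds 2 and 3 are arbitrary choices for this example, not taken from the article. `re.match` anchors every attempt at the start of the string.

```python
import re

text = "aaaa"

# (greedy pattern, lazy pattern) pairs, one per quantifier in the list
pairs = [
    (r"a*", r"a*?"),
    (r"a+", r"a+?"),
    (r"a?", r"a??"),
    (r"a{2,3}", r"a{2,3}?"),
    (r"a{2,}", r"a{2,}?"),
]

results = {}
for greedy, lazy in pairs:
    greedy_match = re.match(greedy, text).group(0)
    lazy_match = re.match(lazy, text).group(0)
    results[lazy] = (greedy_match, lazy_match)
    print(f"{greedy:8} -> {greedy_match!r:8} {lazy:8} -> {lazy_match!r}")
```

With nothing after the quantifier to force more repetitions, each lazy form settles for its minimum: zero repetitions for `*?` and `??`, one for `+?`, and n for `{n,m}?` and `{n,}?`.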

Yes, "as few repetitions as possible": this is the blunt, literal explanation of lazy matching.

So how should "as few repetitions as possible" be understood? We can explain it through the way lazy ("ignore-priority") quantifiers behave in the regex engine.

Lazy ("ignore-priority") quantifiers

The quantifiers "*?", "+?", "??", "{n,m}?", "{n,}?" are all lazy ("ignore-priority") quantifiers. They are formed by adding ? after ?, +, *, or {}. When matching, a lazy quantifier first tries to skip the item it applies to; only if the rest of the match then fails does the engine backtrack and try matching that item. For example, if `ab??` is matched against "abb", the result is "a" rather than "ab". Once the engine has matched a, the lazy quantifier makes it first choose not to match b and continue through the expression. Since the expression has already ended, the engine immediately reports a successful match. The following example explains step by step how lazy quantifiers work.
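The `ab??` case can be reproduced with a minimal sketch in Python's `re` module (used only for illustration). The third pattern, `ab??c`, and the input "abc" are additions for this example: they show the lazy quantifier being forced to take the b after backtracking, because skipping it leaves c with nothing to match.

```python
import re

# Lazy: after "a" the engine prefers to skip "b"; the pattern is then
# finished, so the match is reported right away.
lazy_result = re.search(r"ab??", "abb").group(0)

# Greedy counterpart: the optional "b" is taken first.
greedy_result = re.search(r"ab?", "abb").group(0)

# Skipping "b" makes "c" fail against "b", so the engine backtracks
# and lets b?? match one "b" after all.
forced_result = re.search(r"ab??c", "abc").group(0)

print(lazy_result)    # a
print(greedy_result)  # ab
print(forced_result)  # abc
```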

Example

Taking the same example as above, use "/_\w*?_/" to match the characters between the first pair of "_" in "_abc_123_".

After the first '_' has matched, '\w*?', being a lazy quantifier, first decides that it does not need to match any characters. The engine therefore tries the second '_' in '/_\w*?_/' (the '_' after '\w*?') against the 'a' in the target string '_abc_123_', and this fails. Only then does the engine backtrack and let '\w*?' try the branch it skipped: \w is tried against a, and that attempt succeeds.

In the next step, should it match another character or skip? Because '\w*?' is a lazy quantifier, it chooses to skip again, and the previous step repeats: '_' fails to match b, so the engine backtracks and '\w*?' extends to cover 'ab'. After these steps have been repeated 3 times in total (until the '_' after '\w*?' meets the second '_' of the target string), 'abc' is finally matched.
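The steps described above can be traced with a small hand-written simulation. This is only a sketch of the idea for this one pattern, attempted at the start of the string. It is not how any real regex engine is implemented, and the function name `trace_lazy` is hypothetical.

```python
import re

def trace_lazy(text):
    # Hand-simulate a backtracking match of the pattern _\w*?_ anchored
    # at position 0 of text. Returns (captured text or None, step log).
    steps = []
    if not text.startswith("_"):
        return None, steps
    start = 1    # position just after the opening "_"
    end = start  # \w*? currently covers text[start:end], i.e. nothing
    while end < len(text):
        ch = text[end]
        # Lazy choice: skip \w and try the closing "_" first.
        if ch == "_":
            steps.append(f"'_' vs {ch!r}: success")
            return text[start:end], steps
        steps.append(f"'_' vs {ch!r}: fail")
        # Backtrack: let \w*? take one more character, if \w allows it.
        if not re.fullmatch(r"\w", ch):
            return None, steps
        end += 1
        covered = text[start:end]
        steps.append(f"backtrack, \\w*? now covers {covered!r}")
    return None, steps

result, steps = trace_lazy("_abc_123_")
for line in steps:
    print(line)
print("result:", result)  # result: abc
```

The log shows three failed attempts of the closing '_' (against a, b and c), each followed by '\w*?' growing by one character, and then a successful attempt against the second underscore.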

Process (after the first '_' has matched):

  • The second '_' in the expression '/_\w*?_/' is tried against the 'a' in the target string '_abc_123_' and fails; '\w*?' then tries to match 'a', and the match succeeds.
  • The second '_' in the expression '/_\w*?_/' is tried against the 'b' in the target string '_abc_123_' and fails; '\w*?' then tries to match 'ab', and the match succeeds.
  • The second '_' in the expression '/_\w*?_/' is tried against the 'c' in the target string '_abc_123_' and fails; '\w*?' then tries to match 'abc', and the match succeeds.
  • The second '_' in the expression '/_\w*?_/' is tried against the second '_' in the target string '_abc_123_' and succeeds, and the match ends. The result is abc.

The above are my thoughts after reading the section about lazy ("ignore-priority") quantifiers in "Mastering Regular Expressions". If I have made any mistakes, I will gladly accept corrections. Thank you!

Original article; if you repost it, please credit: JC&hcoding.com
