PHP: PCRE regular expression once-only subpatterns
With both maximizing (greedy) and minimizing (lazy) repetition, a failure of what follows normally causes the repeated item to be re-evaluated, to see whether a different number of repetitions lets the rest of the pattern match. Sometimes it is useful to prevent this, either to change the nature of the match or to make it fail earlier than it otherwise would, when the pattern author knows there is no point in carrying on.
Consider, for example, the pattern \d+foo applied to the subject line 123456bar:
After matching all 6 digits and then failing to match "foo", the matcher normally tries again with \d+ matching only 5 digits, then only 4, and so on, before ultimately failing. Once-only subpatterns (also known as atomic groups) specify that once a part of the pattern has matched, it is not re-evaluated in this way, so the matcher gives up immediately the first time it fails to match "foo". The notation is another kind of special parenthesis, starting with (?>, as in (?>\d+)bar.
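The difference is easiest to see with a pattern that only matches if the repeat gives a character back. The following is a minimal sketch using PHP's preg_match(); the patterns ending in a literal 6 are illustrative additions, not taken from the text above.

```php
<?php
$subject = '123456bar';

// Ordinary greedy repeat: \d+ first takes all six digits, then gives the
// last one back so that the literal "6" can match.
$withBacktracking = preg_match('/\d+6/', $subject);      // 1

// Atomic group: once (?>\d+) has matched, it gives nothing back, so the
// literal "6" can never match.
$withAtomicGroup = preg_match('/(?>\d+)6/', $subject);   // 0

// Both forms fail against "foo"; the atomic form simply skips the
// retries with 5, 4, 3, ... digits.
$plainFoo  = preg_match('/\d+foo/', $subject);           // 0
$atomicFoo = preg_match('/(?>\d+)foo/', $subject);       // 0

// When the text after the digits does match, the atomic group behaves
// like the ordinary repeat.
$atomicBar = preg_match('/(?>\d+)bar/', $subject, $m);   // 1
$matched   = $m[0];                                      // '123456bar'
```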
This kind of parenthesis "locks up" the part of the pattern it contains once it has matched: a failure further along in the pattern cannot backtrack into it. Backtracking past it to earlier items, however, works as usual. Put another way, a subpattern of this type matches the string that an identical standalone pattern would match if it were anchored at the current point in the subject string.
A once-only subpattern is not a capturing subpattern. Simple cases such as the one above can be thought of as a maximizing repeat that must swallow everything it can. So, while both \d+ and \d+? are prepared to adjust the number of digits they match so that the rest of the pattern can match, (?>\d+) can only match an entire sequence of digits.
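A short sketch of the three behaviours side by side, again with preg_match(). The subject strings and the trailing (\d{2}) group are assumptions chosen for illustration only.

```php
<?php
$digits = '12345';

// Greedy: \d+ takes everything, then backs off until (\d{2}) can match.
preg_match('/^(\d+)(\d{2})/', $digits, $greedy);
// $greedy[1] === '123', $greedy[2] === '45'

// Lazy: \d+? takes as little as possible, so (\d{2}) matches right away.
preg_match('/^(\d+?)(\d{2})/', $digits, $lazy);
// $lazy[1] === '1', $lazy[2] === '23'

// Atomic: (?>\d+) keeps all five digits, leaving nothing for (\d{2}).
$atomicResult = preg_match('/^(?>\d+)(\d{2})/', $digits);
// $atomicResult === 0

// (?>...) does not capture: only the whole match is returned.
preg_match('/(?>\d+)bar/', '123456bar', $noCapture);
// count($noCapture) === 1

// To capture the digits as well, wrap the atomic group in ordinary
// capturing parentheses.
preg_match('/((?>\d+))bar/', '123456bar', $captured);
// $captured[1] === '123456'
```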
This construct can contain subpatterns of arbitrary complexity, and it can be nested.
Once-only subpatterns can be combined with lookbehind assertions to match efficiently at the end of the subject string. Consider a simple pattern such as abcd$ applied to a long string that does not match. Because matching proceeds from left to right, PCRE looks for each "a" in the subject and then checks whether what follows matches the rest of the pattern. If the pattern is written as ^.*abcd$, the initial .* first matches the entire string, but when this fails (because no "a" follows), it backtracks to match all but the last character, then all but the last two characters, and so on. The search for "a" again covers the entire string, this time from right to left, so we are no better off. However, if the pattern is written as ^(?>.*)(?<=abcd), the .* item cannot backtrack; it can only match the entire string. The lookbehind assertion that follows then performs a single test on the last four characters. If that test fails, the match fails immediately. For long strings, this approach makes a significant difference to processing time.
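The end-of-string test can be wrapped in a small helper. This is a sketch only: the function name endsWithAbcd is hypothetical, and it assumes a single-line subject, because . does not match a newline unless the s modifier is set.

```php
<?php
// Hypothetical helper: reports whether $subject ends in "abcd".
// (?>.*) consumes the whole line and never backtracks; the lookbehind
// then performs one check on the four characters just consumed.
function endsWithAbcd(string $subject): bool
{
    return preg_match('/^(?>.*)(?<=abcd)/', $subject) === 1;
}

$longMatch    = str_repeat('x', 10000) . 'abcd';
$longMismatch = str_repeat('x', 10000) . 'abce';

$first  = endsWithAbcd($longMatch);      // true
$second = endsWithAbcd($longMismatch);   // false
$third  = endsWithAbcd('abc');           // false: fewer than four characters
```

No timing figures are given here, since the actual speed-up depends on the PCRE version, on JIT settings, and on the length of the subject.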
When a pattern contains an unlimited repeat inside a subpattern that can itself be repeated an unlimited number of times, using a once-only subpattern is the only way to keep some failing matches from taking a very long time. The pattern (\D+|