
Parsing regular expressions and pattern matching in PHP

WBOY
2016-07-21 15:05:32

PHP provides two ways to process text with regular expressions. One is PCRE: the PCRE library is a set of functions that implement regular expression pattern matching with syntax and semantics that differ only slightly from Perl 5 (see below for details), and the current implementation corresponds to Perl 5.005. The other is POSIX.

The functions in the PCRE library use a pattern syntax very similar to Perl's. An expression must be enclosed in delimiters, such as a forward slash (/). The delimiter can be any non-alphanumeric, non-whitespace ASCII character other than the backslash (\) and the null byte. If the delimiter character is used inside the expression, it must be escaped with a backslash. Since PHP 4.0.4, the Perl-style pairs (), {}, [] and <> can also be used as delimiters. For a more detailed explanation, see Pattern Syntax.
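The delimiter rules above can be illustrated with a short sketch. The subject string and variable names are made up for this example; only preg_match() itself comes from PHP.

```php
<?php
// The same pattern written with three different delimiters.
$subject = 'path/to/file.txt'; // hypothetical sample input

// Forward slash as delimiter: the literal slash in the pattern must be escaped.
$withSlash = preg_match('/to\/file/', $subject);

// Hash as delimiter: the slash needs no escaping.
$withHash = preg_match('#to/file#', $subject);

// Bracket-style delimiters: the opening and closing characters differ.
$withBraces = preg_match('{to/file}', $subject);

var_dump($withSlash, $withHash, $withBraces); // each is int(1)
```

Choosing a delimiter that does not occur in the pattern usually keeps the expression easier to read than escaping every occurrence.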

The closing delimiter can be followed by pattern modifiers that change how matching behaves. See Pattern Modifiers.
PCRE pattern modifiers
i (PCRE_CASELESS)
If this modifier is set, the characters in the pattern will match both uppercase and lowercase letters.
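As a minimal sketch of the i modifier (the subject string is an illustrative assumption), the modifier letter is placed directly after the closing delimiter:

```php
<?php
$subject = 'Hello World'; // hypothetical sample input

// Without i, the lowercase pattern does not match the capitalised word.
$caseSensitive = preg_match('/hello/', $subject);

// With i, letters in the pattern match both cases.
$caseInsensitive = preg_match('/hello/i', $subject);

var_dump($caseSensitive);   // int(0)
var_dump($caseInsensitive); // int(1)
```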
s (PCRE_DOTALL)
If this modifier is set, the dot metacharacter (.) in the pattern matches all characters, including newlines. Without it, newlines are excluded. This is equivalent to Perl's /s modifier. A negated character class such as [^a] always matches a newline character, regardless of whether this modifier is set.
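The following sketch shows the dot with and without the s modifier, and a negated class matching a newline either way. The two-line subject is an assumption made for the example.

```php
<?php
$subject = "first\nsecond"; // hypothetical input containing one newline

// Without s, the dot refuses to match the newline between the words.
$withoutS = preg_match('/first.second/', $subject);

// With s, the dot matches any character, including the newline.
$withS = preg_match('/first.second/s', $subject);

// A negated class matches the newline even without s.
$negatedClass = preg_match('/first[^a]second/', $subject);

var_dump($withoutS, $withS, $negatedClass); // int(0), int(1), int(1)
```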
m (PCRE_MULTILINE)
By default, PCRE treats the subject string as a single "line" of characters (even if it contains newlines). The "start of line" metacharacter (^) matches only at the start of the string, and the "end of line" metacharacter ($) matches only at the end of the string, or before a final newline (unless the D modifier is set). This is the same as Perl. When this modifier is set, the "start of line" and "end of line" constructs also match immediately after and immediately before any newline in the subject string, in addition to the very start and end. This is equivalent to Perl's /m modifier. If there are no "\n" characters in the subject string, or no ^ or $ in the pattern, setting this modifier has no effect.
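A small sketch of the difference, assuming a three-line subject invented for the example:

```php
<?php
$subject = "alpha\nbeta\ngamma"; // hypothetical three-line input

// Without m, ^ and $ refer to the whole string, so no single word
// spans from the start to the end and nothing matches.
$singleLineCount = preg_match_all('/^\w+$/', $subject, $singleLine);

// With m, ^ and $ also match after and before each newline.
$multiLineCount = preg_match_all('/^\w+$/m', $subject, $multiLine);

var_dump($singleLineCount); // int(0)
var_dump($multiLine[0]);    // ["alpha", "beta", "gamma"]
```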
x (PCRE_EXTENDED)
If this modifier is set, whitespace data characters in the pattern are ignored unless they are escaped or inside a character class, and characters between an unescaped # outside a character class and the next newline are also ignored. This is equivalent to Perl's /x modifier and makes it possible to include comments in complicated patterns. Note that this applies only to data characters. Whitespace still may not appear within special character sequences in the pattern, for example within the sequence (?( which introduces a conditional subgroup; writing (? ( instead causes a compilation error.
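As a sketch of how the x modifier allows a commented, multi-line pattern (the date-like input and the variable names are assumptions for the example):

```php
<?php
// Whitespace and # comments inside the pattern are ignored under x.
$pattern = '/
    (\d{4})   # four-digit year
    -         # literal separator
    (\d{2})   # two-digit month
/x';

preg_match($pattern, '2016-07', $parts);
var_dump($parts); // ["2016-07", "2016", "07"]

// An unescaped space is dropped, so the pattern is effectively "ab".
$ignoredSpace = preg_match('/a b/x', 'ab');

// An escaped space is kept as a literal space.
$escapedSpace = preg_match('/a\ b/x', 'a b');

var_dump($ignoredSpace, $escapedSpace); // int(1), int(1)
```

Note that the delimiter character still ends the pattern, so it should not appear inside a comment.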
e (PREG_REPLACE_EVAL)
If this modifier is set, preg_replace() performs the normal substitution of backreferences in the replacement string, evaluates the result as PHP code (in the manner of eval()), and uses the value of that evaluation as the string that is actually substituted. Single quotes, double quotes, backslashes (\) and NULL characters are escaped with a backslash in the substituted backreferences.
Only preg_replace() uses this modifier; other PCRE functions ignore it. The modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0.
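On PHP versions where the e modifier is not available, a computed replacement of this kind can be written with preg_replace_callback() instead. This is a substitute technique rather than the e modifier itself, and the input string below is invented for the sketch.

```php
<?php
// Capitalise the first letter of every word by computing each
// replacement in a callback instead of evaluating a code string.
$result = preg_replace_callback(
    '/\b[a-z]/',
    function (array $match): string {
        // $match[0] holds the text matched by the whole pattern.
        return strtoupper($match[0]);
    },
    'hello regex world' // hypothetical sample input
);

var_dump($result); // string(17) "Hello Regex World"
```

The callback receives the match array directly, so no escaping of quotes or backslashes in the matched text is involved.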
A (PCRE_ANCHORED)
If this modifier is set, the pattern is forced to be "anchored", that is, it is constrained to match only at the start of the subject string. This effect can also be achieved with appropriate constructs in the pattern itself, which is the only way to do it in Perl.
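A brief sketch of anchoring, using a made-up subject string:

```php
<?php
$subject = 'abc123'; // hypothetical sample input

// Unanchored: the digits are found in the middle of the string.
$unanchored = preg_match('/\d+/', $subject);

// Anchored with A: the match must begin at offset 0, where there is a letter.
$anchoredDigits = preg_match('/\d+/A', $subject);

// Anchored with A: letters do start at offset 0, so this matches.
$anchoredLetters = preg_match('/[a-z]+/A', $subject);

// The same constraint expressed inside the pattern with ^.
$caretDigits = preg_match('/^\d+/', $subject);

var_dump($unanchored, $anchoredDigits, $anchoredLetters, $caretDigits);
// int(1), int(0), int(1), int(0)
```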
D (PCRE_DOLLAR_ENDONLY)
If this modifier is set, a dollar metacharacter in the pattern matches only at the very end of the subject string. Without this modifier, a dollar also matches immediately before the final character if it is a newline (but not before any other newlines). This modifier is ignored if the m modifier is set. There is no equivalent to this modifier in Perl.
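The effect is easiest to see on a string that ends with a newline. The sketch below assumes such an input, for example a line read from a file without trimming.

```php
<?php
$withNewline = "abc\n"; // hypothetical input with a trailing newline

// Default: $ also matches just before the final newline.
$default = preg_match('/^abc$/', $withNewline);

// With D: $ matches only at the very end, so the trailing newline blocks it.
$dollarEndOnly = preg_match('/^abc$/D', $withNewline);

// With D and no trailing newline, the match succeeds again.
$exact = preg_match('/^abc$/D', 'abc');

var_dump($default, $dollarEndOnly, $exact); // int(1), int(0), int(1)
```

This makes D a reasonable choice when validating input that must not carry a stray trailing newline.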
S
When a pattern is going to be used several times, it is worth spending more time analyzing it in order to speed up matching. If this modifier is set, this extra analysis is performed. At present, studying a pattern is useful only for non-anchored patterns that do not have a single fixed starting character.
U (PCRE_UNGREEDY)
This modifier inverts the "greediness" of the quantifiers: they are not greedy by default, and become greedy when followed by ?. It is not compatible with Perl. It can also be set with the (?U) modifier setting inside the pattern, and a single quantifier can be made non-greedy by a question mark after it (e.g. .*?). In non-greedy mode, it is usually not possible to match more than pcre.backtrack_limit characters.
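The sketch below compares the four combinations of the U modifier and the ? suffix on an invented HTML-like subject:

```php
<?php
$subject = '<b>one</b><b>two</b>'; // hypothetical sample input

// Default: .* is greedy and runs to the last closing tag.
preg_match('/<b>.*<\/b>/', $subject, $greedy);

// With U: .* becomes non-greedy and stops at the first closing tag.
preg_match('/<b>.*<\/b>/U', $subject, $ungreedy);

// With U, a trailing ? switches the quantifier back to greedy.
preg_match('/<b>.*?<\/b>/U', $subject, $greedyAgain);

// Without U, a trailing ? makes just this quantifier non-greedy.
preg_match('/<b>.*?<\/b>/', $subject, $lazy);

var_dump($greedy[0]);      // "<b>one</b><b>two</b>"
var_dump($ungreedy[0]);    // "<b>one</b>"
var_dump($greedyAgain[0]); // "<b>one</b><b>two</b>"
var_dump($lazy[0]);        // "<b>one</b>"
```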
X (PCRE_EXTRA)
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Any backslash in the pattern that is followed by a letter with no special meaning causes an error, which reserves these combinations for future extension. By default, as in Perl, a backslash followed by a letter with no special meaning is treated as a literal. There are at present no other features controlled by this modifier.
J (PCRE_INFO_JCHANGED)
The (?J) internal option setting changes the local PCRE_DUPNAMES option, which allows subgroups to share the same name. (Annotation: before PHP 7.2.0 this can only be set through the internal option, and an external /J modifier produces an error.)
u (PCRE_UTF8)
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.
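A minimal sketch of the u modifier, assuming a subject that holds one two-byte UTF-8 character. The bytes are written as escapes so the example does not depend on the encoding of the source file.

```php
<?php
$subject = "\xC3\xA9"; // the letter e with an acute accent, encoded as two UTF-8 bytes

// Without u, the dot matches a single byte, so one dot cannot cover both bytes.
$byteMode = preg_match('/^.$/', $subject);

// With u, the dot matches one whole UTF-8 character.
$utf8Mode = preg_match('/^.$/u', $subject);

// With u, a subject that is not valid UTF-8 makes the call fail with false.
$invalid = preg_match('/./u', "\xFF");

var_dump($byteMode, $utf8Mode, $invalid); // int(0), int(1), bool(false)
```

Because a failed call returns false rather than 0, comparing the result with === helps distinguish "no match" from an encoding error.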
