
PHP: PCRE regex anchors and the period

伊谢尔伦
2016-11-21 17:24:07

Anchor

Outside a character class, in the default matching mode, ^ is an assertion that the current matching point is at the beginning of the target string. Inside a character class, a leading ^ negates the class, so the class matches any character not listed in it.

^ does not have to be the first character of the pattern, but if it appears in one of several alternatives (branches separated by |), it should be the first item in that branch. If every alternative begins with ^, that is, if the pattern can only match at the beginning of the target string, the pattern is said to be "anchored". (There are also other constructs that can make a pattern anchored.)
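As a rough illustration of the two roles of ^, here is a minimal sketch using preg_match(). The target strings and variable names are made up for the example.

```php
<?php
// Outside a character class, ^ asserts the beginning of the target string.
$atStart    = preg_match('/^abc/', 'abcdef');   // matches
$notAtStart = preg_match('/^abc/', 'xabcdef');  // does not match

// Inside a character class, a leading ^ negates the class:
// [^0-9] matches the first character that is not a digit.
preg_match('/[^0-9]/', '123a', $m);
$firstNonDigit = $m[0];

// Every alternative begins with ^, so the pattern is anchored:
// "bar" is present, but not at the beginning of the target string.
$anchored = preg_match('/^foo|^bar/', 'xbar');
```

With these inputs, $atStart is 1, $notAtStart and $anchored are 0, and $firstNonDigit is "a".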

$ is an assertion that the current matching point is at the end of the target string, or, when the target string ends with a newline character, immediately before that newline (the default). $ does not have to be the last character of the pattern, but if it appears in one of several alternatives, it should be the last item in that branch. $ has no special meaning in a character class.

The meaning of $ can be changed so that it matches only at the very end of the string by setting PCRE_DOLLAR_ENDONLY (the D modifier in PHP) at compile or matching time. This does not affect the behavior of the \Z assertion.

The meaning of the ^ and $ characters changes when the PCRE_MULTILINE option (the m modifier in PHP) is set. In that case, they match immediately after and immediately before each internal newline character, respectively, as well as at the beginning and end of the target string. For example, the pattern /^abc$/ matches the target string "def\nabc" in multiline mode, but not otherwise. Consequently, a pattern that is anchored in single-line mode because all of its alternatives begin with ^ is not anchored in multiline mode. The PCRE_DOLLAR_ENDONLY option is ignored if PCRE_MULTILINE is set.
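The interaction between $, PCRE_DOLLAR_ENDONLY, and PCRE_MULTILINE is easier to see side by side. The sketch below assumes PHP's D and m pattern modifiers, which correspond to those two options. The target strings are illustrative.

```php
<?php
// Default: $ matches at the very end, or just before a final newline.
$default = preg_match('/abc$/', "abc\n");

// D (PCRE_DOLLAR_ENDONLY): $ matches only at the very end of the string.
$endOnly = preg_match('/abc$/D', "abc\n");

// Without m, ^ cannot match after the internal newline.
$singleLine = preg_match('/^abc$/', "def\nabc");

// m (PCRE_MULTILINE): ^ and $ also match around internal newlines.
$multiLine = preg_match('/^abc$/m', "def\nabc");

// D is ignored once m is set, so $ still matches before the newline.
$dIgnored = preg_match('/abc$/mD', "abc\ndef");
```

Here $default, $multiLine, and $dIgnored are 1, while $endOnly and $singleLine are 0.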

Note: The escape sequences \A, \Z, and \z can be used to match the beginning and end of the target string in any mode. If all alternatives of a pattern begin with \A, the pattern is always anchored, regardless of whether PCRE_MULTILINE is set.
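A short sketch of how these escape sequences behave when multiline mode is on. The target string is an arbitrary two-line example that ends with a newline.

```php
<?php
$subject = "def\nabc\n";

// With m, ^ matches at the start of the second line...
$caret = preg_match('/^abc/m', $subject);

// ...but \A still matches only at the very beginning of the target string.
$bigA = preg_match('/\Aabc/m', $subject);

// \Z matches at the end or just before a final newline;
// \z matches only at the very end.
$bigZ   = preg_match('/abc\Z/m', $subject);
$smallZ = preg_match('/abc\z/m', $subject);
```

$caret and $bigZ are 1. $bigA and $smallZ are 0, because "abc" is neither at the very beginning of the string nor followed directly by its end.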

Period

Outside a character class, a period in the pattern matches any single character in the target string, including non-printing characters, but (by default) not a newline. If PCRE_DOTALL (the s modifier in PHP) is set, the period also matches newlines. The handling of the period is entirely independent of the handling of ^ and $; the only relationship is that they all involve newline characters. A period has no special meaning in a character class.
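The following sketch shows the period with and without PCRE_DOTALL, assuming PHP's s modifier for that option. The target strings are illustrative.

```php
<?php
// By default the period does not match a newline.
$default = preg_match('/a.b/', "a\nb");

// With s (PCRE_DOTALL) it does.
$dotAll = preg_match('/a.b/s', "a\nb");

// Non-printing characters other than newline, such as a tab, always match.
$tab = preg_match('/a.b/', "a\tb");

// Inside a character class the period is a literal ".".
$classAny = preg_match('/a[.]b/', 'axb');
$classDot = preg_match('/a[.]b/', 'a.b');
```

$dotAll, $tab, and $classDot are 1, while $default and $classAny are 0.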

\C can be used to match a single byte. This is meaningful in UTF-8 mode, where a period matches a whole character, which may consist of several bytes.
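To make the byte versus character distinction concrete, the sketch below compares the period in UTF-8 mode (the u modifier) with a single-byte match. As a cautious choice, \C is used here without u, where the target string is treated as plain bytes. Combining \C with u may not be supported in every PCRE build, so that combination is left out.

```php
<?php
// "\xC3\xA9" is the UTF-8 encoding of "é": one character, two bytes.
$subject = "\xC3\xA9";

// In UTF-8 mode, the period consumes the whole multi-byte character.
preg_match('/^./u', $subject, $wholeChar);
$charLength = strlen($wholeChar[0]);

// Without u, \C consumes exactly one byte.
preg_match('/^\C/', $subject, $oneByte);
$byteLength = strlen($oneByte[0]);
```

$charLength is 2 and $byteLength is 1.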

