php—PCRE regular expression character classes (square brackets) and optional paths (|)

伊谢尔伦 (original)
2016-11-21 17:22:47

Character class (square bracket)

A left square bracket opens a character class, and a right square bracket closes it. A right square bracket on its own has no special meaning. If a right square bracket is needed as a member of a character class, it must be the first character in the class (or the second, if the class begins with a negating ^) or be escaped with a backslash.
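A minimal sketch of these bracket rules using preg_match(). The patterns and subject strings are illustrative assumptions, not taken from the original text.

```php
<?php
// A lone "]" outside a class is an ordinary character.
$lone = preg_match('/a]/', 'a]');          // 1

// "]" placed first in the class is a literal member.
$first = preg_match('/[]ab]/', ']');       // 1

// After a negating "^", "]" may come second; here it is excluded.
$negated = preg_match('/[^]ab]/', ']');    // 0

// An escaped "]" is a member in any position.
$escaped = preg_match('/[ab\]]/', ']');    // 1

var_dump($lone, $first, $negated, $escaped);
```

Single-quoted PHP strings are used so that the backslash reaches the regex engine unchanged.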

A character class matches a single character in the subject string. That character must belong to the set defined by the class, unless the class begins with ^, in which case the character must not belong to the set. If ^ itself is needed as a member of the class, make sure it is not the first character, or escape it with a backslash.
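The position of ^ decides whether it negates the class or is just a member. A short sketch with made-up patterns, anchored with ^ and $ outside the class so that exactly one character is tested:

```php
<?php
// "^" first: negated class, matches any single character except a, b, c.
$negated = preg_match('/^[^abc]$/', '^');   // 1 ("^" is not a, b or c)

// "^" not first: an ordinary member of the class.
$literal = preg_match('/^[a^]$/', '^');     // 1
$other   = preg_match('/^[a^]$/', 'b');     // 0

// Escaped "^" in first position: literal, so the class is just "^" and "a".
$escaped = preg_match('/^[\^a]$/', 'b');    // 0

var_dump($negated, $literal, $other, $escaped);
```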

For example, the character class [aeiou] matches any lowercase vowel, while [^aeiou] matches any character that is not a lowercase vowel. Note: ^ is just a convenient notation for specifying the characters that are in the class by enumerating those that are not. It is not an assertion: it still consumes one character from the subject string, and the match fails if the current match point is at the end of the subject string.
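To illustrate that a negated class consumes a character, the sketch below compares it with a negative lookahead. The lookahead is not discussed in this article and is added here only for contrast; the subject strings are assumptions.

```php
<?php
// The negated class needs a character to consume, so it fails at end of subject.
$atEnd = preg_match('/b[^aeiou]/', 'b');            // 0

// With a following non-vowel it matches and includes that character.
$consumes = preg_match('/b[^aeiou]/', 'bx', $m);    // 1, $m[0] is "bx"

// A negative lookahead is an assertion: it consumes nothing and succeeds at the end.
$lookahead = preg_match('/b(?![aeiou])/', 'b');     // 1

var_dump($atEnd, $consumes, $m[0], $lookahead);
```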

When case-insensitive matching is set, any letters in a character class represent both their uppercase and lowercase versions. For example, a case-insensitive [aeiou] matches both "A" and "a", and a case-insensitive [^aeiou] does not match "A", whereas a case-sensitive [^aeiou] would.
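A small sketch of the case-insensitive behaviour, assuming the i pattern modifier is used to enable caseless matching:

```php
<?php
// Caseless: the class covers both cases.
$upper = preg_match('/[aeiou]/i', 'A');      // 1
$lower = preg_match('/[aeiou]/i', 'a');      // 1

// Caseless negation excludes both cases, so "A" is rejected.
$ciNeg = preg_match('/[^aeiou]/i', 'A');     // 0

// Case-sensitive negation only excludes lowercase vowels, so "A" is accepted.
$csNeg = preg_match('/[^aeiou]/', 'A');      // 1

var_dump($upper, $lower, $ciNeg, $csNeg);
```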

Newline characters have no special meaning in a character class, regardless of the PCRE_DOTALL and PCRE_MULTILINE options. A character class such as [^a] always matches a newline character.
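The sketch below checks this with the s (PCRE_DOTALL) and m (PCRE_MULTILINE) pattern modifiers. The dot is included only as a contrast, assuming the default newline convention of a typical PHP build.

```php
<?php
$nl = "\n";

// The negated class matches a newline with or without the modifiers.
$plain     = preg_match('/[^a]/', $nl);     // 1
$dotall    = preg_match('/[^a]/s', $nl);    // 1
$multiline = preg_match('/[^a]/m', $nl);    // 1

// By contrast, the dot depends on the s modifier.
$dotDefault = preg_match('/./', $nl);       // 0
$dotWithS   = preg_match('/./s', $nl);      // 1

var_dump($plain, $dotall, $multiline, $dotDefault, $dotWithS);
```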

In a character class, a hyphen (minus sign, -) can be used to specify a range from one character to another. For example, [d-m] matches any character between d and m, including both endpoints. If the hyphen itself is needed as a member of the class, it must be escaped with a backslash or placed where it cannot be interpreted as part of a range, typically at the beginning or end of the class.
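A brief sketch of ranges and literal hyphens; the test characters are arbitrary choices for illustration.

```php
<?php
// The range is closed: both endpoints belong to the class.
$start   = preg_match('/^[d-m]$/', 'd');    // 1
$end     = preg_match('/^[d-m]$/', 'm');    // 1
$outside = preg_match('/^[d-m]$/', 'n');    // 0

// A hyphen at the beginning or end of the class is a literal member.
$hyphenFirst = preg_match('/^[-dm]$/', '-');    // 1
$hyphenLast  = preg_match('/^[dm-]$/', '-');    // 1

// An escaped hyphen is literal too, so no range d..m is formed.
$hyphenEscaped = preg_match('/^[d\-m]$/', '-'); // 1
$noRange       = preg_match('/^[d\-m]$/', 'e'); // 0

var_dump($start, $end, $outside, $hyphenFirst, $hyphenLast, $hyphenEscaped, $noRange);
```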

A literal right square bracket cannot be used as the end character of a range. For example, the pattern [W-]46] is interpreted as a character class containing W and -, followed by the literal string "46]", so it matches "W46]" or "-46]". However, if the bracket is escaped with a backslash, it is interpreted as the end of the range, so [W-\]46] is a single character class containing all characters in the range W to ] plus the characters 4 and 6. The octal or hexadecimal representation of the right square bracket can also be used to end a range.
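The two readings of this pattern can be checked as follows. This is a sketch: the subject strings and the \x5D hexadecimal form of the right square bracket are illustrative choices.

```php
<?php
// Unescaped: class [W-] followed by the literal text "46]".
$w    = preg_match('/^[W-]46]$/', 'W46]');   // 1
$dash = preg_match('/^[W-]46]$/', '-46]');   // 1

// Escaped: one class holding the range W..] plus "4" and "6".
$inRange = preg_match('/^[W-\]46]$/', 'Z');  // 1 (Z lies between W and ] in ASCII)
$digit   = preg_match('/^[W-\]46]$/', '4');  // 1
$hyphen  = preg_match('/^[W-\]46]$/', '-');  // 0 (the hyphen now forms a range)

// The hexadecimal code of "]" ends the range the same way.
$hex = preg_match('/^[W-\x5D46]$/', 'Z');    // 1

var_dump($w, $dash, $inRange, $digit, $hyphen, $hex);
```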

Ranges operate in ASCII collating order. They can also be used with characters specified numerically, such as [
