Complete list and behavior description of regular expression metacharacters
Character
Description
\
Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a newline character. The characters . * + ? | ( ) { } ^ $ have special meaning in a pattern and must be preceded by "\" to be matched literally. For example, the sequence "\\" matches "\", and "\(" matches "(".
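The escaping rule above can be tried out with a short sketch. It assumes Python's `re` module as the engine, which is a choice made for illustration only; the sample strings are made up.

```python
import re

# Unescaped, "+" is a quantifier, so this pattern means "one or more 1s, then 1=2".
unescaped_hit = re.search(r"1+1=2", "11=2")

# Escaped with a backslash, "+" must appear literally in the text.
escaped_miss = re.search(r"1\+1=2", "11=2")
escaped_hit = re.search(r"1\+1=2", "1+1=2")

# "\\" in a pattern matches one backslash; "\(" matches an opening parenthesis.
backslash_hit = re.search(r"\\", "C:\\temp")
paren_hit = re.search(r"\(", "f(x)")

# re.escape() adds the needed backslashes to an arbitrary literal string.
literal = "cost: $5.00 (approx.)"
escaped_pattern = re.escape(literal)
```

Raw strings (`r"..."`) are used so that Python itself does not interpret the backslashes before the regex engine sees them.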
^
Matches the beginning of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after "\n" or "\r".
$
Matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before "\n" or "\r".
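As a rough illustration of how the two anchors change in multiline mode, here is a sketch using Python's `re` module (an assumption; the table describes a RegExp object with a Multiline property). In Python the corresponding switch is the `re.MULTILINE` flag, and it treats only "\n" as a line break.

```python
import re

text = "first line\nsecond line"

# Default mode: ^ and $ anchor to the whole string only.
starts_default = re.findall(r"^\w+", text)
ends_default = re.findall(r"\w+$", text)

# Multiline mode: ^ also matches after each "\n", $ also matches before it.
starts_multi = re.findall(r"^\w+", text, re.MULTILINE)
ends_multi = re.findall(r"\w+$", text, re.MULTILINE)
```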
*
Matches the preceding character or subexpression zero or more times. For example, "zo*" matches "z" and "zoo". * is equivalent to {0,}.
+
Matches the preceding character or subexpression one or more times. For example, "zo+" matches "zo" and "zoo" but not "z". + is equivalent to {1,}.
?
Matches the preceding character or subexpression zero or one time. For example, "do(es)?" matches "do" or "does". ? is equivalent to {0,1}.
{n}
n is a non-negative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob" but does match both "o"s in "food".
{n,}
n is a non-negative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob" but matches all the o's in "foooood". "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*".
{n,m}
m and n are non-negative integers, where n <= m. Matches at least n times and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood". "o{0,1}" is equivalent to "o?". Note: You cannot insert spaces between the comma and the numbers.
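The quantifier examples from the entries above can be checked with a small sketch, assuming Python's `re` module as the engine (chosen here for illustration only).

```python
import re

# * allows zero occurrences, + requires at least one.
star_z = re.fullmatch(r"zo*", "z")
star_zoo = re.fullmatch(r"zo*", "zoo")
plus_z = re.fullmatch(r"zo+", "z")

# ? makes the preceding subexpression optional.
optional_short = re.fullmatch(r"do(es)?", "do")
optional_long = re.fullmatch(r"do(es)?", "does")

# {n}, {n,} and {n,m} give explicit repetition counts.
exact_bob = re.search(r"o{2}", "Bob")
exact_food = re.search(r"o{2}", "food")
at_least = re.search(r"o{2,}", "foooood")
bounded = re.search(r"o{1,3}", "fooooood")
```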
?
When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is "non-greedy". A non-greedy pattern matches as little of the searched string as possible, while the default greedy pattern matches as much as possible. For example, in the string "oooo", "o+?" matches only a single "o", while "o+" matches all the o's.
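A short sketch of the greedy versus non-greedy difference, again assuming Python's `re` module; the tag-like sample string is made up for illustration.

```python
import re

# Greedy: o+ takes every "o" it can. Non-greedy: o+? stops after the first.
greedy = re.search(r"o+", "oooo").group()
lazy = re.search(r"o+?", "oooo").group()

# The difference matters most when a delimiter can appear more than once.
tags_greedy = re.findall(r"<.+>", "<a><b>")
tags_lazy = re.findall(r"<.+?>", "<a><b>")
```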
.
Matches any single character except "\n". To match any character including "\n", use a pattern such as "[\s\S]".
(pattern)
Matches pattern and captures the matching subexpression. Captured matches can be retrieved from the resulting "matches" collection using the $0…$9 properties. To match the parenthesis characters ( ), use "\(" or "\)".
(?:pattern)
Matches pattern but does not capture the matching subexpression, i.e. it is a non-capturing match that is not stored for later use. This is useful when combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'.
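To make the difference between capturing and non-capturing groups concrete, here is a sketch that assumes Python's `re` module, where captures are read with `group(n)` and `groups()` rather than $0…$9. The date-like sample string is made up.

```python
import re

# Capturing groups: group(0) is the whole match, group(1), group(2) the captures.
m = re.search(r"(\d{4})-(\d{2})", "released 2024-05")
whole, year, month = m.group(0), m.group(1), m.group(2)

# A capturing group stores the alternative that matched...
capturing = re.search(r"industr(y|ies)", "heavy industries")

# ...while (?:...) groups the alternatives without storing anything.
non_capturing = re.search(r"industr(?:y|ies)", "heavy industries")
```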
(?=pattern)
A subexpression that performs a positive lookahead: it matches the search string at any position where a string matching pattern begins. It is a non-capturing match, i.e. the match is not captured for later use. For example, 'Windows (?=95|98|NT|2000)' matches 'Windows' in 'Windows 2000', but not 'Windows' in 'Windows 3.1'. Lookaheads do not consume characters: after a match occurs, the search for the next match starts immediately after the previous match, not after the characters that made up the lookahead.
(?!pattern)
A subexpression that performs a negative lookahead: it matches the search string at any position where a string matching pattern does not begin. It is a non-capturing match, i.e. the match is not captured for later use. For example, 'Windows (?!95|98|NT|2000)' matches 'Windows' in 'Windows 3.1', but not 'Windows' in 'Windows 2000'. Lookaheads do not consume characters: after a match occurs, the search for the next match starts immediately after the previous match, not after the characters that made up the lookahead.
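The 'Windows' examples for the two lookahead forms can be reproduced with the following sketch, which assumes Python's `re` module. It also shows that the lookahead text is not part of the match.

```python
import re

positive = r"Windows (?=95|98|NT|2000)"
negative = r"Windows (?!95|98|NT|2000)"

# Positive lookahead: succeeds only if one of the versions follows.
pos_2000 = re.search(positive, "Windows 2000")
pos_31 = re.search(positive, "Windows 3.1")

# Negative lookahead: succeeds only if none of the versions follows.
neg_2000 = re.search(negative, "Windows 2000")
neg_31 = re.search(negative, "Windows 3.1")

# The match ends before "2000": the lookahead consumed no characters.
matched_text = pos_2000.group()
match_end = pos_2000.end()
```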
x|y
Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz]
Character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain".
[^xyz]
Negated character set. Matches any character that is not enclosed. For example, "[^abc]" matches the "p" in "plain".
[a-z]
Character range. Matches any character within the specified range. For example, "[a-z]" matches any lowercase letter in the range "a" through "z".
[^a-z]
Negated character range. Matches any character not within the specified range. For example, "[^a-z]" matches any character that is not in the range "a" through "z".
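Alternation and the four character-class forms above can be exercised with a brief sketch, assuming Python's `re` module; the short sample strings besides "plain" are made up.

```python
import re

# Alternation: without a group, | splits the whole pattern.
bare_food = re.fullmatch(r"z|food", "food")
bare_zood = re.fullmatch(r"z|food", "zood")
grouped_zood = re.fullmatch(r"(z|f)ood", "zood")

# Character set and negated set: first matching character in "plain".
in_set = re.search(r"[abc]", "plain").group()
not_in_set = re.search(r"[^abc]", "plain").group()

# Range and negated range.
lower = re.findall(r"[a-z]", "Hi 5!")
not_lower = re.findall(r"[^a-z]", "Hi 5!")
```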
\b
Matches a word boundary, that is, the position between a word and a space or punctuation mark, including the start and end of the string (ASCII characters other than letters and digits can generally be treated as punctuation). For example, "er\b" matches the "er" in "never" but not the "er" in "verb".
\B
Matches a non-word boundary. "er\B" matches the "er" in "verb" but not the "er" in "never".
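A sketch of the word-boundary examples, assuming Python's `re` module. Raw strings matter here, because in an ordinary Python string literal "\b" means a backspace character.

```python
import re

# \b: "er" must be followed by a word boundary (here, the end of the string).
boundary_never = re.search(r"er\b", "never")
boundary_verb = re.search(r"er\b", "verb")

# \B: "er" must NOT be followed by a word boundary.
inside_verb = re.search(r"er\B", "verb")
inside_never = re.search(r"er\B", "never")

# A common use: match a whole word only.
whole_word = re.findall(r"\bcat\b", "cat concat cat.")
```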
\cx
Matches the control character indicated by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in the range A-Z or a-z. If it is not, c is assumed to be the literal "c" character itself.
\d
Matches a digit character. Equivalent to [0-9].
\D
Matches a non-digit character. Equivalent to [^0-9].
\f
Matches a form feed character. Equivalent to \x0c and \cL.
\n
Matches a newline character. Equivalent to \x0a and \cJ.
\r
Matches a carriage return character. Equivalent to \x0d and \cM.
\s
Matches any whitespace character, including spaces, tabs, form feeds, etc. Equivalent to [ \f\n\r\t\v].
\S
Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t
Matches a tab character. Equivalent to \x09 and \cI.
\v
Matches a vertical tab character. Equivalent to \x0b and \cK.
\w
Matches any word character, including the underscore. Equivalent to "[A-Za-z0-9_]".
\W
Matches any non-word character. Equivalent to "[^A-Za-z0-9_]".
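The shorthand classes above can be tried with the following sketch. It assumes Python's `re` module, where these classes are Unicode-aware by default for text patterns; the `re.ASCII` flag is passed so that they behave like the ASCII equivalences listed in the table. The sample strings are made up.

```python
import re

# \d+ : runs of digits.
numbers = re.findall(r"\d+", "order 66, aisle 7", re.ASCII)

# \D : everything that is not a digit, removed here to normalise a number.
digits_only = re.sub(r"\D", "", "+1 (555) 010-9999", flags=re.ASCII)

# \w+ : runs of letters, digits and underscores; "-" and " " are separators.
words = re.findall(r"\w+", "snake_case-id 42", re.ASCII)

# \s+ : any run of whitespace (space, tab, newline, ...).
fields = re.split(r"\s+", "a \t b\nc", flags=re.ASCII)

# \S+ : runs of non-whitespace characters.
tokens = re.findall(r"\S+", "key=1\tflag", re.ASCII)
```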
\xn
Matches n, where n is a hexadecimal escape code. The hexadecimal escape code must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". Hexadecimal escapes allow ASCII codes to be used in regular expressions.
\num
Matches num, where num is a positive integer. This is a backreference to a captured match. For example, "(.)\1" matches two consecutive identical characters.
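Hexadecimal escapes and backreferences can be illustrated with a small sketch, assuming Python's `re` module; the word "bookkeeper" is just a convenient made-up sample.

```python
import re

# \x41 is the two-digit hexadecimal code for "A".
hex_a = re.fullmatch(r"\x41", "A")

# Only two hex digits are consumed, so \x041 means "\x04" followed by "1".
hex_then_digit = re.fullmatch(r"\x041", "\x04" + "1")

# (.)\1 : any character followed by a copy of itself.
first_pair = re.search(r"(.)\1", "bookkeeper").group()
doubled_letters = re.findall(r"(.)\1", "bookkeeper")
```

With a single capturing group, `findall` returns the captured character of each pair rather than the pair itself.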
\n
Identifies an octal escape code or a backreference (here n stands for a digit). If \n is preceded by at least n capturing subexpressions, then n is a backreference. Otherwise, if n is an octal digit (0-7), then n is an octal escape code.
\nm
Identifies an octal escape code or a backreference. If \nm is preceded by at least nm capturing subexpressions, then nm is a backreference. If \nm is preceded by at least n captures, then n is a backreference followed by the character m. If neither of the previous conditions holds, then \nm matches the octal value nm, where n and m are octal digits (0-7).
\nml
When n is an octal digit (0-3) and m and l are octal digits (0-7), matches the octal escape code nml.
\un
Matches n, where n is a Unicode character represented as a four-digit hexadecimal number. For example, \u00A9 matches the copyright symbol (©).
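A minimal sketch of the octal and Unicode escapes, assuming Python's `re` module. Python resolves the backreference-versus-octal ambiguity with its own rules, which may not match the table in every case, so only unambiguous forms are shown.

```python
import re

# Three octal digits: \101 is octal 101 = decimal 65 = "A".
octal_a = re.fullmatch(r"\101", "A")

# \u followed by four hexadecimal digits names a Unicode character.
copyright_hit = re.search(r"\u00A9", "© 2024 Example Corp")

# The same escape can be used inside a character set.
symbols = re.findall(r"[\u00A9\u00AE]", "© and ®")
```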