
Explanation of all symbols in regular expressions

巴扎黑
2017-04-20 18:02:15

Regular expressions use many special symbols. This article summarizes all of the commonly used ones.

All symbol explanations

Character Description

\ Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character "n", while '\n' matches a newline character. The sequence '\\' matches "\" and '\(' matches "(".

^ Matches the beginning of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
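As a rough illustration of the escaping rules in this entry, the sketch below checks each example in JavaScript. JavaScript is used only because the table refers to the JScript RegExp object; the variable names are illustrative and not part of the original article.

```javascript
// 'n' is an ordinary literal; '\n' is the newline escape.
const literalN = /n/.test("n");
const newlineEscape = /\n/.test("line1\nline2");

// '\\' matches a single backslash; '\(' matches a literal parenthesis.
const backslash = /\\/.test("C:\\temp");
const openParen = /\(/.test("f(x)");

// All four checks are expected to succeed.
console.log(literalN, newlineEscape, backslash, openParen);
```

An unescaped "(" would open a group rather than match the character, which is why the escape is needed.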

$ Matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
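The effect of the Multiline property on ^ and $ can be sketched as follows. This assumes JavaScript, where the property is set with the m flag; the sample string and variable names are made up for the example.

```javascript
const sample = "first line\nsecond line";

// Without the m flag, ^ anchors only at the start of the whole input.
const firstWordOnly = sample.match(/^\w+/g);

// With the m flag, ^ also matches right after each line break.
const lineStarts = sample.match(/^\w+/gm);

// With the m flag, $ also matches right before each line break.
const lineEnds = sample.match(/\w+$/gm);

console.log(firstWordOnly); // [ 'first' ]
console.log(lineStarts);    // [ 'first', 'second' ]
console.log(lineEnds);      // [ 'line', 'line' ]
```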

* Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.

+ Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo" but not "z". + is equivalent to {1,}.

? Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or "does". ? is equivalent to {0,1}.

{n} n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but it matches the two o's in "food".

{n,} n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but it matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.

{n,m} m and n are both non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'.

? When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching mode is non-greedy. The non-greedy mode matches as little of the searched string as possible, while the default greedy mode matches as much of the searched string as possible. For example, for the string "oooo", 'o+?' matches a single 'o', while 'o+' matches all the o's.
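The quantifier entries can be checked with a short JavaScript sketch. The inputs reuse the examples from the table where possible; the variable names and the 'o{1,3}' input are chosen here for illustration.

```javascript
// *, + and ? applied to the preceding subexpression.
const starMatchesZ = /^zo*$/.test("z");      // zero o's is allowed
const plusRejectsZ = /^zo+$/.test("z");      // at least one o is required
const optionalEs = ["do", "does"].every((s) => /^do(es)?$/.test(s));

// Counted repetition.
const exactlyTwoInBob = /o{2}/.test("Bob");        // only one o, so no match
const twoInFood = "food".match(/o{2}/)[0];         // "oo"
const atLeastTwo = "foooood".match(/o{2,}/)[0];    // all five o's
const oneToThree = "fooooood".match(/o{1,3}/)[0];  // the first three o's

// Greedy versus non-greedy matching on the same input.
const greedy = "oooo".match(/o+/)[0];   // "oooo"
const lazy = "oooo".match(/o+?/)[0];    // "o"

console.log(greedy, lazy, atLeastTwo, oneToThree);
```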

. Matches any single character except "\n". To match any character including '\n', use a pattern such as '[\s\S]'. A dot inside a character set is a literal dot, so '[.\n]' matches only "." or "\n".
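A small JavaScript sketch of how the dot treats newlines. Note that inside a character set the dot is a literal, so a set such as [\s\S] is the usual way to match any character; the s flag shown last assumes an ES2018-capable engine. The names are illustrative.

```javascript
const twoLines = "a\nb";

// The dot does not match the newline between "a" and "b".
const dotAcrossNewline = /a.b/.test(twoLines);

// [\s\S] matches whitespace or non-whitespace, i.e. any character.
const anyCharClass = /a[\s\S]b/.test(twoLines);

// Inside a set, "." is literal: [.\n] does not match an ordinary letter.
const dotInSetMatchesX = /^[.\n]$/.test("x");

// Assumption: the engine supports the s (dotAll) flag from ES2018.
const dotAllFlag = /a.b/s.test(twoLines);

console.log(dotAcrossNewline, anyCharClass, dotInSetMatchesX, dotAllFlag);
```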

(pattern) Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0…$9 properties in JScript. To match parenthesis characters, use '\(' or '\)'.

(?:pattern) Matches pattern but does not capture the match; that is, it is a non-capturing match and is not stored for later use. This is useful when combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a shorter expression than 'industry|industries'.

(?=pattern) Positive lookahead. Matches the search string at any point where a string matching pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000" but not "Windows" in "Windows 3.1". A lookahead does not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.

(?!pattern) Negative lookahead. Matches the search string at any point where a string not matching pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1" but not "Windows" in "Windows 2000". A lookahead does not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.

x|y Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
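The difference between the two kinds of group shows up in the match result. In this JavaScript sketch (names are illustrative), the plain group stores its submatch at index 1, while the (?:...) group stores nothing.

```javascript
// Plain parentheses store the submatch.
const withCapture = "industries".match(/industr(y|ies)/);

// (?:...) groups the alternatives without storing anything.
const withoutCapture = "industries".match(/industr(?:y|ies)/);

console.log(withCapture[0], withCapture[1]); // industries ies
console.log(withoutCapture.length);          // 1 (only the full match)
```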
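The two 'Windows' examples can be tried directly. The following JavaScript sketch uses the patterns from the table; the variable names and the replace() call are added here to show that the looked-ahead text is not part of the match.

```javascript
const followedByVersion = /Windows (?=95|98|NT|2000)/;
const notFollowedByVersion = /Windows (?!95|98|NT|2000)/;

const pos2000 = followedByVersion.test("Windows 2000");    // true
const pos31 = followedByVersion.test("Windows 3.1");       // false
const neg31 = notFollowedByVersion.test("Windows 3.1");    // true
const neg2000 = notFollowedByVersion.test("Windows 2000"); // false

// The version number is checked but not consumed, so only
// "Windows " (with its trailing space) is replaced.
const replaced = "Windows 2000".replace(followedByVersion, "Win");

console.log(pos2000, pos31, neg31, neg2000, replaced);
```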

[xyz] Character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".

[^xyz] Negated character set. Matches any character that is not enclosed. For example, '[^abc]' matches the 'p' in "plain".

[a-z] Character range. Matches any character within the specified range. For example, '[a-z]' matches any lowercase alphabetic character in the range 'a' to 'z'.

[^a-z] Negated character range. Matches any character not within the specified range. For example, '[^a-z]' matches any character that is not in the range 'a' to 'z'.
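As a quick check of the alternation, character set, and range entries, here is a JavaScript sketch. The inputs follow the table's examples where they exist; "Plain" and "plain text" are inputs chosen here for illustration.

```javascript
// Alternation: the group limits what | applies to.
const zoodOrFood = ["zood", "food"].every((s) => /^(z|f)ood$/.test(s));

// Sets and negated sets return the first character that qualifies.
const firstInSet = "plain".match(/[abc]/)[0];      // "a"
const firstNotInSet = "plain".match(/[^abc]/)[0];  // "p"

// Ranges are case-sensitive: "P" is outside a-z, so "l" is found first.
const firstLower = "Plain".match(/[a-z]/)[0];           // "l"
const firstNonLower = "plain text".match(/[^a-z]/)[0];  // the space

console.log(zoodOrFood, firstInSet, firstNotInSet, firstLower);
```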

\b Matches a word boundary, that is, the position between a word character and a non-word character (such as a space) or the start or end of the string. For example, 'er\b' matches the 'er' in "never" but not the 'er' in "verb".

\B Matches non-word boundaries. 'er\B' matches the 'er' in 'verb', but not the 'er' in 'never'.
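The "never" and "verb" examples can be verified with a few lines of JavaScript (variable names are illustrative):

```javascript
// "never" ends in "er", so a boundary follows it.
const boundaryNever = /er\b/.test("never");   // true
// In "verb" the "er" is followed by "b", so there is no boundary.
const boundaryVerb = /er\b/.test("verb");     // false

// \B is the exact opposite.
const nonBoundaryVerb = /er\B/.test("verb");   // true
const nonBoundaryNever = /er\B/.test("never"); // false

console.log(boundaryNever, boundaryVerb, nonBoundaryVerb, nonBoundaryNever);
```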

\cx Matches the control character specified by x. For example, \cM matches a Control-M or carriage return character. The value of x must be one of A-Z or a-z. Otherwise, c is treated as a literal 'c' character.

\d Matches a numeric character. Equivalent to [0-9].

\D Matches a non-numeric character. Equivalent to [^0-9].

\f Matches a form feed character. Equivalent to \x0c and \cL.

\n Matches a newline character. Equivalent to \x0a and \cJ.

\r Matches a carriage return character. Equivalent to \x0d and \cM.

\s Matches any whitespace character, including spaces, tabs, form feeds, etc. Equivalent to [ \f\n\r\t\v].

\S Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].

\t Matches a tab character. Equivalent to \x09 and \cI.

\v Matches a vertical tab character. Equivalent to \x0b and \cK.

\w Matches any word character including an underscore. Equivalent to '[A-Za-z0-9_]'.

\W Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.

\xn Matches n, where n is a hexadecimal escape value. The hexadecimal escape value must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
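A short JavaScript sketch of the shorthand classes \d, \D, \s and \w. The sample strings are invented for the example. One caveat: in modern JavaScript engines \s also covers some Unicode space characters beyond the ASCII list given in the table.

```javascript
const label = "Order 66, aisle_7";

// \d picks out runs of digits.
const digitRuns = label.match(/\d+/g);

// \w includes letters, digits and the underscore.
const wordRuns = label.match(/\w+/g);

// \D removes everything that is not a digit.
const digitsOnly = "2017-04-20".replace(/\D/g, "");

// \s splits on any run of spaces, tabs or line breaks.
const fields = "a\tb\nc d".split(/\s+/);

console.log(digitRuns);  // [ '66', '7' ]
console.log(wordRuns);   // [ 'Order', '66', 'aisle_7' ]
console.log(digitsOnly); // 20170420
console.log(fields);     // [ 'a', 'b', 'c', 'd' ]
```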

\num Matches num, where num is a positive integer. This is a backreference to a captured match. For example, '(.)\1' matches two consecutive identical characters.

\n Identifies an octal escape value or a backreference, where n is a digit (not to be confused with the newline escape above). If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.

\nm Identifies an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, n is a backreference followed by the literal m. If neither of the previous conditions is true, \nm matches the octal escape value nm, provided n and m are both octal digits (0-7).

\nml If n is an octal digit (0-3), and m and l are both octal digits (0-7), then the octal escape value nml is matched.
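The backreference and octal-escape rules can be sketched in JavaScript as follows. The repeated-word pattern and the variable names are additions for illustration. The octal fallback shown last is legacy behavior: it assumes a regular expression without the u flag, where a numeric escape that refers to no existing group is read as an octal value.

```javascript
// \1 refers back to whatever the first group captured.
const doubledLetter = "hello".match(/(.)\1/)[0];   // "ll"

// A common use: detecting an accidentally repeated word.
const hasRepeatedWord = /\b(\w+) \1\b/.test("it is is wrong");

// No capture group exists here, so \101 is read as octal 101 (decimal 65, "A").
// Assumption: legacy (non-u) mode.
const octalFallback = /\101/.test("A");

console.log(doubledLetter, hasRepeatedWord, octalFallback);
```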

\un Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
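To round off the table, here is a JavaScript sketch of the numeric and control escapes (\cx, \xn and \un). The variable names are illustrative; the test strings are built from character codes so that the checks do not depend on the file's encoding.

```javascript
// \x41 is the two-digit hexadecimal code for "A".
const hexA = /\x41/.test("A");

// Only two hex digits are read, so \x041 means \x04 followed by "1".
const hexThenDigit = /^\x041$/.test(String.fromCharCode(0x04) + "1");

// \u00A9 is the copyright symbol.
const copyrightSign = /\u00A9/.test(String.fromCharCode(0xa9));

// \cM is Control-M, the carriage return (code 13).
const controlM = /\cM/.test(String.fromCharCode(13));

console.log(hexA, hexThenDigit, copyrightSign, controlM);
```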
