Regular Expression Syntax_PHP Tutorial

WBOY
2016-07-21 16:08:54


Regular expression syntax
A regular expression is a text pattern made up of ordinary characters (for example, the letters a through z) and special characters, known as metacharacters. The pattern describes one or more strings to match when searching a body of text. The regular expression serves as a template for matching a character pattern against the string being searched.
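
As a minimal sketch of the difference between ordinary characters and metacharacters, the following JavaScript compares a purely literal pattern with one that uses a bracket expression. The variable names and sample words are illustrative assumptions, not part of the original article.

```javascript
// Ordinary characters match themselves; metacharacters describe a pattern.
const literal = /cat/;      // ordinary characters only
const pattern = /c[aou]t/;  // [aou] matches exactly one of a, o, or u

// "concatenate" contains the literal substring "cat".
const literalHit = literal.test("concatenate");

// Keep only the sample words that fit the pattern.
const patternHits = ["cat", "cot", "cut", "cet"].filter((w) => pattern.test(w));

console.log(literalHit);   // true
console.log(patternHits);  // [ 'cat', 'cot', 'cut' ]
```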

Here are some examples of regular expressions you may encounter:

JScript              VBScript             Matches
/^[ \t]*$/           "^[ \t]*$"           A blank line.
/\d{2}-\d{5}/        "\d{2}-\d{5}"        An ID number made up of 2 digits, a hyphen, and 5 digits.
/<(.*)>.*<\/\1>/     "<(.*)>.*<\/\1>"     An HTML tag.
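
The three example patterns can be tried out with a short JavaScript sketch. The backslashes (\t, \d) are written out explicitly here, and the HTML pattern is written with a closing-tag backreference (<\/\1>); that closing part is an assumption about the intended full pattern. The variable names and test strings are illustrative only.

```javascript
// The three example patterns, with their backslashes written out.
const blankLine = /^[ \t]*$/;      // only spaces and tabs between start and end
const idNumber = /\d{2}-\d{5}/;    // 2 digits, a hyphen, 5 digits
const htmlTag = /<(.*)>.*<\/\1>/;  // opening tag, content, matching closing tag

const results = {
  blank: blankLine.test("  \t "),                // true
  notBlank: blankLine.test("  x "),              // false
  id: idNumber.test("12-34567"),                 // true
  badId: idNumber.test("1-34567"),               // false
  tagName: htmlTag.exec("<b>bold</b>")[1],       // "b"
};

console.log(results);
```

Note that the ID pattern is not anchored with ^ and $, so it also succeeds when the digits appear inside a longer string.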

The following table is a complete list of metacharacters and their behavior in the context of regular expressions:

Character Description
\ Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, 'n' matches the character "n", while '\n' matches a newline character. The sequence '\\' matches "\" and '\(' matches "(".
^ Matches the position at the beginning of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
$ Matches the position at the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
* Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+ Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
? Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" and "does". ? is equivalent to {0,1}.
{n} n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,} n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m} m and n are both non-negative integers, where n <= m. Matches at least n times and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the two numbers.
? When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is non-greedy. A non-greedy pattern matches as little of the searched string as possible, while the default greedy pattern matches as much of the searched string as possible. For example, in the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
. Matches any single character except "\n". To match any character including '\n', use a pattern such as '[\s\S]'.
(pattern) Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0...$9 properties in JScript. To match parenthesis characters, use '\(' or '\)'.
(?:pattern) Matches pattern but does not capture the match; that is, it is a non-capturing match and is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a shorter expression than 'industry|industries'.
(?=pattern) Positive lookahead. Matches the search string at any position where a string matching pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000", but not "Windows" in "Windows 3.1". Lookahead does not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern) Negative lookahead. Matches the search string at any position where a string that does not match pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1", but not "Windows" in "Windows 2000". Lookahead does not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
x|y Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz] A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz] A negated character set. Matches any character not enclosed. For example, '[^abc]' matches the 'p' in "plain".
[a-z] A character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase alphabetic character in the range 'a' to 'z'.
[^a-z] A negated character range. Matches any character not in the specified range. For example, '[^a-z]' matches any character that is not in the range 'a' to 'z'.
\b Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never" but not the 'er' in "verb".
\B Matches a non-word boundary. 'er\B' matches the 'er' in "verb", but not the 'er' in "never".
\cx Matches the control character specified by x. For example, \cM matches a Control-M or carriage return character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
\d Matches a digit character. Equivalent to [0-9].
\D Matches a non-digit character. Equivalent to [^0-9].
\f Matches a form feed character. Equivalent to \x0c and \cL.
\n Matches a newline character. Equivalent to \x0a and \cJ.
\r Matches a carriage return character. Equivalent to \x0d and \cM.
\s Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v].
\S Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t Matches a tab character. Equivalent to \x09 and \cI.
\v Matches a vertical tab character. Equivalent to \x0b and \cK.
\w Matches any word character, including underscore. Equivalent to '[A-Za-z0-9_]'.
\W Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num Matches num, where num is a positive integer. A backreference to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
\nm Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the literal m. If neither of the preceding conditions holds, and n and m are both octal digits (0-7), \nm matches the octal escape value nm.
\nml Matches the octal escape value nml when n is an octal digit (0-3) and m and l are both octal digits (0-7).
\un Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
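
Several of the behaviors listed in the table can be checked with a short JavaScript sketch. The patterns come from the examples above; the variable names and the "hello" and "a\nb" test strings are illustrative assumptions. The class [\s\S] is used for "any character including newline" because a period inside brackets is treated as a literal period.

```javascript
// Greedy vs. non-greedy quantifiers on the string "oooo".
const greedy = "oooo".match(/o+/)[0];   // "oooo": takes as much as possible
const lazy = "oooo".match(/o+?/)[0];    // "o": takes as little as possible

// Non-capturing group: groups the alternatives without storing a submatch.
const industry = /industr(?:y|ies)/.exec("industries");
// industry[0] is "industries"; industry.length is 1 (no captured groups)

// Lookahead: the looked-ahead text is tested but not consumed.
const positive = /Windows (?=95|98|NT|2000)/;
const negative = /Windows (?!95|98|NT|2000)/;
const lookahead = {
  matched: positive.exec("Windows 2000")[0],  // "Windows " (without "2000")
  pos31: positive.test("Windows 3.1"),        // false
  neg31: negative.test("Windows 3.1"),        // true
  neg2000: negative.test("Windows 2000"),     // false
};

// Backreference: \1 repeats whatever the first group captured.
const doubled = /(.)\1/.exec("hello")[0];     // "ll"

// Word boundaries.
const boundaries = [
  /er\b/.test("never"),  // true
  /er\b/.test("verb"),   // false
  /er\B/.test("verb"),   // true
  /er\B/.test("never"),  // false
];

// The period does not match a newline; [\s\S] matches any character.
const dotNewline = /a.b/.test("a\nb");       // false
const anyChar = /a[\s\S]b/.test("a\nb");     // true

console.log(greedy, lazy, industry[0], lookahead, doubled, boundaries, dotNewline, anyChar);
```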
