Summary of regular expression characters
Basic regular expression
To match a single digit, write "[0-9]" or "\d".
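As a small illustration of the two spellings, here is a sketch that assumes Python's built-in `re` module (the article does not name an engine, so this choice and the sample strings are assumptions). It also shows one Python-specific caveat: on text patterns "\d" accepts non-ASCII decimal digits unless the `re.ASCII` flag is set.

```python
import re

sample = "room 42"  # hypothetical sample text

# On plain ASCII input both spellings find the same digits.
with_set = re.findall(r"[0-9]", sample)
with_shorthand = re.findall(r"\d", sample)
print(with_set, with_shorthand)

# Python-specific behaviour: \d also matches digits from other scripts,
# while [0-9] (or \d with re.ASCII) does not.
arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE
unicode_digit = bool(re.fullmatch(r"\d", arabic_three))
ascii_only = bool(re.fullmatch(r"\d", arabic_three, flags=re.ASCII))
print(unicode_digit, ascii_only)
```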
To match a single non-digit character, use the uppercase "\D".
To match any one of the 26 letters, in either case, use "[a-zA-Z]".
To match any single character, use the period ".".
To match specific characters, just write them directly. For example, "abcd" matches itself. Special characters must be escaped, and the escape character is the backslash "\".
Square brackets define a "character set": the expression matches one character from the set, such as a hexadecimal digit with "[0-9a-fA-F]". Inside a character set the dot represents the dot itself, but some special characters, such as the backslash, still need to be escaped.
To repeat a rule, use a quantifier. Curly braces give the number of repetitions. For example, 8 digits can be written as "\d{8}".
The count in the curly braces can also be a range. For example, 7 to 8 digits is written as "\d{7,8}". The upper limit may be omitted: "{0,}" is legal and means zero or more repetitions. Omitting the lower limit, as in "{,10}", is not portable: some engines do not treat it as a quantifier, while others (Python's re module, for example) read it as "{0,10}". Write "{0,10}" to be safe.
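The curly-brace quantifiers can be exercised as follows. This is a sketch assuming Python's `re` module; `re.fullmatch` is used so that the whole input must satisfy the pattern, and the digit strings are invented for the example.

```python
import re

# Exactly 8 digits.
eight = re.fullmatch(r"\d{8}", "20240131")

# 7 to 8 digits: 7 and 8 pass, 9 fails.
seven_or_eight = [
    bool(re.fullmatch(r"\d{7,8}", s))
    for s in ("1234567", "12345678", "123456789")
]

# {0,} means "zero or more", so even the empty string matches.
open_ended = re.fullmatch(r"\d{0,}", "")

# The portable way to say "at most 10".
up_to_ten = re.fullmatch(r"\d{0,10}", "12345")

print(eight is not None, seven_or_eight, open_ended is not None, up_to_ten is not None)
```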
The plus sign "+" means the element to its left occurs "one or more" times, which is equivalent to "{1,}". The plus sign is therefore also a special character.
The asterisk "*" means the element to its left occurs "zero or more" times, that is, "{0,}".
The question mark "?" means "zero or one", which is equivalent to "{0,1}".
Quantifiers such as "+" and "*" are "greedy" when matching: they match as many characters as possible. For example, if you use "5+" to match the string "55555", it matches the longest string it can find, which is "55555".
If you add a question mark after the quantifier, matching becomes "lazy": it matches as few characters as possible. For example, "5+?" applied to "55555" matches only the shortest possible text, "5".
The following lazy quantifiers are available: "+?", "*?", "{n,}?", "{m,n}?".
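The difference between greedy and lazy matching is easiest to see on real input. This sketch assumes Python's `re` module; the HTML-like string is a made-up example and not a recommendation to parse markup with regular expressions.

```python
import re

# The article's own example: greedy takes everything, lazy takes the minimum.
greedy = re.search(r"5+", "55555").group()
lazy = re.search(r"5+?", "55555").group()

# A case where the choice changes the result noticeably.
html = "<b>one</b><b>two</b>"
greedy_tag = re.findall(r"<b>.*</b>", html)   # runs to the last </b>
lazy_tag = re.findall(r"<b>.*?</b>", html)    # stops at the first </b>

print(greedy, lazy, greedy_tag, lazy_tag)
```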
You can "capture" part of an expression and refer back to it later. Use parentheses to define a capture group, then use "\1" after the group to refer to it; for the second group, use "\2", and so on. A backreference matches the same text that the group matched, not the group's pattern again.
Groups are captured by default, but in a long expression you may want to state explicitly that a group should not be saved. For example, in "(?:THE|The|the)", the "?:" marker indicates that the group is not captured and gets no number.
Use "|" to join two alternatives, providing "OR" logic.
If the character "^" is the first character in a set "[...]", it means "not". For example, "[^0-9]" is equivalent to "\D".
The following is a list of commonly used single-character matches:
Reference type | Pattern | Remarks |
---|---|---|
Digit | \d | Equivalent to "[0-9]" |
Word character | \w | Equivalent to "[_a-zA-Z0-9]" |
Non-digit | \D | Equivalent to "[^0-9]" |
Non-word character | \W | Equivalent to "[^_a-zA-Z0-9]" |
Tab character | \t | |
Null character | \0 | |
Backspace | [\b] | |
Space | \s | Covers "[ \t\n\r]"; most engines also include the form feed |
Return | \r | |
Line break | \n | |
Space between words | \b | Word boundary |
Any character | . | Does not match the line terminator |
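
Several rows of the table can be verified with a quick sketch. It assumes Python's `re` module, and the sample strings are invented; other engines may differ in details such as which characters count as whitespace.

```python
import re

sample = "id_7\tok\r\n"  # hypothetical input containing a tab, CR and LF

# \w: letters, digits and the underscore.
word_chars = re.findall(r"\w", sample)

# \s: whitespace, including tab, carriage return and line feed.
whitespace = re.findall(r"\s", sample)

# \b: a position between a word character and a non-word character.
boundary = re.findall(r"\bcat\b", "cat concat cat.")

# [\b]: inside a set, \b is the backspace character instead.
backspace = re.findall(r"[\b]", "a\bc")

# . stops at the line terminator, so each line is matched separately.
dot_stops = re.findall(r".+", "one\ntwo")

print(word_chars, whitespace, boundary, backspace, dot_stops)
```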