A concise summary of regular expressions in JavaScript

WBOY
2016-05-16 16:53:09

1. How to define regular expressions

There are two ways to define regular expressions: constructor definition and regular expression literal definition. For example:

var reg1 = new RegExp('\\d{5,11}'); // defined through the constructor
var reg2 = /\d{5,12}/;              // defined through a regular expression literal
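
The two forms differ in how backslashes are written. The following is a minimal sketch (the variable names are illustrative, not from the original article) showing that a pattern passed to the constructor as a string needs each backslash doubled:

```javascript
// In a string passed to the RegExp constructor, each backslash must be doubled.
var fromConstructor = new RegExp('\\d{5,11}');
var fromLiteral = /\d{5,11}/;

var sameSource = fromConstructor.source === fromLiteral.source;
console.log(sameSource);                    // true
console.log(fromLiteral.test('1234567'));   // true
console.log(fromLiteral.test('1234'));      // false

// With a single backslash the string escape "\d" collapses to "d",
// so the resulting pattern is d{5,11} and matches the letter d instead.
var mistaken = new RegExp('\d{5,11}');
console.log(mistaken.test('1234567'));      // false
console.log(mistaken.test('ddddd'));        // true
```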

Regular expression literal characters
\0: NUL character (\u0000)
\t: Tab character (\u0009)
\n: Line feed character (\u000A)
\v: Vertical tab character (\u000B)
\f: Form feed character (\u000C)
\r: Carriage return character (\u000D)
\xnn: Latin character specified by the hexadecimal number nn. For example, \x0A is equivalent to \n
\uxxxx: Unicode character specified by the hexadecimal number xxxx. For example, \u0009 is equivalent to \t
Regular expression anchor characters
^: Matches the beginning of a string. In multi-line mode, it also matches the beginning of a line
$: Matches the end of a string. In multi-line mode, it also matches the end of a line
\b: Matches a word boundary, that is, the position between a \w character and a \W character, or between a \w character and the beginning or end of the string ([\b] matches the backspace character)
\B: Matches a position that is not a word boundary
(?=p): Zero-width positive lookahead assertion; requires the following characters to match p, but does not include them in the match
(?!p): Zero-width negative lookahead assertion; requires that the following characters do not match p
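
As a rough illustration of the escapes and anchors listed above, the sketch below uses made-up sample strings; the variable names are assumptions for the example only:

```javascript
// \x0A and \u000A both denote the line feed character \n.
var hexMatches = /\x0A/.test('\n');
var unicodeMatches = /\u000A/.test('\n');
console.log(hexMatches, unicodeMatches);             // true true

// ^ and $ anchor the pattern to the start and end of the string.
console.log(/^Java$/.test('Java'));                  // true
console.log(/^Java$/.test('JavaScript'));            // false

// \b matches a word boundary, so "Java" inside "JavaScript" is rejected.
console.log(/\bJava\b/.test('I like Java'));         // true
console.log(/\bJava\b/.test('I like JavaScript'));   // false

// (?=p) requires p to follow, but p is not part of the match.
var positive = 'JavaScript: The Guide'.match(/JavaScript(?=:)/);
console.log(positive[0]);                            // "JavaScript"

// (?!p) requires that p does NOT follow.
var negative = /Java(?!Script)/;
console.log(negative.test('JavaBeans'));             // true
console.log(negative.test('JavaScript'));            // false
```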
Regular expression character classes
[...]: Any character within the square brackets
[^...]: Any character not within the square brackets
.: Any character except newlines and other Unicode line terminators
\w: Any ASCII word character, equivalent to [a-zA-Z0-9_]
\W: Any character that is not an ASCII word character, equivalent to [^a-zA-Z0-9_]
\s: Any Unicode whitespace character
\S: Any character that is not Unicode whitespace; note that \w and \S are different
\d: Any ASCII digit, equivalent to [0-9]
\D: Any character except ASCII digits, equivalent to [^0-9]
[\b]: Backspace literal (special case)
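
The character classes can be compared side by side on one string. This is only a sketch; the sample string and variable names are invented for the example:

```javascript
var sample = 'id_7 at\t09:15';

var digits = sample.match(/\d/g).join('');
var wordChars = sample.match(/\w/g).join('');
var spaces = sample.match(/\s/g).length;

console.log(digits);     // "70915"
console.log(wordChars);  // "id_7at0915" (the underscore counts as a word character)
console.log(spaces);     // 2 (one space and one tab)

// [^0-9] behaves like \D: it finds nothing in a string made only of digits.
console.log(/[^0-9]/.test('2016'));  // false

// \w and \S are different: "-" is not whitespace, but it is not a word character either.
var dashIsNonSpace = /\S/.test('-');
var dashIsWord = /\w/.test('-');
console.log(dashIsNonSpace, dashIsWord);  // true false
```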
Regular expression repetition syntax
{n,m}: Matches the previous item at least n times, but not more than m times
{n,}: Matches the previous item n times or more
{n}: Matches the previous item exactly n times
?: Matches the previous item 0 or 1 times, which means the previous item is optional; equivalent to {0,1}
+: Matches the previous item 1 or more times; equivalent to {1,}
*: Matches the previous item 0 or more times; equivalent to {0,}
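
A short sketch of each repetition form follows. The inputs are invented for illustration, and the last case shows why the braces should be written without spaces:

```javascript
console.log(/^\d{2,4}$/.test('123'));     // true  (between 2 and 4 digits)
console.log(/^\d{2,4}$/.test('12345'));   // false (5 digits is too many)
console.log(/^\d{3,}$/.test('12345'));    // true  (3 or more digits)
console.log(/^\d{3}$/.test('12'));        // false (exactly 3 required)

// ? makes the previous item optional.
var optionalU = /^colou?r$/;
console.log(optionalU.test('color'), optionalU.test('colour'));  // true true

// + needs at least one digit, * accepts none.
var plusMatches = 'a1 b22 c'.match(/[a-z]\d+/g);
var starMatches = 'a1 b22 c'.match(/[a-z]\d*/g);
console.log(plusMatches);  // ["a1", "b22"]
console.log(starMatches);  // ["a1", "b22", "c"]

// A space inside the braces stops them from being read as a quantifier.
var spaced = /^\d{2, 4}$/;
console.log(spaced.test('123'));  // false
```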
Regular expression selection, grouping, and reference characters
|: Selection; matches either the subexpression on the left of the symbol or the subexpression on the right
(...): Grouping; combines several items into one unit that can be modified by symbols such as "*", "+", "?", and "|", and remembers the string matching the group for later use
(?:...): Grouping only; combines items into one unit but does not remember the characters that match the group
\n: Matches the same characters that group number n matched. A group is a subexpression in parentheses (possibly nested); the group number is found by counting left parentheses from left to right, and groups of the form "(?:" are not numbered
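
The following sketch puts selection, both kinds of grouping, and a back-reference together. The patterns and names are illustrative assumptions, not taken from the original article:

```javascript
// | chooses between alternatives; (?:...) groups them without remembering the match.
console.log(/^(?:cat|dog)s?$/.test('dogs'));  // true

// \1 must match the same quote character that group 1 matched.
var quoted = /(['"])[^'"]*\1/;
console.log(quoted.test('"hello"'));    // true
console.log(quoted.test('"hello\''));   // false (opening and closing quotes differ)

// (?:...) is not numbered, so the two digit groups are groups 1 and 2.
var groups = 'ab12'.match(/(?:[a-z]+)(\d)(\d)/);
console.log(groups[0], groups[1], groups[2]);  // "ab12" "1" "2"
console.log(groups.length);                    // 3
```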
Regular expression modifiers
i: Performs case-insensitive matching
g: Performs a global match; that is, finds all matches instead of stopping after the first one
m: Multi-line matching mode; ^ matches the beginning of a line as well as the beginning of the string, and $ matches the end of a line as well as the end of the string
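
The effect of each modifier can be seen by running nearly identical patterns over one multi-line string. This is a sketch with an invented input:

```javascript
var lines = 'Java\njava\nJAVA';

var caseSensitive = lines.match(/java/g);
var caseInsensitive = lines.match(/java/gi);
console.log(caseSensitive);    // ["java"]
console.log(caseInsensitive);  // ["Java", "java", "JAVA"]

// Without m, ^ and $ only match at the very start and end of the whole string.
var wholeString = lines.match(/^java$/gi);
// With m, ^ and $ also match at the start and end of every line.
var perLine = lines.match(/^java$/gim);
console.log(wholeString);  // null
console.log(perLine);      // ["Java", "java", "JAVA"]
```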
String methods for pattern matching
search(): Its parameter is a regular expression, and it returns the starting position of the first matching substring. If there is no matching substring, it returns -1. If the parameter of search() is not a regular expression, it is first converted to one through the RegExp constructor. search() does not support global retrieval because it ignores the modifier g. For example:


var s = "JavaScript".search(/script/i); // s = 4
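
A few more cases, sketched with invented inputs, cover the -1 result, the ignored g modifier, and the conversion of a string argument:

```javascript
var found = 'JavaScript'.search(/script/i);
var missing = 'JavaScript'.search(/script/);
console.log(found);    // 4
console.log(missing);  // -1 (case-sensitive, so "script" is not found)

// The g modifier is ignored: repeated calls always report the first match.
var digit = /\d/g;
var firstCall = 'a1b2c3'.search(digit);
var secondCall = 'a1b2c3'.search(digit);
console.log(firstCall, secondCall);  // 1 1

// A string argument is converted to a regular expression, so "." means "any character".
var dotAsPattern = 'abc'.search('.');
console.log(dotAsPattern);  // 0
```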

replace(): Performs search and replace. It takes two parameters: the first is a regular expression, and the second is the replacement string. If the modifier g is set in the regular expression, every match is replaced; otherwise only the first matching substring is replaced. If the first argument is a string rather than a regular expression, replace() searches for that string directly instead of converting it to a regular expression. For example:

var s = "JavaScript".replace(/java/gi, "Script"); // s = "ScriptScript"
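
The difference between the three cases described above can be sketched as follows (inputs invented for illustration):

```javascript
// Without g only the first match is replaced; with g every match is replaced.
var firstOnly = 'a-b-c'.replace(/-/, '+');
var everyMatch = 'a-b-c'.replace(/-/g, '+');
console.log(firstOnly);   // "a+b-c"
console.log(everyMatch);  // "a+b+c"

// A string first argument is searched for literally, so "." is just a dot here...
var literalDot = 'a.b.c'.replace('.', '+');
// ...while the regular expression /./ matches any character.
var patternDot = 'a.b.c'.replace(/./, '+');
console.log(literalDot);  // "a+b.c"
console.log(patternDot);  // "+.b.c"
```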

match(): Its parameter is a regular expression; if it is not, it is converted through the RegExp constructor. It returns an array composed of the matching results. If modifier g is set, a global match is performed. For example:

var d = '55 ff 33 hh 77 tt'.match(/\d+/g); // d = ["55", "33", "77"]
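
The shape of the returned array depends on whether g is set. The sketch below uses an invented input string to contrast the two cases:

```javascript
// Without g: element 0 is the full match, followed by the parenthesized groups.
var single = 'tel: 010-1234'.match(/(\d+)-(\d+)/);
console.log(single[0], single[1], single[2]);  // "010-1234" "010" "1234"
console.log(single.index);                     // 5 (where the match starts)

// With g: the array holds every full match and no group details.
var every = '55 ff 33 hh 77 tt'.match(/\d+/g);
console.log(every);                            // ["55", "33", "77"]

// When nothing matches, match() returns null rather than an empty array.
var none = 'no digits'.match(/\d+/g);
console.log(none);                             // null
```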

split(): This method is used to split the string that calls it into an array of substrings. The delimiter used is the parameter of split(), and its parameter can also be a regular expression. For example:

var d = '123,31,453,645'.split(','); // d = ["123", "31", "453", "645"]
var d = '21, 123, 44, 64, 67, 3'.split(/\s*,\s*/); // d = ["21", "123", "44", "64", "67", "3"]

2. RegExp object
Each RegExp object has 5 properties. The source property is a read-only string containing the text of the regular expression. The global property is a read-only Boolean value that indicates whether the regular expression has the modifier g. The ignoreCase property is a read-only Boolean value that indicates whether the regular expression has the modifier i. The multiline property is a read-only Boolean value that indicates whether the regular expression has the modifier m. The lastIndex property is a readable and writable integer. If the pattern has the g modifier, this property stores the position in the string at which the next search will start.
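
As a quick sketch, the five properties can be read directly from a RegExp object. The pattern and input here are invented, and exec() (described next) is used only to move lastIndex:

```javascript
var re = /java/gi;

console.log(re.source);      // "java"
console.log(re.global);      // true
console.log(re.ignoreCase);  // true
console.log(re.multiline);   // false
console.log(re.lastIndex);   // 0

// After a successful match with g, lastIndex points just past the match.
re.exec('Java and java');
var afterFirstMatch = re.lastIndex;
console.log(afterFirstMatch);  // 4

// lastIndex is writable, so it can be reset by hand.
re.lastIndex = 0;
```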
The RegExp object has two methods. The parameter of exec() is a string, and its function is similar to match(). The exec() method executes a regular expression on the specified string, that is, it performs a matching search in the string. If no match is found, it returns null. If a match is found, it returns an array: the first element contains the string matching the regular expression, and the remaining elements are the substrings matched by the subexpressions in parentheses. exec() returns the same array regardless of whether the regular expression has modifier g. When the regular expression object calling exec() has modifier g, exec() sets the lastIndex property of that object to the character position immediately after the matched substring. When exec() is called a second time with the same regular expression, it starts searching from the position indicated by the lastIndex property. If exec() does not find a match, it resets lastIndex to 0. For example:

var p = /Java/g;
var text = "JavaScript is more fun than Java!";
var r;
// With modifier g, each exec() call resumes from p.lastIndex.
while ((r = p.exec(text)) != null) {
    console.log(r, 'lastIndex: ' + p.lastIndex);
}
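
A variation with parenthesized subexpressions, sketched with an invented pattern and input, shows the group elements in the returned array and the reset of lastIndex once no further match is found:

```javascript
var pattern = /(\d+)-(\d+)/g;
var input = '10-20 and 30-40';
var result;
var seen = [];

while ((result = pattern.exec(input)) !== null) {
    // result[0] is the full match; result[1] and result[2] are the groups.
    seen.push(result[0] + ' -> ' + result[1] + ',' + result[2] + ' @' + pattern.lastIndex);
}

console.log(seen);               // ["10-20 -> 10,20 @5", "30-40 -> 30,40 @15"]
// The failed final call reset lastIndex, so the pattern can be reused.
console.log(pattern.lastIndex);  // 0
```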

Another method is test(). Its parameter is a string. test() checks the string: if it contains a match for the regular expression, it returns true; otherwise it returns false. For example:

var p = /java/i;
p.test('javascript'); // true