JavaScript single line mode_regular expression

By 微波 (original), 2017-06-28 13:39:13

This article introduces single-line mode in JavaScript regular expressions. Readers who need it can use it as a reference.

Regular expressions were first implemented by Ken Thompson in his improved version of the QED editor in 1970. Even then, the simplest metacharacter in a pattern, ".", matched any character except a line break:

"." is a regular expression which matches any character except <nl>.

The sentence above comes from the official QED documentation of 1970, which may be the first regular expression documentation in history.

Why this rule? Because QED edits files line by line, and the newline character at the end of a line counts as part of that line's content. For example, to delete all single-line comments in a piece of code, you could use the following command in QED:

1,$s#//.*##

If "." could match the newline character, the newline would be deleted as well, merging each of these lines with the line after it, which is usually not what we want. That is why "." was designed from the start not to match newlines. There is no QED command on current operating systems for us to test with, but we still have VIM, and "." in VIM cannot match the newline character for the same reason.
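The same effect can be sketched in JavaScript. This is only a rough analogue of the QED command, not a reproduction of it; the `source` string and the `stripLineComments` helper are hypothetical names chosen for illustration, and the second replacement uses the ES2018 s flag to show what would happen if "." did match newlines.

```javascript
// Hypothetical sample input (not from the original article).
const source = "let a = 1; // first\nlet b = 2; // second\n";

// Without the s flag, "." stops at each line terminator,
// so only the comment text is removed and every "\n" survives.
function stripLineComments(text) {
  return text.replace(/\/\/.*/g, "");
}

const stripped = stripLineComments(source);

// With the s flag, "." also consumes newlines, so the first match
// runs from the first "//" to the end of the whole string.
const merged = source.replace(/\/\/.*/gs, "");

console.log(JSON.stringify(stripped)); // "let a = 1; \nlet b = 2; \n"
console.log(JSON.stringify(merged));   // "let a = 1; "
```

In the first result both lines are still present; in the second, everything after the first comment marker is gone, including the following line.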

Unlike Node, where reading a file usually means reading the whole file in one go, Perl inherits the tradition of many Linux commands of reading files line by line, like this:

while (<>) { print $_ }

There is also a newline character at the end of $_, so Perl naturally inherits QED's rule that "." does not match newline characters. But Perl is a programming language, not an editor. Its regular expressions have to match not only single lines of text but also multi-line text, so "." sometimes needs to match across lines. Perl therefore invented single-line mode, /s, which allows "." to match newline characters as well.

The official description of the /s modifier, which turns on single-line mode in Perl, is "Treat the string as single line". This "single line" should be understood as follows: in normal mode, "." can only match characters within a line and cannot cross lines; in single-line mode, Perl pretends that a multi-line string is one line and treats the newline characters as ordinary in-line characters, so "." can match them. Put more concretely, the following three lines of text

1
2
3

are treated as the single line of text "1\n2\n3\n". That is what single-line mode means.
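A minimal JavaScript sketch of the same idea, assuming an engine that supports the ES2018 s flag; the variable names are hypothetical and only serve the illustration.

```javascript
// The three lines from the text above, as one string.
const threeLines = "1\n2\n3\n";

// Normal mode: "." cannot cross a line break, so the greedy match
// stops at the end of the first line.
const normalMatch = threeLines.match(/.+/)[0];

// Single-line (dotAll) mode: newlines count as ordinary characters,
// so the same pattern consumes the whole string.
const singleLineMatch = threeLines.match(/.+/s)[0];

console.log(JSON.stringify(normalMatch));     // "1"
console.log(JSON.stringify(singleLineMatch)); // "1\n2\n3\n"
```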

Unfortunately, for the same reason (string variables can contain multiple lines of text), Perl also invented the /m modifier, multi-line mode, whose official description is "Treat the string as multiple lines". This mode has been part of JavaScript regular expressions from the very beginning. "Multiple lines" here means the following: by default, the ^ and $ metacharacters do not match the positions just after and just before the newline characters in the middle of a string, that is, the string is always treated as a single line; once multi-line mode is turned on, they do match there.

In other words, single-line mode and multi-line mode apply to different metacharacters. People who are new to regular expressions tend to be confused by "single-line mode" and "multi-line mode", which look like opposites of each other but are in fact unrelated concepts with confusingly similar names.
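The independence of the two modes can be checked directly in JavaScript. The sketch below uses a hypothetical two-line string and assumes support for the ES2018 s flag; each property records what one flag does to one metacharacter.

```javascript
// Hypothetical two-line input.
const text = "a\nb";

const results = {
  // ^ is controlled by the m flag only.
  caretDefault: /^b/.test(text),    // false: ^ matches only at the start of the string
  caretMultiline: /^b/m.test(text), // true: ^ also matches right after "\n"
  caretWithS: /^b/s.test(text),     // false: the s flag does not affect ^

  // "." is controlled by the s flag only.
  dotDefault: /a.b/.test(text),     // false: "." does not match "\n"
  dotWithM: /a.b/m.test(text),      // false: the m flag does not affect "."
  dotWithS: /a.b/s.test(text),      // true: "." matches "\n"
};

console.log(results);
```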

Later, the author of Ruby may have felt that the term "single-line mode" was poorly chosen, so he called the mode in which "." matches newlines "multi-line mode", meaning that patterns such as .* can match across multiple lines, which makes perfect sense on its own. The modifier is also /m (Ruby behaves as if Perl's "multi-line mode" were always on, so /m was still free). This only added insult to injury and made things even more chaotic.

Later, the author of Python may also have felt that the term "single-line mode" should be avoided, so he gave the mode a new name, "dotall", meaning that the dot can match all characters. It is a good name, and Java later adopted it too.

The above reviews the history, explains where single-line mode came from, and shows that its name was poorly chosen. V8 has recently implemented a stage 3 ES proposal, https://github.com/mathiasbynens/es-regexp-dotall-flag. This proposal introduces the /s modifier and the dotAll property for JavaScript regular expressions. The dotAll property name is borrowed from Python and Java, and the /s modifier is inherited from Perl; inventing a new modifier such as /d here would only make things more complicated. The specific effect of /s in JavaScript is to let "." match the four line terminators that it could not match before: \n (line feed), \r (carriage return), \u2028 (line separator), and \u2029 (paragraph separator):

/foo/s.dotAll // true
/^.{4}$/s.test("\n\r\u2028\u2029") // true

This is actually very simple, but readers who have not used regular expressions outside JavaScript may be confused when they first meet this new mode. To be clear: multi-line mode controls the behavior of ^ and $, and single-line mode controls the behavior of ".". The two are not directly related.

Meanwhile Perl, the language that originally introduced the confusing concepts of single-line mode and multi-line mode, has removed both modes entirely in Perl 6: "." matches newline characters by default, and \N matches any character except a newline; ^ and $ always match the beginning and end of the string, and two new metacharacters, ^^ and $$, are introduced to match the beginning and end of a line.

The substitutes for single-line mode that we used in the past, [^] and [\s\S], are not completely obsolete. For example, editors that use JavaScript regular expressions (VS Code, Atom) may well not provide an interface for enabling single-line mode. Speaking of regular expression support in editors, editors implemented in JavaScript are still too weak in this respect: for instance, a mode cannot be turned on from within the pattern itself. In Sublime, by contrast (which uses Python regular expressions), you can write (?s) inside the pattern to enable dotall mode; for example, (?s)/\*.+?\*/ matches all multi-line comments.
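As a rough illustration, the two older substitutes and the s flag can be compared on the multi-line comment example. The `code` string is a hypothetical input, and the s-flag variant assumes an ES2018-capable engine; [^] is a JavaScript-specific idiom and is not portable to most other regex flavors.

```javascript
// Hypothetical input containing one multi-line and one single-line block comment.
const code = "/* one\n   two */\nlet x = 1; /* three */";

// [\s\S] matches any character, including line terminators.
const viaClass = code.match(/\/\*[\s\S]+?\*\//g);

// [^] is an empty negated class: "any character not in the empty set".
const viaEmptyNegation = code.match(/\/\*[^]+?\*\//g);

// With the s flag, a plain "." does the same job.
const viaDotAll = code.match(/\/\*.+?\*\//gs);

console.log(viaClass);         // [ '/* one\n   two */', '/* three */' ]
console.log(viaEmptyNegation); // same two matches
console.log(viaDotAll);        // same two matches
```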
