Here is my regular expression:
\<body\>([\s\S].*?)\<\/body\>
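str is the string I want to search. If I strip the line breaks from the string first, the regex matches; without the following line of code, it matches nothing.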
str = str.replace(/\n/g, "");
Can anyone explain why? How do I fix this?
---------- Update -----------
Later I changed it to
\<body\>([\s\S]*?)\<\/body\>
and it worked.
What is the difference between .*? and *? here?
PHP中文网2017-04-17 15:36:52
As I understand it, you want to capture everything inside the body tag.
You used the regular expression below:
/\<body\>([\s\S].*?)\<\/body\>/
It fails because the pattern is not written the way you intended.
Let's break down the key part of the expression:
([\s\S].*?)
[\s\S] matches one character that is either whitespace or non-whitespace. In other words, it matches any character at all, including newlines, spaces and tabs, but only a single one.
What does .*? mean?
. matches any single character except a newline.
.* matches 0 or more such characters (never a newline) and is greedy: it takes as many characters as it can.
The ? here modifies the *.
Together, .*? is a lazy match: it takes as few characters as possible. It starts from 0 characters and grows one character at a time, only as far as needed for the rest of the pattern to match. Because . cannot match a newline, .*? can never grow past a line break.
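A minimal sketch to check how greedy and lazy quantifiers behave, and that . skips newlines while [\s\S] does not. The sample string and variable names are made up for illustration; the point is that a lazy quantifier starts from zero characters and grows only as far as the rest of the pattern requires.

```javascript
// Illustrative input, not taken from the original question.
const sample = "<b>one</b><b>two</b>";

// Greedy: .* takes as much as it can, then backtracks to the last </b>.
const greedy = sample.match(/<b>(.*)<\/b>/)[1];

// Lazy: .*? starts empty and grows only until </b> can match.
const lazy = sample.match(/<b>(.*?)<\/b>/)[1];

// Without the "s" flag, "." never matches a line feed; [\s\S] does.
const dotMatchesNewline = /./.test("\n");
const classMatchesNewline = /[\s\S]/.test("\n");

console.log(greedy); // one</b><b>two
console.log(lazy);   // one
console.log(dotMatchesNewline, classMatchesNewline); // false true
```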
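The entire expression
<body>([\s\S].*?)<\/body> // note: < and > do not need to be escaped
therefore matches one arbitrary character after <body>, followed by characters that contain no newline, up to </body>. If the content has a line break anywhere after its first character, .*? stops at that line break, </body> cannot be reached, and the match fails. That is why the regex only works after you strip the newlines from the string.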
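The behaviour of the original pattern can be reproduced with a few test strings. This is only a sketch: the HTML snippets are invented, and the pattern is the one from the question written without the unnecessary escapes.

```javascript
// The pattern from the question.
const pattern = /<body>([\s\S].*?)<\/body>/;

// Invented sample inputs.
const oneLine = "<body><p>hi</p></body>";
const leadingNewlineOnly = "<body>\n<p>hi</p></body>";
const multiLine = "<body>\n<p>hi</p>\n</body>";

console.log(pattern.test(oneLine));            // true
// [\s\S] consumes the single leading newline, so this still matches.
console.log(pattern.test(leadingNewlineOnly)); // true
// The second newline has to be matched by ".", which is impossible.
console.log(pattern.test(multiLine));          // false
// Stripping the newlines, as in the question, makes it match again.
console.log(pattern.test(multiLine.replace(/\n/g, ""))); // true
```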
Why does it work once you remove the . ? Because after removing it, the lazy quantifier *? applies to [\s\S] instead, so the group matches 0 or more characters of any kind, newlines included.
I think you understood [\s\S] as something that only matches newlines, and added . so that everything else would be matched too. Following that reasoning, the pattern should have been written like this:
<body>([\s\S.]*?)<\/body>
This also matches, but the . here is redundant: inside a character class . is just a literal dot, and [\s\S] already matches every character, including everything . matches.
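To see that the extra . inside the brackets changes nothing, the two patterns can be compared on the same input. The sample text is made up; this is a quick check, not a proof.

```javascript
// Invented multi-line sample.
const text = "<body>\nline 1\nline 2\n</body>";

const withDot = text.match(/<body>([\s\S.]*?)<\/body>/)[1];
const withoutDot = text.match(/<body>([\s\S]*?)<\/body>/)[1];
console.log(withDot === withoutDot); // true

// Inside [...] the dot is a literal ".", not a wildcard.
const dotClassMatchesLetter = /^[.]$/.test("a");
const dotClassMatchesDot = /^[.]$/.test(".");
console.log(dotClassMatchesLetter, dotClassMatchesDot); // false true
```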
So the final answer is
<body>([\s\S]*?)<\/body>
which matches 0 or more characters of any kind between <body> and </body>, as few as possible, so the content is captured correctly even when it spans several lines.
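A short usage sketch of the final pattern. The helper name extractBody and the sample page are hypothetical, chosen only for illustration; the pattern assumes a plain lowercase <body> tag without attributes, exactly as in the question.

```javascript
// Hypothetical helper: returns the text between <body> and </body>, or null.
function extractBody(html) {
  const match = /<body>([\s\S]*?)<\/body>/.exec(html);
  return match ? match[1] : null;
}

// Invented sample page with line breaks inside the body.
const page = "<html>\n<body>\n<p>hi</p>\n</body>\n</html>";

console.log(JSON.stringify(extractBody(page))); // "\n<p>hi</p>\n"
console.log(extractBody("<p>no body tag</p>")); // null
```

With this pattern there is no need to call str.replace(/\n/g, "") first, so the line breaks in the captured content are preserved.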
That’s it.
PS: The layout is a bit messy, because escape characters are difficult to use in the SegmentFault editor