
PHP regular expression efficiency: greedy, non-greedy, and backtracking analysis (recommended)

高洛峰
2017-01-09 10:16:42

Let's first look at what greedy and non-greedy mean in regular expressions, or, put another way, what a match-first quantifier and an ignore-first quantifier are.

The definitions are hard to grasp in the abstract, so let's start with an example.

A student wanted to strip <script> elements, together with the content between the tags, from a string. This is the pattern and code he wrote:

$str = preg_replace('%<script>.+?</script>%i', '', $str); // non-greedy

At first glance nothing looks wrong, but the code does not do the job. Suppose

$str = '<script<script>alert(document.cookie)</script>>alert(document.cookie)</script>';

Then after the code above runs, the result is:

$str = '<script<script>alert(document.cookie)</script>>alert(document.cookie)</script>';
$str = preg_replace('%<script>.+?</script>%i', '', $str); // non-greedy
print_r($str);
// output: <script>alert(document.cookie)</script>
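To see why a script element survives, it can help to inspect exactly which span the pattern matched. The following is a minimal sketch using the same input and pattern as above; the variable names $pattern, $m, $matched, $offset, and $remaining are introduced here for illustration only.

```php
<?php
// Same input and pattern as in the article's example.
$str = '<script<script>alert(document.cookie)</script>>alert(document.cookie)</script>';
$pattern = '%<script>.+?</script>%i';

// With PREG_OFFSET_CAPTURE, $m[0] is an array of [matched text, byte offset].
preg_match($pattern, $str, $m, PREG_OFFSET_CAPTURE);
list($matched, $offset) = $m[0];

echo "offset:  $offset\n";
echo "matched: $matched\n";

// Rebuild what remains once the matched span is cut out of the subject.
$remaining = substr($str, 0, $offset) . substr($str, $offset + strlen($matched));
echo "left:    $remaining\n";
```

The match starts at offset 7, right after the leading <script fragment, so the fragment in front of it and the tail behind it are glued together once the match is removed.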

He still did not get the result he wanted: the match begins at the first complete <script>, which sits inside the broken outer tag, and ends at the first </script>, so removing it lets the leftover pieces join into a new <script> element. The pattern above is non-greedy, which some people call lazy. A quantifier becomes non-greedy when a ? is added after it, as in +?, *?, and ?? (a more unusual case that I will cover in a future blog post). Without the ?, the quantifier is greedy. For example:

$str = '<script<script>alert(document.cookie)</script>>alert(document.cookie)</script>';
$str = preg_replace('%<script>.+</script>%i', '', $str); // greedy
print_r($str);
// output: <script
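The difference between the greedy and non-greedy quantifier forms described above is easier to see on a short string. This is only an illustrative sketch: the sample subjects 'aXbXb' and 'ab' and the variable names are made up for the demonstration and do not come from the original example.

```php
<?php
// Hypothetical sample string, chosen only to make the difference visible.
$subject = 'aXbXb';

preg_match('/a.+b/', $subject, $greedy);  // greedy: takes as much as it can
preg_match('/a.+?b/', $subject, $lazy);   // non-greedy: takes as little as it can

echo $greedy[0], "\n"; // aXbXb
echo $lazy[0], "\n";   // aXb

// ?? is the non-greedy form of the optional quantifier ?.
preg_match('/ab?/', 'ab', $optGreedy);
preg_match('/ab??/', 'ab', $optLazy);

echo $optGreedy[0], "\n"; // ab
echo $optLazy[0], "\n";   // a
```

In both pairs the patterns differ only by the trailing ?, yet the greedy form extends to the last possible position while the non-greedy form stops at the first one that lets the overall match succeed.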