
Regular expression matching html and filtering illegal characters_PHP tutorial

WBOY
2016-07-20 11:01:46

Regular expression matching html to filter illegal characters
To match a table tag in html, the expression is either:

<table[\s\S]*>[\s\S]*<\/table>

or

<table[\s\S]*?>[\s\S]*?<\/table>

The two expressions differ only in that one adds "?" after each "*" and the other does not, so what is the difference?
"?" is a quantifier in regular expressions: on its own it matches the preceding subexpression zero or one time, and placed directly after another quantifier (such as "*") it makes that quantifier non-greedy.
Testing shows the difference when the expressions are matched against the following content:

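<table>This is the first table</table>

I am not the content in the table

<table>This is the second table</table>

I am not the content in the table either

<table>This is the third table</table>

Without "?", the quantifiers are greedy, so the expression produces a single match that runs from the first <table> to the last </table> and swallows the text between the tables. With "?", each table is matched separately, giving three matches and leaving the text between them out.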
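The difference between the greedy and non-greedy forms can be checked with a short script. This is only a minimal sketch: the `<table>` markup in `$html`, the two patterns, and the variable names are assumptions chosen to mirror the sample content, not code taken from the original article.

```php
<?php
// Sample input: three tables separated by text that is not inside any table.
$html = "<table>This is the first table</table>\n"
      . "I am not the content in the table\n"
      . "<table>This is the second table</table>\n"
      . "I am not the content in the table either\n"
      . "<table>This is the third table</table>";

// Greedy: [\s\S]* takes as much as it can, so one match spans all three tables.
preg_match_all('/<table[\s\S]*>[\s\S]*<\/table>/i', $html, $greedy);

// Non-greedy: [\s\S]*? stops at the first closing tag it can reach, one match per table.
preg_match_all('/<table[\s\S]*?>[\s\S]*?<\/table>/i', $html, $lazy);

echo count($greedy[0]), "\n"; // prints 1
echo count($lazy[0]), "\n";   // prints 3
```

The single greedy match is the whole input string, including the two lines that sit outside any table, which is why the non-greedy form is the one to use when each table must be handled on its own.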





<?php
$str=preg_replace("/\s+/", " ", $str); //Collapse runs of whitespace, including carriage returns, into one space
$str=preg_replace("/<[ ]+/si","<",$str); //Remove spaces directly after "<"

$str=preg_replace("/<!--.*?-->/si","",$str); //Filter comments
$str=preg_replace("/<(!.*?)>/si","",$str); //Filter doctype

$str=preg_replace("/<(\/?html.*?)>/si","",$str); //Filter html tags
$str=preg_replace("/<(\/?head.*?)>/si","",$str); //Filter head tags
$str=preg_replace("/<(\/?meta.*?)>/si","",$str); //Filter meta tags
$str=preg_replace("/<(\/?body.*?)>/si","",$str); //Filter body tags
$str=preg_replace("/<(\/?link.*?)>/si","",$str); //Filter link tags
$str=preg_replace("/<(\/?form.*?)>/si","",$str); //Filter form tags

$str=preg_replace("/cookie/si","cookie",$str); //Rewrite every case variant of "cookie" as lowercase


$str=preg_replace("/<(applet.*?)>(.*?)<(\/applet.*?)>/si","",$str); //Filter applet elements together with their content
$str=preg_replace("/<(\/?applet.*?)>/si","",$str); //Filter any remaining applet tags

$str=preg_replace("/<(style.*?)>(.*?)<(\/style.*?)>/si","",$str); //Filter style elements together with their content
$str=preg_replace("/<(\/?style.*?)>/si","",$str); //Filter any remaining style tags

$str=preg_replace("/<(title.*?)>(.*?)<(\/title.*?)>/si","",$str); //Filter title elements together with their content
$str=preg_replace("/<(\/?title.*?)>/si","",$str); //Filter any remaining title tags

$str=preg_replace("/<(object.*?)>(.*?)<(\/object.*?)>/si","",$str); //Filter object elements together with their content
$str=preg_replace("/<(\/?object.*?)>/si","",$str); //Filter any remaining object tags

$str=preg_replace("/<(noframes.*?)>(.*?)<(\/noframes.*?)>/si","",$str); //Filter noframes elements together with their content
$str=preg_replace("/<(\/?noframes.*?)>/si","",$str); //Filter any remaining noframes tags

$str=preg_replace("/<(i?frame.*?)>(.*?)<(\/i?frame.*?)>/si","",$str); //Filter frame and iframe elements together with their content
$str=preg_replace("/<(\/?i?frame.*?)>/si","",$str); //Filter any remaining frame and iframe tags

$str=preg_replace("/<(script.*?)>(.*?)<(\/script.*?)>/si","",$str); //Filter script elements together with their content
$str=preg_replace("/<(\/?script.*?)>/si","",$str); //Filter any remaining script tags

$str=preg_replace("/javascript/si","javascript",$str); //Rewrite every case variant of "javascript" as lowercase
$str=preg_replace("/on([a-z]+)\s*=/si","on\\1=",$str); //Rewrite event-handler attributes such as "onclick =" as "onclick="
$str=preg_replace("/&#/si","",$str); //Strip "&#" so character references cannot disguise script, such as an encoded javascript:alert('aabb')
?>
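Because most of the statements above differ only in the tag name, the tag-removal part can be driven by two lists. The following is a rough sketch under stated assumptions, not the article's own code: the function name `filter_html`, the variable names, and the split into `$blockTags` and `$bareTags` are hypothetical, and the `\b` word boundary is an addition that is not in the original patterns (it keeps `head` from also matching `header`). The cookie, javascript, and event-handler replacements are left out.

```php
<?php
/**
 * Hypothetical helper: applies the same kind of tag filtering as the
 * statements above, using one loop per group instead of one line per tag.
 */
function filter_html(string $str): string
{
    // Elements removed together with everything between their tags.
    $blockTags = ['applet', 'style', 'title', 'object', 'noframes', 'i?frame', 'script'];
    // Tags whose markup is removed while the text between them is kept.
    $bareTags = ['html', 'head', 'meta', 'body', 'link', 'form'];

    $str = preg_replace('/\s+/', ' ', $str);        // collapse whitespace
    $str = preg_replace('/<[ ]+/', '<', $str);      // "< b" becomes "<b"
    $str = preg_replace('/<!--.*?-->/s', '', $str); // comments
    $str = preg_replace('/<!.*?>/s', '', $str);     // doctype and other <!...> declarations

    // First pass: opening tag, content, and closing tag in one go.
    foreach ($blockTags as $tag) {
        $str = preg_replace("/<{$tag}\\b.*?>.*?<\\/{$tag}.*?>/si", '', $str);
    }
    // Second pass: any opening or closing tag that is still left.
    foreach (array_merge($blockTags, $bareTags) as $tag) {
        $str = preg_replace("/<\\/?{$tag}\\b.*?>/si", '', $str);
    }
    return $str;
}

$dirty = "<html><body><!-- note --><script>alert(1)</script><p>Hello</p></body></html>";
echo filter_html($dirty), "\n"; // prints <p>Hello</p>
```

Keeping the two passes in the same order as the original statements matters: the paired patterns must run first, otherwise the bare-tag pattern would strip `<script>` and `</script>` and leave the script body behind as text.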