Introduction to functions for filtering Html

高洛峰
2017-03-31 11:26:47

This article presents a C# function that strips HTML markup from a string using regular expressions. It may serve as a reference for readers who need to reduce HTML input to plain text.

// Function that filters HTML out of a string
public string checkStr(string html)
{
    // Block patterns use the lazy quantifier +? so that each match ends at the
    // nearest closing tag instead of the last one in the input.
    System.Text.RegularExpressions.Regex regex1 =
        new System.Text.RegularExpressions.Regex(@"<script[\s\S]+?</script *>",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    // [^>]*? keeps the match inside a single tag.
    System.Text.RegularExpressions.Regex regex2 =
        new System.Text.RegularExpressions.Regex(@" href *= *[^>]*?script *:",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    // Matches event-handler attributes such as onclick= or onload=.
    System.Text.RegularExpressions.Regex regex3 =
        new System.Text.RegularExpressions.Regex(@" on[a-z]+ *=",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex4 =
        new System.Text.RegularExpressions.Regex(@"<iframe[\s\S]+?</iframe *>",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex5 =
        new System.Text.RegularExpressions.Regex(@"<frameset[\s\S]+?</frameset *>",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex6 =
        new System.Text.RegularExpressions.Regex(@"\<img[^\>]+\>",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex7 =
        new System.Text.RegularExpressions.Regex(@"</p>",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex8 =
        new System.Text.RegularExpressions.Regex(@"<p>",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    System.Text.RegularExpressions.Regex regex9 =
        new System.Text.RegularExpressions.Regex(@"<[^>]*>",
            System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    html = regex1.Replace(html, ""); // remove <script>...</script> blocks
    html = regex2.Replace(html, ""); // remove href=javascript: from <a> attributes
    html = regex3.Replace(html, " _disibledevent="); // disable on... event-handler attributes
    html = regex4.Replace(html, ""); // remove <iframe> blocks
    html = regex5.Replace(html, ""); // remove <frameset> blocks
    html = regex6.Replace(html, ""); // remove <img> tags
    html = regex7.Replace(html, ""); // remove </p> closing tags
    html = regex8.Replace(html, ""); // remove <p> opening tags
    html = regex9.Replace(html, ""); // remove any remaining tags
    html = html.Replace(" ", ""); // remove space characters (probably "&nbsp;" in the original listing)
    html = html.Replace("</strong>", ""); // no-op: regex9 has already removed <strong> tags
    html = html.Replace("<strong>", "");
    return html;
}
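
The order of the replacements in checkStr matters: script blocks have to be removed before the generic tag pattern runs, otherwise the script body survives as plain text. The following is a minimal sketch of that point, not the author's code. The class name HtmlFilterDemo and the method StripHtml are hypothetical, and StripHtml is a condensed stand-in that keeps only two of the nine patterns.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlFilterDemo
{
    // Hypothetical condensed variant of checkStr: remove script blocks first,
    // then strip every remaining tag.
    public static string StripHtml(string html)
    {
        html = Regex.Replace(html, @"<script[\s\S]+?</script *>", "", RegexOptions.IgnoreCase);
        return Regex.Replace(html, @"<[^>]*>", "");
    }

    public static void Main()
    {
        string input = "<p>Hello <strong>world</strong></p><script>alert(1)</script>";

        // Stripping tags alone leaves the script body behind as text.
        Console.WriteLine(Regex.Replace(input, @"<[^>]*>", "")); // Hello worldalert(1)

        // Removing the script block first yields only the visible text.
        Console.WriteLine(StripHtml(input)); // Hello world
    }
}
```

The same reasoning applies to the iframe and frameset patterns, which also remove the content between the opening and closing tags rather than the tags alone.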
