PHP: Converting HTML to Plain Text, with Examples

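There are several ways to convert HTML to plain text in PHP. The simplest is the built-in strip_tags() function, but you can also filter the markup with a custom function. The methods below cover both.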

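Converting HTML to plain text: sometimes you need to turn HTML into plain text, and strip_tags() does this. The function removes all HTML and PHP tags from a string and leaves only the text. Its signature is: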

string strip_tags(string str[,string allowable_tags])

The optional allowable_tags parameter lists the tags that should be kept rather than removed. The following example uses strip_tags() to remove all HTML tags from a string:

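$input = "Email <a href='mailto:example@example.com'>example@example.com</a>";
echo strip_tags($input);
This returns the following result: Email example@example.com
The next example removes every tag except <a>:
$input = "This <a href='http://www.example.com/'>example</a> is <b>yanshare</b>!";
echo strip_tags($input, "<a>");
// The result is:
This <a href='http://www.example.com/'>example</a> is yanshare!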
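strip_tags() only removes tags, so entities such as &amp;amp; stay in the output. One possible way to get readable text is to decode them afterwards with html_entity_decode(). The sketch below assumes UTF-8 input, and the helper name html_to_plain is hypothetical, not something from the original article:

```php
<?php
// Hypothetical helper (not from the original article): strip tags, then decode entities.
function html_to_plain(string $html): string
{
    // Step 1: remove the tags themselves; entities are left untouched.
    $text = strip_tags($html);
    // Step 2: turn entities such as &amp; and &lt; back into characters (UTF-8 assumed).
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

echo html_to_plain("<p>Tom &amp; Jerry &lt;3</p>"), "\n"; // prints: Tom & Jerry <3
```

Decoding after stripping matters: if the order were reversed, a decoded "<" could be mistaken for the start of a tag.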
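To convert HTML <br /> line breaks into the newlines used in a textarea, the code is as follows:
function br2nl($text){
    // Replace <br>, <br/> and <br /> (case-insensitive) with a newline
    return preg_replace('/<br\s*\/?>/i', "\n", $text);
}
// Or, as an alternative definition (use one or the other, not both):
function br2nl($text){
    $text = preg_replace('/<br\s*\/?>/i', chr(13), $text); // chr(13) is a carriage return
    return preg_replace('/&nbsp;/i', ' ', $text);          // turn &nbsp; entities into spaces
}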
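As a quick check of the <br /> replacement described above, the following sketch runs the pattern against the three common spellings of the tag. The function name br_to_newline and the sample string are illustrative assumptions, chosen so the snippet does not clash with br2nl():

```php
<?php
// Hypothetical name, used here only to avoid redeclaring br2nl().
function br_to_newline(string $html): string
{
    // \s* allows optional whitespace, \/? allows an optional self-closing slash,
    // and the i flag makes the match case-insensitive.
    return preg_replace('/<br\s*\/?>/i', "\n", $html);
}

$sample = "line one<br>line two<BR/>line three<br />end";
echo br_to_newline($sample), "\n";
```

Each of the three tag variants becomes a single "\n", so the sample ends up as four lines of text.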

A preg_replace()-based version is as follows:

<?php
// $document should contain an HTML document.
// This example removes HTML tags, JavaScript code
// and whitespace. It also converts some common
// HTML entities to their text equivalents.

$search = array ("'<script[^>]*?>.*?</script>'si", // remove JavaScript
"'<[/!]*?[^<>]*?>'si", // remove HTML tags
"'([\r\n])[\s]+'", // remove whitespace
"'&(quot|#34);'i", // replace HTML entities
"'&(amp|#38);'i",
"'&(lt|#60);'i",
"'&(gt|#62);'i",
"'&(nbsp|#160);'i",
"'&(iexcl|#161);'i",
"'&(cent|#162);'i",
"'&(pound|#163);'i",
"'&(copy|#169);'i");
$replace = array ("",
"",
"\\1",
"\"",
"&",
"<",
">",
" ",
chr(161), // chr() yields single bytes, i.e. ISO-8859-1 characters
chr(162),
chr(163),
chr(169));

$text = preg_replace ($search, $replace, $document);

// The /e modifier was removed in PHP 7, so the remaining numeric
// entities are converted with a callback instead.
$text = preg_replace_callback('/&#(\d+);/', function ($m) {
    return chr((int) $m[1]); // single-byte result; suitable for codes below 256
}, $text);
?>
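The same pipeline (drop scripts, drop tags, decode entities, tidy whitespace) can also be sketched with built-in functions instead of a hand-written entity table. This is only one possible arrangement, assuming UTF-8 input; the function name document_to_text and the sample document are hypothetical:

```php
<?php
// Hypothetical helper (not from the original article), assuming UTF-8 input.
function document_to_text(string $document): string
{
    // Drop <script> and <style> blocks together with their contents.
    $text = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $document);
    // Remove the remaining tags.
    $text = strip_tags($text);
    // Decode named and numeric entities in one pass.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // &nbsp; decodes to U+00A0; turn it into an ordinary space.
    $text = str_replace("\u{00A0}", ' ', $text);
    // Collapse runs of spaces, tabs and line breaks into one space.
    $text = preg_replace('/[ \t\r\n]+/', ' ', $text);
    return trim($text);
}

$document = "<html><head><style>p { color: red; }</style></head>"
          . "<body><p>5&nbsp;&lt;&nbsp;10 &amp; &quot;ok&quot;</p>"
          . "<script>alert('hi');</script></body></html>";
echo document_to_text($document), "\n"; // prints: 5 < 10 & "ok"
```

Removing script and style blocks before calling strip_tags() is deliberate, because strip_tags() removes only the tags and would leave the JavaScript and CSS source in the text.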

 

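<?php
$mystr = <<<SATO
(dozens of lines of HTML omitted here)
SATO;
$str = strip_tags($mystr);

// At this point the HTML has already been converted to plain text; strip_tags() makes this very convenient.

// What follows is the plugin's word segmentation and similar processing, which is not covered here.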

Later I came across another method written in PHP that also converts HTML to plain text. I found it quite practical, so I am sharing it here:

function HtmlToText($str) {
    $str = preg_replace("/<sty(.*)\/style>|<scr(.*)\/script>|<!--(.*)-->/isU", "", $str); // remove CSS, JavaScript and HTML comments
    $alltext = ""; // accumulates the plain-text output
    $start = 1; // switch: 1 = outside a tag (copy characters), 0 = inside a tag (skip them)
    $length = strlen($str);
    for ($i = 0; $i < $length; $i++) { // walk through the cleaned string one character at a time
        if (($start == 0) && ($str[$i] == ">")) { // a ">" closes the tag, so resume copying
            $start = 1;
        } elseif ($start == 1) { // copying mode
            if ($str[$i] == "<") { // a "<" opens a tag: stop copying and emit <font color='red'>|</font> in its place
                $start = 0;
                $alltext .= "<font color='red'>|</font>";
            } elseif (ord($str[$i]) > 31) { // keep characters whose byte value is above 31, dropping control characters
                $alltext .= $str[$i];
            }
        }
    }
    // Clean up spaces and some special characters
    $alltext = str_replace("\u{3000}", " ", $alltext); // assumed to target the full-width space (U+3000)
    $alltext = preg_replace("/&([^;&]*)(;|&)/", "", $alltext); // drop HTML entities such as &nbsp;
    $alltext = preg_replace("/[ ]+/s", " ", $alltext); // collapse runs of spaces
    return $alltext;
}

The method above can also convert simple HTML into plain text.
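The character-scanning idea can be reduced to a small two-state loop. The following is a simplified sketch rather than the function above: it emits a plain separator instead of the red "|" marker, and the name scan_strip_tags and its $separator parameter are hypothetical:

```php
<?php
// Hypothetical simplified variant of the character scanner (not the article's function).
function scan_strip_tags(string $html, string $separator = ' '): string
{
    $out = '';
    $insideTag = false;          // true while between "<" and ">"
    $length = strlen($html);
    for ($i = 0; $i < $length; $i++) {
        $ch = $html[$i];
        if ($insideTag) {
            if ($ch === '>') {   // end of tag: resume copying
                $insideTag = false;
            }
        } elseif ($ch === '<') { // start of tag: stop copying, mark the boundary
            $insideTag = true;
            $out .= $separator;
        } elseif (ord($ch) > 31) { // skip control characters such as "\n"
            $out .= $ch;
        }
    }
    // Collapse repeated spaces left behind by adjacent tags.
    return trim(preg_replace('/ +/', ' ', $out));
}

echo scan_strip_tags("<div><h1>Title</h1>\n<p>Body text</p></div>"), "\n"; // prints: Title Body text
```

Like the original, this scanner treats every "<" as the start of a tag, so it is only suitable for simple, well-formed markup.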

