
Using regular expressions in PHP to clean up the layout of scraped content

WBOY
2016-07-21 15:04:57

A common problem when scraping content is cleaning up its layout. I spent some time writing a function that uses regular expressions to replace HTML tags and strip inline styles, and I am sharing it here.

The code is as follows:

/**
 * Format content.
 * @param string $content Content to clean; UTF-8 encoding is recommended
 * @return string
 * Note: this function requires the tidy extension to be enabled.
 */
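function removeFormat($content) {
    // NOTE: the tag names inside the tag patterns were stripped when this page
    // was extracted (they survive only as "//i"). Entries marked "assumed" are
    // a reconstruction, not the author's confirmed patterns. The attribute
    // patterns (style, class, id, ...) are as in the source, with the lost
    // backslash escapes restored.
    $replaces = array(
        "/<font.*?>/i" => '',      // assumed
        "/<\/font>/i" => '',       // assumed
        "/<strong>/i" => '',       // assumed
        "/<\/strong>/i" => '',     // assumed
        "/<span.*?>/i" => '',      // assumed
        "/<\/span>/i" => '',       // assumed
        "/<div.*?>/i" => "<p>",    // assumed tag; the replacement opens a paragraph
        "/<\/div>/i" => "</p>",    // assumed tag; the replacement closes a paragraph
        "/<!--.*?-->/i" => '',     // assumed
        // Do not enable the following when the content contains tables:
        // "/<table.*?>/i" => '',  // assumed
        // "/<\/table>/i" => '',
        // "/<tbody.*?>/i" => '',  // assumed
        // "/<\/tbody>/i" => '',   // assumed
        // "/<tr.*?>/i" => '<p>',  // assumed
        // "/<\/tr>/i" => '</p>',  // assumed
        // "/<td.*?>/i" => '',     // assumed
        "/style=.+?['|\"]/i" => '',
        "/class=.+?['|\"]/i" => '',
        "/id=.+?['|\"]/i" => '',
        "/lang=.+?['|\"]/i" => '',
        // "/width=.+?['|\"]/i" => '',  // hard to control, so commented out
        // "/height=.+?['|\"]/i" => '',
        "/border=.+?['|\"]/i" => '',
        "/face=.+?['|\"]/i" => '',
        "/<p(\s[^>]*)?>[ ]*/i" => "<p>", // assumed tag part: normalize paragraph openings
        // A pattern of the form "/<tag>.*<\/tag>/i" => '' followed here in the
        // source; its tag name was lost in extraction, so it is left out.
        "/&nbsp;/i" => ' ',              // replace &nbsp; with a plain space
        // Assumed "<p>" prefix: strip half-width spaces, full-width spaces and
        // line breaks after it. The u modifier (UTF-8 mode) is required for
        // \x{3000} and avoids encoding problems when writing to the database.
        "/<p>[ |\x{3000}|\r\n]*/ui" => '<p>',
    );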
    $config = array(
        // 'indent' => TRUE,      // whether to indent the output
        'output-html' => TRUE,    // output as HTML
        'show-body-only' => TRUE, // return only the contents of <body>
        'wrap' => 0,              // disable line wrapping
    );
    // First repair the HTML with PHP's bundled tidy library; otherwise the
    // replacements below can produce all sorts of strange results.
    $content = tidy_repair_string($content, $config, 'utf8');
    $content = trim($content);
    foreach ($replaces as $k => $v) {
        $content = preg_replace($k, $v, $content);
    }

    // Some content may be missing the <p> tag at the beginning.
    if (strpos($content, '<p>') > 6) {
        $content = '<p>' . $content;
    }

    // Repair once more to remove empty HTML tags.
    $content = tidy_repair_string($content, $config, 'utf8');
    $content = trim($content);
    return $content;
}
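
The attribute patterns in the function (style=, class=, id= and so on) match anywhere in the text and stop at the first quote character they meet, so a value such as style="font-family:'Arial'" is only partly removed. As a hedged alternative, the sketch below limits the removal to opening tags and requires a properly quoted value. The helper name stripPresentationalAttributes and the sample string are hypothetical and not part of the original function; the sketch does not need the tidy extension, and it assumes attribute values never contain a ">" character.

```php
<?php
/**
 * Hypothetical helper (an assumption, not the author's code): removes the
 * same presentational attributes, but only inside opening tags and only
 * when the value is enclosed in matching quotes.
 */
function stripPresentationalAttributes(string $html): string
{
    // One attribute: leading whitespace, a listed name, "=", a quoted value.
    $attrPattern = '/\s+(?:style|class|id|lang|border|face)\s*=\s*(?:"[^"]*"|\'[^\']*\')/i';

    // Visit each opening tag and strip the attributes inside that tag only,
    // so text such as id="7" in the page body is left alone.
    return preg_replace_callback(
        '/<[a-z][^>]*>/i',
        function (array $m) use ($attrPattern) {
            return preg_replace($attrPattern, '', $m[0]);
        },
        $html
    );
}

$sample = '<p style="font-family:\'Arial\'" class="x">id="7" stays in text</p>';
echo stripPresentationalAttributes($sample), "\n";
// prints: <p>id="7" stays in text</p>
```

Because the whole attribute, including its leading whitespace, is removed in one match, no stray spaces are left inside the tag, which makes the result less dependent on the second tidy pass.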

