还记得以前在工作中,将爬来的其它网站的数据导到xml。但是会遇到一个问题:即网页会有ascII的控制字符。一开始以为是别人为了防止采集而加入的,然后发现一个就往过滤表里加一个。直到慢慢发现,他们都是ascii表里的字符。找到原因了,就好解决了。
/**
* 根据ascii码过滤控制字符
* @param type $string
*/
public static function special_filter($string)
{
if(!$string) return '';
$new_string = '';
for($i =0; isset($string[$i]); $i++)
{
$asc_code = ord($string[$i]); //得到其asc码
//以下代码旨在过滤非法字符
if($asc_code == 9 $asc_code == 10 $asc_code == 13){
$new_string .= ' ';
}
else if($asc_code > 31 && $asc_code != 127){
$new_string .= $string[$i];
}
}
return trim($new_string);
}

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

MantisBT
Mantis is an easy-to-deploy web-based defect tracking tool designed to aid in product defect tracking. It requires PHP, MySQL and a web server. Check out our demo and hosting services.

SAP NetWeaver Server Adapter for Eclipse
Integrate Eclipse with SAP NetWeaver application server.

MinGW - Minimalist GNU for Windows
This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

PhpStorm Mac version
The latest (2018.2.1) professional PHP integrated development tool

Atom editor mac version download
The most popular open source editor