Fixing garbled characters when using DOM in PHP
I recently ran into garbled characters at work while using DOM, and eventually solved the problem after searching for information online. I am sharing the solution here for anyone who needs it.
Preface
DOM is a relatively new class in PHP for processing XML and HTML. It lets you work with the DOM tree as conveniently as in JavaScript. Most material online covers its XML handling; this article instead covers how to fix garbled characters when parsing HTML with DOM in PHP. Without further ado, here is the solution.
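To illustrate the comparison with JavaScript, here is a minimal sketch of walking a DOM tree with DOMDocument. The markup, link targets and variable names are made up for this example and use ASCII only, so no encoding issue arises yet.

```php
<?php
// Illustrative markup; the id and link targets are invented for this sketch.
$html = '<html><body><ul id="menu">'
      . '<li><a href="/a">First</a></li>'
      . '<li><a href="/b">Second</a></li>'
      . '</ul></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

// Comparable to document.getElementsByTagName('a') in JavaScript.
$links = [];
foreach ($dom->getElementsByTagName('a') as $a) {
    $links[$a->getAttribute('href')] = $a->textContent;
}

// A node list exposes its size through the length property.
$itemCount = $dom->getElementsByTagName('li')->length;
```

After this runs, $links maps each href to its link text and $itemCount holds the number of list items.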
The solution is as follows
/**
 * Fetch the HTML of a page.
 * @param string $url
 * @return string|false page HTML with a UTF-8 meta declaration, or false on failure
 */
function curl_get($url)
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    // Follow 302 redirects
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0');
    curl_setopt($curl, CURLOPT_REFERER, $url);
    $data = curl_exec($curl);
    $code = curl_getinfo($curl, CURLINFO_HTTP_CODE); // HTTP status code of the response
    curl_close($curl);

    if (200 == $code) {
        // Fix garbled characters: convert GB2312 pages to UTF-8 and rewrite their meta tag.
        // The closing quote is optional so that charset=gb2312 without quotes also matches.
        if (preg_match('#<meta[^>]*charset="?gb2312"?[^>]*>#is', $data)) {
            $data = iconv("gb2312", "utf-8//IGNORE", $data);
            $data = preg_replace('#<meta[^>]*charset="?gb2312"?[^>]*>#is', '<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">', $data);
        }
        if (!preg_match('#<meta charset="utf-8"[^>]*>#is', $data)) {
            // No HTML5-style declaration: add an http-equiv declaration to the head.
            $data = str_replace('<head>', '<head><meta http-equiv="Content-Type" content="text/html;charset=UTF-8">', $data);
        } else {
            // Replace the HTML5-style declaration with the http-equiv form.
            $data = preg_replace('#<meta charset="utf-8"[^>]*>#is', '<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">', $data);
        }
        return $data;
    } else {
        return false;
    }
}
/**
 * Build a DOMDocument object for a page.
 * @param string $url
 * @return DOMDocument|false
 */
function getDom($url)
{
    $html_content = curl_get($url);
    if (empty($html_content)) {
        // saveLog($url, 'request failed');
        return false;
    }
    $dom = new DOMDocument('1.0', 'utf-8');
    // Collect parser warnings from malformed HTML instead of printing them
    libxml_use_internal_errors(true);
    $dom->loadHTML($html_content);
    return $dom;
}
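The functions above rely on loadHTML() taking the input encoding from an http-equiv meta declaration in the markup. The following sketch shows that step in isolation on an inline string, so it needs no network access. The helper name first_paragraph_text and the sample page are assumptions made for this illustration, not part of the original code.

```php
<?php
// Hypothetical helper for this sketch: parse an HTML string and return
// the text of its first <p> element (empty string if there is none).
function first_paragraph_text(string $html): string
{
    $dom = new DOMDocument('1.0', 'utf-8');
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    $dom->loadHTML($html);
    libxml_clear_errors();

    $p = $dom->getElementsByTagName('p')->item(0);
    return $p === null ? '' : $p->textContent;
}

// The same declaration that curl_get() writes into the page head.
$meta = '<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">';

// A UTF-8 sample page (this file itself is assumed to be saved as UTF-8).
$page = '<html><head>' . $meta . '<title>demo</title></head>'
      . '<body><p>中文内容</p></body></html>';

$text = first_paragraph_text($page);
```

With the declaration in place, $text comes back as the original UTF-8 string. Without any declaration, the parser may fall back to a single-byte encoding depending on the libxml2 version, which is a typical source of the garbling described here.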
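The conversion from GB2312 to UTF-8 can also be done with mb_convert_encoding instead of iconv, for example on the fetched content in getDom():

$html_content = mb_convert_encoding($html_content, 'UTF-8', 'gb2312');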
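Calling mb_convert_encoding unconditionally would damage pages that are already UTF-8, so a cautious version might first check the declared charset. The sketch below assumes the mbstring extension is available; the helper name to_utf8 and the sample page are hypothetical and only serve this illustration.

```php
<?php
// Hypothetical helper (an assumption for this sketch, not from the article):
// convert to UTF-8 only when the page declares a GB2312 charset.
function to_utf8(string $html): string
{
    if (preg_match('#<meta[^>]*charset=["\']?gb2312#i', $html)) {
        return mb_convert_encoding($html, 'UTF-8', 'GB2312');
    }
    return $html; // no GB2312 declaration found: leave the bytes untouched
}

// Build a GB2312-encoded sample page from a UTF-8 literal
// (this file itself is assumed to be saved as UTF-8).
$utf8Page = '<html><head><meta charset="gb2312"></head>'
          . '<body><p>中文</p></body></html>';
$gbPage = mb_convert_encoding($utf8Page, 'GB2312', 'UTF-8');

// Converting the GB2312 bytes back should restore the UTF-8 text.
$converted = to_utf8($gbPage);
```

Note that the converted page still carries its old gb2312 meta tag, so the tag should then be rewritten to a UTF-8 declaration, as curl_get() does, before the content is passed to loadHTML().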
That is all for this article. I hope it is helpful for your studies.