]*charset="?gb2312"[^>]*>#', $data)" method can solve the garbled characters."/> ]*charset="?gb2312"[^>]*>#', $data)" method can solve the garbled characters.">

Home  >  Article  >  Backend Development  >  How to solve the php dom garbled problem

How to solve the php dom garbled problem

藏色散人
藏色散人Original
2020-08-05 09:41:051366browse

Solution to php dom garbled code: first define a "curl_get" method to request url page information; then use "preg_match('#]*charset="?gb2312"[^>] *>#', $data)" method can solve the garbled characters. [^>

How to solve the php dom garbled problem

Recommendation: "PHP Video Tutorial"

DOM is a relatively new xml and html processing in PHP Class, you can operate the DOM tree as conveniently as JavaScript. More information on the Internet is about how it handles XML. Today’s article will introduce PHP’s method of solving DOM garbled characters. Not much to say below, just look at the solution below. .

The solution is as follows

/**
 * 请求url页面信息
 * @param str $url
 * @return str mixed|boolean
 */
function curl_get($url) {
  $curl = curl_init();
  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
  //302跳转
  curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
  curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0');
  curl_setopt($curl, CURLOPT_REFERER, $url);
  $data = curl_exec($curl);
  $code = curl_getinfo($curl,CURLINFO_HTTP_CODE); //输出请求状态码
  curl_close($curl);
  if(200 == $code) {
    //解决乱码
    if (preg_match(&#39;#<meta[^>]*charset="?gb2312"[^>]*>#&#39;, $data)) {
      $data = iconv("gb2312","utf-8//IGNORE",$data);
      $data = preg_replace(&#39;#<meta[^>]*charset="?gb2312"[^>]*>#is&#39;, &#39;<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">&#39;, $data);
    }
    if (!preg_match(&#39;#<meta charset="utf-8"[^>]*>#is&#39;, $data)) {
      $data = str_replace(&#39;<head>&#39;, &#39;<head><meta http-equiv="Content-Type" content="text/html;charset=UTF-8">&#39;, $data);
    }
    if (preg_match(&#39;#<meta charset="utf-8"[^>]*>#is&#39;, $data)) {
      $data = preg_replace(&#39;#<meta charset="utf-8"[^>]*>#is&#39;, &#39;<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">&#39;, $data);
    }
    return $data;
  } else {
    return false;
  }
}
/**
 * 获取 DOMDocument 对象
 * @param str $url
 * @return boolean|DOM
 */
function getDom($url) {
  $html_content = curl_get($url);
  if(empty($html_content)) {
    //saveLog($url, &#39;请求失败&#39;);
    return false;
  }
  $dom = new DOMDocument(&#39;1.0&#39;, &#39;utf-8&#39;);
  libxml_use_internal_errors(true);
  $dom->loadHTML($html_content);
  return $dom;
}
$html_content = mb_convert_encoding($html_content, &#39;UTF-8&#39;, &#39;gb2312&#39;);

The above is the detailed content of How to solve the php dom garbled problem. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn