search
HomeBackend DevelopmentPHP Tutorialphp 获取数据乱码有关问题

php 获取数据乱码问题

本帖最后由 qq_25293153 于 2015-01-13 15:42:18 编辑 file_get_contents 采集一个页面的数据,获取的数据是乱码,已经使用了检测编码的方式,
检测的是utf-8,我的页面编码也是utf-8,但是还是显示乱码,不知道为什么

<br /><br />$url="xxx";<br />$opts = array( <br />  'http'=>array( <br />    'user_agent' => "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)",<br />  ) <br />); <br />$context = stream_context_create($opts); <br />$neirong = file_get_contents($url, false, $context);<br /><br /> header("content-Type: text/html; charset=Utf-8"); <br />	 ob_end_flush();<br />	 $encode = mb_detect_encoding($neirong, array("ASCII","UTF-8","GB2312","GBK","BIG5")); <br />	 <br />        echo $encode."<br>";<br />		<br />		if ($encode!="UTF-8")<br />		{<br />     $neirong=mb_convert_encoding($neirong, "UTF-8", $encode); <br />	  <br />	 } <br />	  <br />	 <br /><br />echo $neirong;<br /><br /><br /><br />


 $encode  输出:utf-8
$neirong 输出是乱码
我的页面编码是utf-8
------解决思路----------------------
你在輸出的html中加入

它源數據是做了些轉換的,我那個程序已經是轉換過來了。

我把採集的也寫出來,直接運行就可以了。
<br /><?php  <br />//http://www.ziyouge.com/conbdhekbefiab<br />//http://www.ziyouge.com/zy/4/4980/1333249.html<br /><br />// 獲取程序<br /><br />$url = 'http://www.ziyouge.com/conbdhekbefiab';<br /><br />$headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36';<br />$headerArr = array();  <br />foreach( $headers as $n => $v ) {  <br />    $headerArr[] = $n .':' . $v;   <br />}<br /><br />$ch = curl_init();<br />curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);	<br />curl_setopt($ch, CURLOPT_URL, $url);<br />curl_setopt($ch, CURLOPT_HTTPHEADER , $headerArr );  //构造IP<br />curl_setopt($ch, CURLOPT_REFERER, 'http://www.ziyouge.com/');   //构造来路<br />$content = curl_exec($ch);<br />$content = substr($content,3);<br />if($error=curl_error($ch)){<br />	die($error);<br />}<br />curl_close($ch);<br /><br />// 分析程序<br /><br />$result = '';<br />$str_length = mb_strlen($content);<br />$i=0;<br />while ($i<=$str_length)<br />{<br />	$temp_str=mb_substr($content,$i,1);<br />	$ascnum=Ord($temp_str);<br />	if ($ascnum>=224){<br />		$result .= change(mb_substr($content,$i,3));<br />		$i=$i+3;<br />	}else{<br />		$result .= mb_substr($content,$i,1);<br />		$i=$i+1;<br />	}<br />}<br /><br />echo '<meta http-equiv="content-type" content="text/html;charset=utf-8">';<br />echo $result;<br /><br />// 處理<br />function change($str){<br /><br />	$ignore = array('“','”','!','…',':',',',',');<br /><br />	if(in_array($str, $ignore)){<br />		return $str;<br />	}<br /><br />	$prefix = "%u";<br />	$postfix = "";<br />    $str = iconv('UTF-8', 'UCS-2', $str);<br />    $arrstr = str_split($str, 2);<br />    $unistr = '';<br />    for($i = 0, $len = count($arrstr); $i < $len; $i++) {<br />        $tmp = hexdec(bin2hex($arrstr[$i]));<br />		$tmp = str_pad(dechex($tmp),4,'0',STR_PAD_LEFT);<br />		$tmp = decrypt(substr($tmp,2,2).substr($tmp,0,2));<br />		$unistr .= $prefix . $tmp . $postfix;<br />    }<br />	return unescape($unistr);<br />}<br /><br />// 解密<br />function decrypt($d){<br />	$result = str_pad(dechex(hexdec($d)-100),4,'0',STR_PAD_LEFT);<br />	return $result;<br />}<br /><br />// 轉中文<br />function unescape($str) {  <br />    $ret = '';  <br />    $len = strlen ( $str );  <br />    for($i = 0; $i < $len; $i ++) {  <br />        if ($str [$i] == '%' && $str [$i + 1] == 'u') {  <br />            $val = hexdec ( substr ( $str, $i + 2, 4 ) );  <br />            if ($val < 0x7f)  <br />                $ret .= chr ( $val );  <br />            else if ($val < 0x800)  <br />                $ret .= chr ( 0xc0 <br><font color='#FF8000'>------解决思路----------------------</font><br> ($val >> 6) ) . chr ( 0x80 <br><font color='#FF8000'>------解决思路----------------------</font><br> ($val & 0x3f) );  <br />            else  <br />                $ret .= chr ( 0xe0 <br><font color='#FF8000'>------解决思路----------------------</font><br> ($val >> 12) ) . chr ( 0x80 <br><font color='#FF8000'>------解决思路----------------------</font><br> (($val >> 6) & 0x3f) ) . chr ( 0x80 <br><font color='#FF8000'>------解决思路----------------------</font><br> ($val & 0x3f) );  <br />            $i += 5;  <br />        } else if ($str [$i] == '%') {  <br />            $ret .= urldecode ( substr ( $str, $i, 3 ) );  <br />            $i += 2;  <br />        } else  <br />            $ret .= $str [$i];  <br />    }  <br />    return $ret;  <br />}   <br />?> <br />

Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
How can you check if a PHP session has already started?How can you check if a PHP session has already started?Apr 30, 2025 am 12:20 AM

In PHP, you can use session_status() or session_id() to check whether the session has started. 1) Use the session_status() function. If PHP_SESSION_ACTIVE is returned, the session has been started. 2) Use the session_id() function, if a non-empty string is returned, the session has been started. Both methods can effectively check the session state, and choosing which method to use depends on the PHP version and personal preferences.

Describe a scenario where using sessions is essential in a web application.Describe a scenario where using sessions is essential in a web application.Apr 30, 2025 am 12:16 AM

Sessionsarevitalinwebapplications,especiallyfore-commerceplatforms.Theymaintainuserdataacrossrequests,crucialforshoppingcarts,authentication,andpersonalization.InFlask,sessionscanbeimplementedusingsimplecodetomanageuserloginsanddatapersistence.

How can you manage concurrent session access in PHP?How can you manage concurrent session access in PHP?Apr 30, 2025 am 12:11 AM

Managing concurrent session access in PHP can be done by the following methods: 1. Use the database to store session data, 2. Use Redis or Memcached, 3. Implement a session locking strategy. These methods help ensure data consistency and improve concurrency performance.

What are the limitations of using PHP sessions?What are the limitations of using PHP sessions?Apr 30, 2025 am 12:04 AM

PHPsessionshaveseverallimitations:1)Storageconstraintscanleadtoperformanceissues;2)Securityvulnerabilitieslikesessionfixationattacksexist;3)Scalabilityischallengingduetoserver-specificstorage;4)Sessionexpirationmanagementcanbeproblematic;5)Datapersis

Explain how load balancing affects session management and how to address it.Explain how load balancing affects session management and how to address it.Apr 29, 2025 am 12:42 AM

Load balancing affects session management, but can be resolved with session replication, session stickiness, and centralized session storage. 1. Session Replication Copy session data between servers. 2. Session stickiness directs user requests to the same server. 3. Centralized session storage uses independent servers such as Redis to store session data to ensure data sharing.

Explain the concept of session locking.Explain the concept of session locking.Apr 29, 2025 am 12:39 AM

Sessionlockingisatechniqueusedtoensureauser'ssessionremainsexclusivetooneuseratatime.Itiscrucialforpreventingdatacorruptionandsecuritybreachesinmulti-userapplications.Sessionlockingisimplementedusingserver-sidelockingmechanisms,suchasReentrantLockinJ

Are there any alternatives to PHP sessions?Are there any alternatives to PHP sessions?Apr 29, 2025 am 12:36 AM

Alternatives to PHP sessions include Cookies, Token-based Authentication, Database-based Sessions, and Redis/Memcached. 1.Cookies manage sessions by storing data on the client, which is simple but low in security. 2.Token-based Authentication uses tokens to verify users, which is highly secure but requires additional logic. 3.Database-basedSessions stores data in the database, which has good scalability but may affect performance. 4. Redis/Memcached uses distributed cache to improve performance and scalability, but requires additional matching

Define the term 'session hijacking' in the context of PHP.Define the term 'session hijacking' in the context of PHP.Apr 29, 2025 am 12:33 AM

Sessionhijacking refers to an attacker impersonating a user by obtaining the user's sessionID. Prevention methods include: 1) encrypting communication using HTTPS; 2) verifying the source of the sessionID; 3) using a secure sessionID generation algorithm; 4) regularly updating the sessionID.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

SAP NetWeaver Server Adapter for Eclipse

SAP NetWeaver Server Adapter for Eclipse

Integrate Eclipse with SAP NetWeaver application server.

Atom editor mac version download

Atom editor mac version download

The most popular open source editor

SecLists

SecLists

SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

EditPlus Chinese cracked version

EditPlus Chinese cracked version

Small size, syntax highlighting, does not support code prompt function