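Example code: decoding Unicode-escaped Chinese characters in PHP
While scraping data from a website, I found a run of encoded data in the response: "......\u65b0\u6d6a\u5fae\u535a......". This is Chinese text that has been written as Unicode escape sequences (\uXXXX), and I wanted to decode it back into readable Chinese.
Solutions:
Option A (stable, recommended):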
<?php
// Callback: convert one matched \uXXXX escape (4 hex digits) into UTF-8.
function replace_unicode_escape_sequence($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}

$name = '\u65b0\u6d6a\u5fae\u535a';
$str = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $name);
echo $str; // Output: 新浪微博

// Option A wrapped in a class (stable, upgraded, recommended).
class Helper_Tool {
    static function unicodeDecode($data) {
        // A closure is used as the callback: a named function declared inside
        // this method would be redeclared (fatal error) on the second call.
        $rs = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', function ($match) {
            return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
        }, $data);
        return $rs;
    }
}

// Usage
$name = '\u65b0\u6d6a\u5fae\u535a';
$data = Helper_Tool::unicodeDecode($name); // $data is now 新浪微博
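Option A decodes each \uXXXX escape on its own as UCS-2, so characters outside the Basic Multilingual Plane, which JSON-style escaping writes as a surrogate pair (for example \ud83d\ude00), are unlikely to decode correctly. The following is a minimal sketch of one way to handle them, assuming PHP 7 or later with the mbstring extension. The function name decode_unicode_escapes is hypothetical and does not come from the original article.

```php
<?php
// Hypothetical helper (not from the original article): decodes \uXXXX escapes,
// treating a high+low surrogate pair as a single UTF-16 character.
function decode_unicode_escapes(string $text): string
{
    // Alternative 1: high surrogate (D800-DBFF) followed by low surrogate (DC00-DFFF).
    // Alternative 2: any single \uXXXX escape.
    $pattern = '/\\\\u(d[89ab][0-9a-f]{2})\\\\u(d[c-f][0-9a-f]{2})|\\\\u([0-9a-f]{4})/i';

    return preg_replace_callback($pattern, function (array $m): string {
        // Group 3 is only non-empty when the single-escape alternative matched.
        $hex = ($m[3] ?? '') !== '' ? $m[3] : $m[1] . $m[2];
        // UTF-16BE (unlike UCS-2BE) understands surrogate pairs.
        return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
    }, $text);
}

echo decode_unicode_escapes('\u65b0\u6d6a\u5fae\u535a'), "\n"; // 新浪微博
```

The only change from Option A is the extra alternative in the pattern and the switch from 'UCS-2BE' to 'UTF-16BE'; plain four-digit escapes are handled the same way as before.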
Tip: it is worth browsing PHP tutorials from outside China as well; they can be very helpful.
Option B (second choice):
<?php
function unicodeDecode($name) {
    // Wrap the escaped text in a JSON object and let json_decode() resolve the \uXXXX escapes.
    $json = '{"str":"' . $name . '"}';
    $arr = json_decode($json, true);
    // json_decode() returns null when the JSON is invalid.
    if (empty($arr)) return '';
    return $arr['str'];
}

$name = '\u65b0\u6d6a\u5fae\u535a';
echo unicodeDecode($name); // Output: 新浪微博
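A note on Option B, worked out with the technical help of my friend XAR: the string passed to unicodeDecode() as $name must not contain characters that are illegal inside a JSON string, most notably an unescaped double quote (a single quote is harmless). Otherwise json_decode() fails and the function returns an empty string. If necessary, use str_replace() to escape such characters before decoding.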
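The escaping step mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the function name unicode_decode_via_json is hypothetical, and the input is assumed to contain no backslashes other than those in \uXXXX escapes (an input that already contains \" would be escaped twice and fail to parse).

```php
<?php
// Hypothetical variant of Option B (not from the original article) that escapes
// characters which would otherwise break the JSON string literal.
// Assumption: the only backslashes in $name belong to \uXXXX escapes.
function unicode_decode_via_json(string $name): string
{
    // Escape double quotes and common raw control characters for JSON.
    $escaped = strtr($name, [
        '"'  => '\\"',
        "\n" => '\\n',
        "\r" => '\\r',
        "\t" => '\\t',
    ]);

    $arr = json_decode('{"str":"' . $escaped . '"}', true);

    // json_decode() returns null on invalid JSON; fall back to an empty string.
    if (!is_array($arr) || !isset($arr['str'])) {
        return '';
    }
    return $arr['str'];
}

echo unicode_decode_via_json('\u65b0\u6d6a "weibo"'), "\n"; // 新浪 "weibo"
```

strtr() with an array performs all replacements in a single pass, so an already-inserted backslash is never escaped again by a later rule.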
