
How to solve the problem of garbled characters when reading files in PHP

藏色散人 (original)
2020-11-20 09:51:10

To fix garbled Chinese characters when reading a file in PHP, first read the file's raw bytes, then work out the source encoding (for example from the byte order mark), and finally convert the content to UTF-8 with iconv($encodType, "utf-8", $content).


Reading a file in PHP and fixing garbled Chinese text (converting to UTF-8)

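// Attempt 1: request UTF-8 through a stream context.
// Caveat: 'encoding' does not appear among PHP's documented context options,
// and FILE_TEXT is deprecated as of PHP 8.1, so no conversion takes place.
$opts = array(
    'file' => array('encoding' => 'utf-8'),
    'http' => array('encoding' => 'utf-8'),
);
$ctxt = stream_context_create($opts);
$content = file_get_contents($filePath, FILE_TEXT, $ctxt);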

The simplest approach is to convert from GB2312 to UTF-8:

$str=iconv("gb2312", "utf-8", $str);
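A conversion like this fails outright when the bytes are not valid in the named source encoding, and iconv() then returns false instead of a string. The sketch below checks that return value and tries a wider encoding next. The helper name convertToUtf8 and the candidate list are illustrative assumptions, not part of the original article.

```php
<?php
// Hypothetical helper: try each candidate source encoding in order and
// return the first conversion that succeeds.
function convertToUtf8(string $bytes, array $candidates): ?string
{
    foreach ($candidates as $encoding) {
        // iconv() returns false and reports an error when the input contains
        // a byte sequence that is illegal in the source encoding; "@" hides
        // that error so the loop can move on to the next candidate.
        $converted = @iconv($encoding, 'UTF-8', $bytes);
        if ($converted !== false) {
            return $converted;
        }
    }
    return null; // none of the candidates could decode the input
}

// "\xD6\xD0\xCE\xC4" is the GB2312 byte sequence for "中文".
$str = convertToUtf8("\xD6\xD0\xCE\xC4", ['GB2312', 'GBK']);
```

GBK is a superset of GB2312, so listing it second lets text containing characters outside GB2312 still convert. A successful conversion only proves the bytes were legal in that encoding, not that the guess was right.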

If that does not work, try mb_convert_encoding:

$content = mb_convert_encoding($content, "UTF-8", "auto");
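A likely reason "auto" disappoints is that it expands to a detection list that depends on the mbstring.language setting, and that list may not contain any Chinese encoding. A minimal sketch, assuming the mbstring extension is loaded and that the file is either UTF-8 or CP936 (the Windows code page for GBK), passes an explicit candidate list instead. The function name detectAndConvert and the default candidates are assumptions made for illustration.

```php
<?php
// Sketch: detect among an explicit candidate list instead of "auto".
// Requires the mbstring extension. The default candidates are an assumption;
// adjust them to the encodings your files can actually use.
function detectAndConvert(string $bytes, array $candidates = ['UTF-8', 'CP936']): string
{
    // Strict mode (third argument) rejects any encoding in which the bytes
    // are not valid, so invalid UTF-8 is never reported as UTF-8.
    $detected = mb_detect_encoding($bytes, $candidates, true);
    if ($detected === false) {
        throw new RuntimeException('Could not detect the source encoding');
    }
    return mb_convert_encoding($bytes, 'UTF-8', $detected);
}

// "\xD6\xD0\xCE\xC4" is "中文" in GBK; the result is the same text in UTF-8.
$content = detectAndConvert("\xD6\xD0\xCE\xC4");
```

Detection from content alone remains a heuristic: short or ambiguous input can be valid in several encodings at once.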

The approaches above are unreliable. The correct method is the one below, which identifies the encoding from the byte order mark (BOM) at the start of the file:

// Byte order marks (BOMs) that identify each Unicode encoding.
define('UTF32_BIG_ENDIAN_BOM', chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
define('UTF32_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
define('UTF16_BIG_ENDIAN_BOM', chr(0xFE) . chr(0xFF));
define('UTF16_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE));
define('UTF8_BOM', chr(0xEF) . chr(0xBB) . chr(0xBF));

$text = file_get_contents($newPath);
$first2 = substr($text, 0, 2);
$first3 = substr($text, 0, 3);
$first4 = substr($text, 0, 4); // the UTF-32 BOMs are 4 bytes long
$encodType = "";
// Test UTF-32 before UTF-16: the UTF-32LE BOM begins with the UTF-16LE BOM.
if ($first3 == UTF8_BOM)
    $encodType = 'UTF-8'; // must be a name that iconv() accepts
else if ($first4 == UTF32_BIG_ENDIAN_BOM)
    $encodType = 'UTF-32BE';
else if ($first4 == UTF32_LITTLE_ENDIAN_BOM)
    $encodType = 'UTF-32LE';
else if ($first2 == UTF16_BIG_ENDIAN_BOM)
    $encodType = 'UTF-16BE';
else if ($first2 == UTF16_LITTLE_ENDIAN_BOM)
    $encodType = 'UTF-16LE';

$content = $text;

// Convert only when a BOM was found; iconv() fails on an empty encoding name.
if ($encodType != "")
    $content = iconv($encodType, "utf-8", $content);

Final version:

$text = file_get_contents($filePath);
//$encodType = mb_detect_encoding($text);
define('UTF32_BIG_ENDIAN_BOM', chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
define('UTF32_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
define('UTF16_BIG_ENDIAN_BOM', chr(0xFE) . chr(0xFF));
define('UTF16_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE));
define('UTF8_BOM', chr(0xEF) . chr(0xBB) . chr(0xBF));
$first2 = substr($text, 0, 2);
$first3 = substr($text, 0, 3);
$first4 = substr($text, 0, 4); // the UTF-32 BOMs are 4 bytes long
$encodType = "";
if ($first3 == UTF8_BOM)
    $encodType = 'UTF-8 BOM';
else if ($first4 == UTF32_BIG_ENDIAN_BOM)
    $encodType = 'UTF-32BE';
else if ($first4 == UTF32_LITTLE_ENDIAN_BOM)
    $encodType = 'UTF-32LE';
else if ($first2 == UTF16_BIG_ENDIAN_BOM)
    $encodType = 'UTF-16BE';
else if ($first2 == UTF16_LITTLE_ENDIAN_BOM)
    $encodType = 'UTF-16LE';

// The checks below mainly handle ANSI-encoded files.
if ($encodType == '') { // no BOM: assume a default .txt file saved as ANSI (GBK)
    $content = iconv("GBK", "UTF-8", $text);
} else if ($encodType == 'UTF-8 BOM') { // already UTF-8, no conversion needed
    $content = $text;
} else { // every other format is converted to UTF-8
    $content = iconv($encodType, "UTF-8", $text);
}

The final version above handles .txt files created on a Chinese-language Windows system in the "ANSI", "UTF-8", and "Unicode" encodings.
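One gap remains: a file with no BOM is always treated as GBK, so UTF-8 text saved without a BOM would be converted a second time and corrupted. The following sketch wraps the same BOM checks in a function, removes the BOM from the result, and tests for valid UTF-8 before falling back to GBK. The function name fileBytesToUtf8 and the $fallback parameter are hypothetical, and GBK as the fallback is an assumption carried over from the code above.

```php
<?php
// Hypothetical helper. $fallback is the encoding assumed for input that has
// no BOM and is not valid UTF-8.
function fileBytesToUtf8(string $bytes, string $fallback = 'GBK'): string
{
    // 4-byte BOMs come first: the UTF-32LE BOM begins with the UTF-16LE BOM.
    $boms = [
        "\x00\x00\xFE\xFF" => 'UTF-32BE',
        "\xFF\xFE\x00\x00" => 'UTF-32LE',
        "\xEF\xBB\xBF"     => 'UTF-8',
        "\xFE\xFF"         => 'UTF-16BE',
        "\xFF\xFE"         => 'UTF-16LE',
    ];
    $source = null;
    foreach ($boms as $bom => $encoding) {
        if (strncmp($bytes, $bom, strlen($bom)) === 0) {
            $bytes  = substr($bytes, strlen($bom)); // drop the BOM itself
            $source = $encoding;
            break;
        }
    }
    if ($source === null) {
        // No BOM: preg_match() with the /u flag succeeds only on valid UTF-8.
        $source = preg_match('//u', $bytes) === 1 ? 'UTF-8' : $fallback;
    }
    if ($source === 'UTF-8') {
        return $bytes; // already UTF-8, nothing to convert
    }
    $converted = iconv($source, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $source to UTF-8");
    }
    return $converted;
}
```

It could be called as $content = fileBytesToUtf8(file_get_contents($filePath));. The UTF-8 check runs before the fallback on purpose, since GBK-encoded Chinese text is rarely valid UTF-8 by accident, while almost any byte sequence decodes as GBK without error.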
