How to fix garbled text when converting Big5 to UTF-8 in PHP

coldplay.xixi (original)
2020-07-13 13:24:03

To fix garbled output when converting Big5 to UTF-8 in PHP: first generate the .tab lookup file, making sure it does not already exist before you generate it; next, convert a sample page as a test; then print out the character table to check it; finally, use the Big5-to-UTF-8 function for the actual conversion.

Fixing garbled output when converting Big5 to UTF-8 in PHP:

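Step 1: Generate the .tab file. Make sure the .tab file does not already exist before generating it, because the generator opens the file in append mode and would add to any existing content.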

writebig5UnicodeFile();
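
Because `putContent()` in the listing below opens the .tab file in append mode, a leftover file from an earlier run would be extended rather than replaced. A minimal sketch of a guard that could run before `writebig5UnicodeFile()`; the helper name `resetTabFile` is hypothetical and not part of the original code:

```php
<?php
// Hypothetical helper (not in the original listing): remove a stale .tab file
// so that the append-mode writer starts from an empty file.
function resetTabFile(string $path): bool {
    if (!file_exists($path)) {
        return true;        // nothing to remove
    }
    return unlink($path);   // true on success, false on failure
}
```

A possible call order would be `resetTabFile('./big5-unicode-new.tab');` followed by `writebig5UnicodeFile();`.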

Step 2: Test the conversion on a sample page

testCode();
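
One way to judge the test output is to compare it with one of PHP's built-in converters. This is a swapped-in technique, not part of the original approach: the sketch assumes the mbstring extension is loaded and uses its `BIG-5` encoding name, and the function name `big5ToUtf8Builtin` is hypothetical. The built-in table and the custom mapping file may disagree on vendor-extension characters, so a difference is not necessarily a bug.

```php
<?php
// Hypothetical cross-check helper: convert Big5 bytes with mbstring, if loaded.
// Returns null when the extension is missing so the comparison can be skipped.
function big5ToUtf8Builtin(string $big5): ?string {
    if (!function_exists('mb_convert_encoding')) {
        return null;
    }
    $out = mb_convert_encoding($big5, 'UTF-8', 'BIG-5');
    return is_string($out) ? $out : null;
}

// "\xA4\xA4\xA4\xE5" is the Big5 byte sequence for the two characters 中文.
$reference = big5ToUtf8Builtin("\xA4\xA4\xA4\xE5");
if ($reference !== null) {
    echo bin2hex($reference), "\n";   // e4b8ade69687 when mbstring is available
}
```

The same input could then be passed to `b2u()` and the two results compared byte for byte.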

Step 3: Print out the character table

printfCode();
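
The listing below stores one 3-byte, big-endian Unicode code point per Big5 code, at byte offset `(lead - 135) * 256 * 3 + trail * 3`. A small sketch of that arithmetic, using hypothetical helper names (`big5Offset`, `packCodePoint`, `unpackCodePoint`) that do not appear in the original code:

```php
<?php
// Byte offset of the record for a Big5 lead/trail byte pair.
// Lead bytes start at 135 (0x87); each lead byte owns 256 slots of 3 bytes.
function big5Offset(int $lead, int $trail): int {
    return ($lead - 135) * 256 * 3 + $trail * 3;
}

// Pack a code point into 3 bytes, most significant byte first.
function packCodePoint(int $cp): string {
    return chr(($cp >> 16) & 0xFF) . chr(($cp >> 8) & 0xFF) . chr($cp & 0xFF);
}

// Read a 3-byte record back into an integer code point.
function unpackCodePoint(string $rec): int {
    return ord($rec[0]) * 65536 + ord($rec[1]) * 256 + ord($rec[2]);
}

// Worked example: Big5 0xA4A4 has lead byte 164 and trail byte 164.
echo big5Offset(0xA4, 0xA4), "\n";                          // 22764
printf("%04X\n", unpackCodePoint(packCodePoint(0x4E2D)));   // 4E2D
```

A record of three zero bytes marks a slot with no mapping, which the listing renders as a space.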

<?php
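// Load the Big5-to-Unicode mapping file into an array (Big5 code => Unicode code point).
// Each line is expected to hold two tab-separated hex values: the Big5 code, then the Unicode value.
function loadBig5(){
    $fp = fopen( './big5-unicode.txt', 'r' );
    if($fp === false) {
        die('cannot open big5-unicode.txt');
    }
    $big5_unicode_arr = array();
    while(($one_line = fgets($fp)) !== false) {
        $one_line_arr = explode("\t",$one_line);
        if(count($one_line_arr) < 2) {
            continue; // skip blank or malformed lines
        }
        $big5 = hexdec(trim($one_line_arr[0]));
        $unicode = trim($one_line_arr[1]);
        // An entry containing a comma holds several values: keep the first and strip a leading '<'.
        if(strpos($unicode,',') !== false) {
            $unicode = ltrim(explode(',',$unicode)[0],'<');
        }
        
        $big5_unicode_arr[$big5] = hexdec($unicode);
    }
    fclose($fp);
    
    return $big5_unicode_arr;
}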
 
// Append content to the .tab file (the handle is opened once and reused across calls)
function putContent($content) {
    static $fp;
    if(!isset($fp)) {
        $fp = fopen( './big5-unicode-new.tab', 'a+' );
    }
    
    fwrite($fp,$content);
}
 
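// Generate the .tab file: one 3-byte big-endian code point per Big5 code
function writebig5UnicodeFile() {
    $big5_unicode_arr = loadBig5();
    $big5_unicod_content = array();
    $min = 2000;          // smallest offset seen (informational only)
    $max = 0;             // largest offset seen; sets the file length
    $max_unicode = 0;     // largest code point seen (informational only)
    foreach($big5_unicode_arr as $big5 => $unicode) {
        $h = intdiv($big5, 256);   // lead byte
        $l = $big5%256;            // trail byte
        // Record offset: lead bytes start at 135 (0x87), 256 slots of 3 bytes per lead byte.
        $index = ($h-135)*256*3+$l*3;
        
        if($index<$min) {
            $min = $index;
        }
        
        if($max<$index) {
            $max = $index;
        }
        
        if($unicode>$max_unicode) {
            $max_unicode = $unicode;
        }
        
        // Split the code point into three bytes, most significant first.
        $h_1 = ($unicode >> 16) & 0xFF;
        $h_2 = ($unicode >> 8) & 0xFF;
        $h_3 = $unicode & 0xFF;
        
        $big5_unicod_content[$index] = chr($h_1).chr($h_2).chr($h_3);
    }
    
    // Fill unmapped slots with three zero bytes so every record sits at a fixed offset.
    for($i=0;$i<=$max;$i=$i+3) {
        if(!isset($big5_unicod_content[$i])) {
            $big5_unicod_content[$i] = chr(0).chr(0).chr(0);
        }
    }
    
    // Write the records in offset order.
    for($i=0;$i<=$max;$i=$i+3) {
        if(strlen($big5_unicod_content[$i]) == 3) {
            putContent($big5_unicod_content[$i]);
        }else{
            die('error');
        }
    }
}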
 
// Test the conversion result on a sample Big5 page
function testCode() {
    $content = file_get_contents( './temlate_2.html');
    echo b2u($content);
}
 
// Print every character stored in the .tab lookup file
function printfCode() {
    $fp = fopen( './big5-unicode-new.tab', 'r' );
    $len = filesize('./big5-unicode-new.tab');
    $outstr = array();
    //     fseek( $fp, 21000 - 900 + 42*3);
    for($i=$x=0;$i<$len;$i=$i+3) {
        $uni = fread( $fp, 3 );
        $codenum = ord($uni[0])*65536 + ord($uni[1])*256 + ord($uni[2]);
        if($codenum == 0) {
            $outstr[$x++] = ' ';   // empty slot: no mapping for this Big5 code
        }elseif( $codenum < 0x80 ) {
            $outstr[$x++] = chr($codenum);
        }elseif($codenum < 0x800) {
            // intdiv keeps the argument to chr() an integer
            $outstr[$x++] = chr( 192 + intdiv($codenum, 64) );
            $outstr[$x++] = chr( 128 + $codenum % 64 );
 
        }elseif($codenum < 0x10000){
            $outstr[$x++] = chr( 224 + intdiv($codenum, 4096) );
            $codenum = $codenum%4096;
            $outstr[$x++] = chr( 128 + intdiv($codenum, 64) );
            $outstr[$x++] = chr( 128 + ($codenum % 64) );
        }else{
            $outstr[$x++] = chr( 240 + intdiv($codenum, 262144) );
            $codenum = $codenum%262144;
            $outstr[$x++] = chr( 128 + intdiv($codenum, 4096) );
            $codenum = $codenum%4096;
            $outstr[$x++] = chr( 128 + intdiv($codenum, 64) );
            $outstr[$x++] = chr( 128 + ($codenum % 64) );
        }
    }
    fclose($fp);
 
    echo join( '', $outstr);
}
 
// Convert a Big5 byte string to UTF-8 using the .tab lookup file
function b2u( $instr ) {
    $fp = fopen( './big5-unicode-new.tab', 'r' );
    $len = strlen($instr);
    $outstr = array();   // output pieces, joined at the end (join() needs an array)
    for( $i = $x = 0 ; $i < $len ; $i++ ) {
        $h = ord($instr[$i]);
        // A byte of 135 (0x87) or above starts a two-byte Big5 character, if a trail byte follows.
        if( $h >= 135 && $i+1 < $len ) {
            $l = ord($instr[$i+1]);
            
            fseek( $fp, ($h-135)*256*3+$l*3 );
            $uni = fread( $fp, 3 );
            
            // A short read means the pair lies outside the table: treat it as unmapped.
            $codenum = (is_string($uni) && strlen($uni) == 3)
                ? ord($uni[0])*65536 + ord($uni[1])*256 + ord($uni[2])
                : 0;
            
            if($codenum == 0) {
                $outstr[$x++] = ' ';
            }elseif( $codenum < 0x80 ) {
                $outstr[$x++] = chr($codenum);
            }elseif($codenum < 0x800) {
                $outstr[$x++] = chr( 192 + intdiv($codenum, 64) );
                $outstr[$x++] = chr( 128 + $codenum % 64 );
                
            }elseif($codenum < 0x10000){
                $outstr[$x++] = chr( 224 + intdiv($codenum, 4096) );
                $codenum = $codenum%4096;
                $outstr[$x++] = chr( 128 + intdiv($codenum, 64) );
                $outstr[$x++] = chr( 128 + ($codenum % 64) );
            }else{
                $outstr[$x++] = chr( 240 + intdiv($codenum, 262144) );
                $codenum = $codenum%262144;
                $outstr[$x++] = chr( 128 + intdiv($codenum, 4096) );
                $codenum = $codenum%4096;
                $outstr[$x++] = chr( 128 + intdiv($codenum, 64) );
                $outstr[$x++] = chr( 128 + ($codenum % 64) );
            }
            $i++;   // skip the trail byte
        }
        else
            $outstr[$x++] = $instr[$i];   // single-byte character: copy unchanged
    }
    fclose($fp);
    return join( '', $outstr);   // empty input yields an empty string
}
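
Both `printfCode()` and `b2u()` repeat the same branches for turning a code point into UTF-8 bytes. As a sketch, that logic could be factored into one helper. `codePointToUtf8` is a hypothetical name, and the bit-shift form below is an equivalent restatement of the division and modulo arithmetic in the listing rather than the author's own code:

```php
<?php
// Hypothetical helper: encode one Unicode code point as a UTF-8 byte string.
function codePointToUtf8(int $cp): string {
    if ($cp < 0x80) {                       // 1 byte: 0xxxxxxx
        return chr($cp);
    }
    if ($cp < 0x800) {                      // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {                    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))          // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(codePointToUtf8(0x4E2D)), "\n";   // e4b8ad
```

The listing treats a code point of 0 as "no mapping" and emits a space, so a caller using this helper would still need to handle that case before calling it.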
