首頁  >  文章  >  後端開發  >  php中big5轉utf8亂碼怎麼辦?

php中big5轉utf8亂碼怎麼辦?

coldplay.xixi
coldplay.xixi原創
2020-07-13 13:24:032587瀏覽

php中big5轉utf8亂碼的解決方法:先產生tab文件,並產生時要保證tab檔案不存在;然後將指定頁數轉碼測試;接著列印出文字庫;最後big5轉【utf -8】即可。

php中big5轉utf8亂碼怎麼辦?

php中big5轉utf8亂碼的解決方法:

第一步:產生tab檔,產生時要確保tab檔案不存在才可以

writebig5UnicodeFile();

第二步:指定頁面轉碼測試

#testCode();

 

第三步:印出文字庫

printfCode();

<?php
//生成big5-unicode 编码文件
function loadBig5(){
    $fp = fopen( &#39;./big5-unicode.txt&#39;, &#39;r&#39; );
    $big5_unicode_arr = array();
    while($one_line = fgets($fp)) {
        $one_line_arr = explode("\t",$one_line);
        $big5 = hexdec(trim($one_line_arr[0]));
        $unicode = trim($one_line_arr[1]);
        if(strpos($unicode,&#39;,&#39;)) {
            $unicode = ltrim(explode(&#39;,&#39;,$unicode)[0],&#39;<&#39;);
        }
        
        $big5_unicode_arr[$big5] = hexdec($unicode);
    }
    
    return $big5_unicode_arr;
}
 
//追加形式写入文件
function putContent($content) {
    static $fp;
    if(!isset($fp)) {
        $fp = fopen( &#39;./big5-unicode-new.tab&#39;, &#39;a+&#39; );
    }
    
    fwrite($fp,$content);
}
 
//生成tab文件
function writebig5UnicodeFile() {
    $big5_unicode_arr = loadBig5();
    $big5_unicod_content = array();
    $min = 2000;
    $max = 0;
    $max_unicode = 0;
    foreach($big5_unicode_arr as $big5 => $unicode) {
        $h = floor($big5/256);
        $l = $big5%256;
        $index = ($h-135)*256*3+$l*3;
        
        if($index<$min) {
            $min = $index;
        }
        
        if($max<$index) {
            $max = $index;
        }
        
        if($unicode>$max_unicode) {
            $max_unicode = $unicode;
        }
        
        $h_1 = floor($unicode/65536);
        $h_2 = floor($unicode/256);
        $h_3 = $unicode%256;
        
        $big5_unicod_content[$index] = chr($h_1).chr($h_2).chr($h_3);
    }
    
    for($i=0;$i<=$max;$i=$i+3) {
        if(!isset($big5_unicod_content[$i])) {
            $big5_unicod_content[$i] = chr(0).chr(0).chr(0);
        }
    }
    
    for($i=0;$i<=$max;$i=$i+3) {
        if(strlen($big5_unicod_content[$i]) == 3) {
            putContent($big5_unicod_content[$i]);
        }else{
            die(&#39;error&#39;);
        }
    }
}
 
//测试编辑结果
function testCode() {
    $content = file_get_contents( &#39;./temlate_2.html&#39;);
    echo b2u($content);
}
 
//打印出编码库文字
function printfCode() {
    $fp = fopen( &#39;./big5-unicode-new.tab&#39;, &#39;r&#39; );
    $len = filesize(&#39;./big5-unicode-new.tab&#39;);
    $x = 0;
    $outstr = array();
    //     fseek( $fp, 21000 - 900 + 42*3);
    for($i=$x=0;$i<$len;$i=$i+3) {
        $uni = fread( $fp, 3 );
        $codenum = ord($uni[0])*65536 + ord($uni[1])*256 + ord($uni[2]);
        if($codenum == 0) {
            $outstr[$x++] = &#39; &#39;;
        }elseif( $codenum < 0x80 ) {
            $outstr[$x++] = chr($codenum);
        }elseif($codenum < 0x800) {
            $outstr[$x++] = chr( 192 + $codenum / 64 );
            $outstr[$x++] = chr( 128 + $codenum % 64 );
 
        }elseif($codenum < 0x10000){
            $outstr[$x++] = chr( 224 + floor($codenum / 4096 ));
            $codenum = $codenum%4096;
            $outstr[$x++] = chr( 128 + floor($codenum / 64 ));
            $outstr[$x++] = chr( 128 + ($codenum % 64) );
        }else{
            $outstr[$x++] = chr( 240 + floor($codenum / 262144 ));
            $codenum = $codenum%262144;
            $outstr[$x++] = chr( 128 + floor($codenum / 4096 ));
            $codenum = $codenum%4096;
            $outstr[$x++] = chr( 128 + ($codenum / 64) );
            $outstr[$x++] = chr( 128 + ($codenum % 64) );
        }
    }
 
    echo join( &#39;&#39;, $outstr);
}
 
//big5 转 utf-8
function b2u( $instr ) {
    $fp = fopen( &#39;./big5-unicode-new.tab&#39;, &#39;r&#39; );
    $len = strlen($instr);
    $outstr = &#39;&#39;;
    for( $i = $x = 0 ; $i < $len ; $i++ ) {
        $h = ord($instr[$i]);
        if( $h >= 135 ) {
            $l = ord($instr[$i+1]);
            
            fseek( $fp, ($h-135)*256*3+$l*3 );
            $uni = fread( $fp, 3 );
            
            $codenum = ord($uni[0])*65536 + ord($uni[1])*256 + ord($uni[2]);
            
            if($codenum == 0) {
                $outstr[$x++] = &#39; &#39;;
            }elseif( $codenum < 0x80 ) {
                $outstr[$x++] = chr($codenum);
            }elseif($codenum < 0x800) {
                $outstr[$x++] = chr( 192 + $codenum / 64 );
                $outstr[$x++] = chr( 128 + $codenum % 64 );
                
            }elseif($codenum < 0x10000){
                $outstr[$x++] = chr( 224 + floor($codenum / 4096 ));
                $codenum = $codenum%4096;
                $outstr[$x++] = chr( 128 + floor($codenum / 64 ));
                $outstr[$x++] = chr( 128 + ($codenum % 64) );
            }else{
                $outstr[$x++] = chr( 240 + floor($codenum / 262144 ));
                $codenum = $codenum%262144;
                $outstr[$x++] = chr( 128 + floor($codenum / 4096 ));
                $codenum = $codenum%4096;
                $outstr[$x++] = chr( 128 + ($codenum / 64) );
                $outstr[$x++] = chr( 128 + ($codenum % 64) );
            }
            $i++;
        }
        else
            $outstr[$x++] = $instr[$i];
    }
    fclose($fp);
    if( $instr != &#39;&#39; )
        return join( &#39;&#39;, $outstr);
}

相關學習推薦:PHP程式設計從入門到精通

以上是php中big5轉utf8亂碼怎麼辦?的詳細內容。更多資訊請關注PHP中文網其他相關文章!

陳述:
本文內容由網友自願投稿,版權歸原作者所有。本站不承擔相應的法律責任。如發現涉嫌抄襲或侵權的內容,請聯絡admin@php.cn