首页 >后端开发 >PHP问题 >php中big5转utf8乱码怎么办?

php中big5转utf8乱码怎么办?

coldplay.xixi
coldplay.xixi原创
2020-07-13 13:24:032635浏览

php中big5转utf8乱码的解决办法:首先生成tab文件,并生成时要保证tab文件不存在;然后将指定页面转码测试;接着打印出文字库;最后big5转【utf-8】即可。

php中big5转utf8乱码怎么办?

php中big5转utf8乱码的解决办法:

第一步:生成tab文件,生成时要保证tab文件不存在才可以

writebig5UnicodeFile();

 

第二步:指定页面转码测试

testCode();

 

第三步:打印出文字库

printfCode();

 

<?php
//生成big5-unicode 编码文件
function loadBig5(){
    $fp = fopen( &#39;./big5-unicode.txt&#39;, &#39;r&#39; );
    $big5_unicode_arr = array();
    while($one_line = fgets($fp)) {
        $one_line_arr = explode("\t",$one_line);
        $big5 = hexdec(trim($one_line_arr[0]));
        $unicode = trim($one_line_arr[1]);
        if(strpos($unicode,&#39;,&#39;)) {
            $unicode = ltrim(explode(&#39;,&#39;,$unicode)[0],&#39;<&#39;);
        }
        
        $big5_unicode_arr[$big5] = hexdec($unicode);
    }
    
    return $big5_unicode_arr;
}
 
//追加形式写入文件
function putContent($content) {
    static $fp;
    if(!isset($fp)) {
        $fp = fopen( &#39;./big5-unicode-new.tab&#39;, &#39;a+&#39; );
    }
    
    fwrite($fp,$content);
}
 
//生成tab文件
function writebig5UnicodeFile() {
    $big5_unicode_arr = loadBig5();
    $big5_unicod_content = array();
    $min = 2000;
    $max = 0;
    $max_unicode = 0;
    foreach($big5_unicode_arr as $big5 => $unicode) {
        $h = floor($big5/256);
        $l = $big5%256;
        $index = ($h-135)*256*3+$l*3;
        
        if($index<$min) {
            $min = $index;
        }
        
        if($max<$index) {
            $max = $index;
        }
        
        if($unicode>$max_unicode) {
            $max_unicode = $unicode;
        }
        
        $h_1 = floor($unicode/65536);
        $h_2 = floor($unicode/256);
        $h_3 = $unicode%256;
        
        $big5_unicod_content[$index] = chr($h_1).chr($h_2).chr($h_3);
    }
    
    for($i=0;$i<=$max;$i=$i+3) {
        if(!isset($big5_unicod_content[$i])) {
            $big5_unicod_content[$i] = chr(0).chr(0).chr(0);
        }
    }
    
    for($i=0;$i<=$max;$i=$i+3) {
        if(strlen($big5_unicod_content[$i]) == 3) {
            putContent($big5_unicod_content[$i]);
        }else{
            die(&#39;error&#39;);
        }
    }
}
 
//测试编辑结果
function testCode() {
    $content = file_get_contents( &#39;./temlate_2.html&#39;);
    echo b2u($content);
}
 
//打印出编码库文字
function printfCode() {
    $fp = fopen( &#39;./big5-unicode-new.tab&#39;, &#39;r&#39; );
    $len = filesize(&#39;./big5-unicode-new.tab&#39;);
    $x = 0;
    $outstr = array();
    //     fseek( $fp, 21000 - 900 + 42*3);
    for($i=$x=0;$i<$len;$i=$i+3) {
        $uni = fread( $fp, 3 );
        $codenum = ord($uni[0])*65536 + ord($uni[1])*256 + ord($uni[2]);
        if($codenum == 0) {
            $outstr[$x++] = &#39; &#39;;
        }elseif( $codenum < 0x80 ) {
            $outstr[$x++] = chr($codenum);
        }elseif($codenum < 0x800) {
            $outstr[$x++] = chr( 192 + $codenum / 64 );
            $outstr[$x++] = chr( 128 + $codenum % 64 );
 
        }elseif($codenum < 0x10000){
            $outstr[$x++] = chr( 224 + floor($codenum / 4096 ));
            $codenum = $codenum%4096;
            $outstr[$x++] = chr( 128 + floor($codenum / 64 ));
            $outstr[$x++] = chr( 128 + ($codenum % 64) );
        }else{
            $outstr[$x++] = chr( 240 + floor($codenum / 262144 ));
            $codenum = $codenum%262144;
            $outstr[$x++] = chr( 128 + floor($codenum / 4096 ));
            $codenum = $codenum%4096;
            $outstr[$x++] = chr( 128 + ($codenum / 64) );
            $outstr[$x++] = chr( 128 + ($codenum % 64) );
        }
    }
 
    echo join( &#39;&#39;, $outstr);
}
 
//big5 转 utf-8
function b2u( $instr ) {
    $fp = fopen( &#39;./big5-unicode-new.tab&#39;, &#39;r&#39; );
    $len = strlen($instr);
    $outstr = &#39;&#39;;
    for( $i = $x = 0 ; $i < $len ; $i++ ) {
        $h = ord($instr[$i]);
        if( $h >= 135 ) {
            $l = ord($instr[$i+1]);
            
            fseek( $fp, ($h-135)*256*3+$l*3 );
            $uni = fread( $fp, 3 );
            
            $codenum = ord($uni[0])*65536 + ord($uni[1])*256 + ord($uni[2]);
            
            if($codenum == 0) {
                $outstr[$x++] = &#39; &#39;;
            }elseif( $codenum < 0x80 ) {
                $outstr[$x++] = chr($codenum);
            }elseif($codenum < 0x800) {
                $outstr[$x++] = chr( 192 + $codenum / 64 );
                $outstr[$x++] = chr( 128 + $codenum % 64 );
                
            }elseif($codenum < 0x10000){
                $outstr[$x++] = chr( 224 + floor($codenum / 4096 ));
                $codenum = $codenum%4096;
                $outstr[$x++] = chr( 128 + floor($codenum / 64 ));
                $outstr[$x++] = chr( 128 + ($codenum % 64) );
            }else{
                $outstr[$x++] = chr( 240 + floor($codenum / 262144 ));
                $codenum = $codenum%262144;
                $outstr[$x++] = chr( 128 + floor($codenum / 4096 ));
                $codenum = $codenum%4096;
                $outstr[$x++] = chr( 128 + ($codenum / 64) );
                $outstr[$x++] = chr( 128 + ($codenum % 64) );
            }
            $i++;
        }
        else
            $outstr[$x++] = $instr[$i];
    }
    fclose($fp);
    if( $instr != &#39;&#39; )
        return join( &#39;&#39;, $outstr);
}

相关学习推荐:PHP编程从入门到精通

以上是php中big5转utf8乱码怎么办?的详细内容。更多信息请关注PHP中文网其他相关文章!

声明:
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn

相关文章

查看更多