Home > Article > Backend Development > What should I do if big5 is converted to utf8 garbled code in PHP?
The solution to convert big5 to utf8 garbled characters in php: first generate the tab file, and ensure that the tab file does not exist when generating; then convert the specified page to test; then print out the text library; finally big5 convert [utf -8】That’s it.
Solution to convert big5 to utf8 garbled code in php:
The first step: generate tab file, when generating Make sure the tab file does not exist
writebig5UnicodeFile();
Step 2: Specify page transcoding test
testCode();
Step 3: Print out the text library
printfCode();
<?php //生成big5-unicode 编码文件 function loadBig5(){ $fp = fopen( './big5-unicode.txt', 'r' ); $big5_unicode_arr = array(); while($one_line = fgets($fp)) { $one_line_arr = explode("\t",$one_line); $big5 = hexdec(trim($one_line_arr[0])); $unicode = trim($one_line_arr[1]); if(strpos($unicode,',')) { $unicode = ltrim(explode(',',$unicode)[0],'<'); } $big5_unicode_arr[$big5] = hexdec($unicode); } return $big5_unicode_arr; } //追加形式写入文件 function putContent($content) { static $fp; if(!isset($fp)) { $fp = fopen( './big5-unicode-new.tab', 'a+' ); } fwrite($fp,$content); } //生成tab文件 function writebig5UnicodeFile() { $big5_unicode_arr = loadBig5(); $big5_unicod_content = array(); $min = 2000; $max = 0; $max_unicode = 0; foreach($big5_unicode_arr as $big5 => $unicode) { $h = floor($big5/256); $l = $big5%256; $index = ($h-135)*256*3+$l*3; if($index<$min) { $min = $index; } if($max<$index) { $max = $index; } if($unicode>$max_unicode) { $max_unicode = $unicode; } $h_1 = floor($unicode/65536); $h_2 = floor($unicode/256); $h_3 = $unicode%256; $big5_unicod_content[$index] = chr($h_1).chr($h_2).chr($h_3); } for($i=0;$i<=$max;$i=$i+3) { if(!isset($big5_unicod_content[$i])) { $big5_unicod_content[$i] = chr(0).chr(0).chr(0); } } for($i=0;$i<=$max;$i=$i+3) { if(strlen($big5_unicod_content[$i]) == 3) { putContent($big5_unicod_content[$i]); }else{ die('error'); } } } //测试编辑结果 function testCode() { $content = file_get_contents( './temlate_2.html'); echo b2u($content); } //打印出编码库文字 function printfCode() { $fp = fopen( './big5-unicode-new.tab', 'r' ); $len = filesize('./big5-unicode-new.tab'); $x = 0; $outstr = array(); // fseek( $fp, 21000 - 900 + 42*3); for($i=$x=0;$i<$len;$i=$i+3) { $uni = fread( $fp, 3 ); $codenum = ord($uni[0])*65536 + ord($uni[1])*256 + ord($uni[2]); if($codenum == 0) { $outstr[$x++] = ' '; }elseif( $codenum < 0x80 ) { $outstr[$x++] = chr($codenum); }elseif($codenum < 0x800) { $outstr[$x++] = chr( 192 + $codenum / 64 ); $outstr[$x++] = chr( 128 + $codenum % 64 ); }elseif($codenum < 0x10000){ $outstr[$x++] = chr( 224 + floor($codenum / 4096 )); $codenum = $codenum%4096; $outstr[$x++] = chr( 128 + floor($codenum / 64 )); $outstr[$x++] = chr( 128 + ($codenum % 64) ); }else{ $outstr[$x++] = chr( 240 + floor($codenum / 262144 )); $codenum = $codenum%262144; $outstr[$x++] = chr( 128 + floor($codenum / 4096 )); $codenum = $codenum%4096; $outstr[$x++] = chr( 128 + ($codenum / 64) ); $outstr[$x++] = chr( 128 + ($codenum % 64) ); } } echo join( '', $outstr); } //big5 转 utf-8 function b2u( $instr ) { $fp = fopen( './big5-unicode-new.tab', 'r' ); $len = strlen($instr); $outstr = ''; for( $i = $x = 0 ; $i < $len ; $i++ ) { $h = ord($instr[$i]); if( $h >= 135 ) { $l = ord($instr[$i+1]); fseek( $fp, ($h-135)*256*3+$l*3 ); $uni = fread( $fp, 3 ); $codenum = ord($uni[0])*65536 + ord($uni[1])*256 + ord($uni[2]); if($codenum == 0) { $outstr[$x++] = ' '; }elseif( $codenum < 0x80 ) { $outstr[$x++] = chr($codenum); }elseif($codenum < 0x800) { $outstr[$x++] = chr( 192 + $codenum / 64 ); $outstr[$x++] = chr( 128 + $codenum % 64 ); }elseif($codenum < 0x10000){ $outstr[$x++] = chr( 224 + floor($codenum / 4096 )); $codenum = $codenum%4096; $outstr[$x++] = chr( 128 + floor($codenum / 64 )); $outstr[$x++] = chr( 128 + ($codenum % 64) ); }else{ $outstr[$x++] = chr( 240 + floor($codenum / 262144 )); $codenum = $codenum%262144; $outstr[$x++] = chr( 128 + floor($codenum / 4096 )); $codenum = $codenum%4096; $outstr[$x++] = chr( 128 + ($codenum / 64) ); $outstr[$x++] = chr( 128 + ($codenum % 64) ); } $i++; } else $outstr[$x++] = $instr[$i]; } fclose($fp); if( $instr != '' ) return join( '', $outstr); }
Related learning recommendations: PHP programming from entry to proficiency
The above is the detailed content of What should I do if big5 is converted to utf8 garbled code in PHP?. For more information, please follow other related articles on the PHP Chinese website!