Encoding conversion between GB2312 and Unicode
The following example converts GB2312 text to "&#NNNNN;" numeric character references. The iconv function in PHP versions after 4.3.1 is easy to use; you only need to write a UTF-8 to Unicode conversion function yourself. A lookup table (gb2312.txt) can also be used.
The code is as follows:
<?php
// $text is expected to hold GB2312-encoded bytes
$text = "Script Home";
// Split into characters: an optional lead byte (0x80-0xff) plus one more byte
preg_match_all('/[\x80-\xff]?./', $text, $ar);
foreach ($ar[0] as $v)
    echo "&#" . utf8_unicode(iconv("GB2312", "UTF-8", $v)) . ";";

// utf8 -> unicode: decode one UTF-8 character into its code point
function utf8_unicode($c) {
    switch (strlen($c)) {
    case 1:
        return ord($c);
    case 2:
        $n = (ord($c[0]) & 0x1f) << 6;
        $n += ord($c[1]) & 0x3f;
        return $n;
    case 3:
        $n = (ord($c[0]) & 0x0f) << 12;
        $n += (ord($c[1]) & 0x3f) << 6;
        $n += ord($c[2]) & 0x3f;
        return $n;
    case 4:
        $n = (ord($c[0]) & 0x07) << 18;
        $n += (ord($c[1]) & 0x3f) << 12;
        $n += (ord($c[2]) & 0x3f) << 6;
        $n += ord($c[3]) & 0x3f;
        return $n;
    }
}
?>
The following example uses PHP to convert "&#NNNNN;" references back to GB2312.
The code is as follows:
<?php
// $str is expected to contain "&#NNNNN;" references
$str = "TTL all-weather auto focus";
// Replace each reference with the GB2312 bytes of its code point
$str = preg_replace_callback("|&#([0-9]{1,5});|", function ($m) {
    return u2utf82gb((int) $m[1]);
}, $str);
echo $str;

// Unicode code point -> UTF-8 -> GB2312
function u2utf82gb($c) {
    $str = "";
    if ($c < 0x80) {
        $str .= chr($c);
    } else if ($c < 0x800) {
        $str .= chr(0xC0 | $c >> 6);
        $str .= chr(0x80 | $c & 0x3F);
    } else if ($c < 0x10000) {
        $str .= chr(0xE0 | $c >> 12);
        $str .= chr(0x80 | $c >> 6 & 0x3F);
        $str .= chr(0x80 | $c & 0x3F);
    } else if ($c < 0x200000) {
        $str .= chr(0xF0 | $c >> 18);
        $str .= chr(0x80 | $c >> 12 & 0x3F);
        $str .= chr(0x80 | $c >> 6 & 0x3F);
        $str .= chr(0x80 | $c & 0x3F);
    }
    return iconv('UTF-8', 'GB2312', $str);
}
?>
or
The code is as follows:
<?php
function unescape($str) {
    $str = rawurldecode($str);
    // Split into %uXXXX escapes, &#xXXXX; and &#NNNNN; references, and single characters
    preg_match_all('/(?:%u.{4})|&#x.{4};|&#\d+;|.+/U', $str, $r);
    $ar = $r[0];
    print_r($ar); // debug output
    foreach ($ar as $k => $v) {
        // pack() yields big-endian bytes, so name the byte order explicitly
        if (substr($v, 0, 2) == "%u")
            $ar[$k] = iconv("UCS-2BE", "GB2312", pack("H4", substr($v, -4)));
        elseif (substr($v, 0, 3) == "&#x")
            $ar[$k] = iconv("UCS-2BE", "GB2312", pack("H4", substr($v, 3, -1)));
        elseif (substr($v, 0, 2) == "&#") {
            echo substr($v, 2, -1) . " "; // debug output
            $ar[$k] = iconv("UCS-2BE", "GB2312", pack("n", (int) substr($v, 2, -1)));
        }
    }
    return join("", $ar);
}

$str = "TTL all-weather automatic focus";
echo unescape($str); // out TTL all-weather automatic focus
Converting with JavaScript:
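A minimal sketch of the same round trip in JavaScript, assuming the goal is to turn non-ASCII characters into "&#NNNNN;" references and back. The names toEntities and fromEntities are hypothetical and do not come from the original article. JavaScript strings are already Unicode, so no GB2312 step is involved here.

```javascript
// Hypothetical helper: replace every non-ASCII character with a decimal
// numeric character reference such as &#20013;.
function toEntities(text) {
  let out = "";
  for (const ch of text) {              // iterates by code point, not UTF-16 unit
    const cp = ch.codePointAt(0);
    out += cp < 0x80 ? ch : "&#" + cp + ";";
  }
  return out;
}

// Hypothetical helper: turn decimal (&#20013;) and hexadecimal (&#x4E2D;)
// references back into characters.
function fromEntities(text) {
  return text.replace(/&#(x[0-9a-fA-F]+|[0-9]+);/g, (match, body) => {
    const cp = body[0] === "x" ? parseInt(body.slice(1), 16) : parseInt(body, 10);
    // Leave out-of-range values untouched instead of throwing.
    return cp <= 0x10FFFF ? String.fromCodePoint(cp) : match;
  });
}
```

For example, toEntities("TTL中") gives "TTL&#20013;", and fromEntities reverses it.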
The following is a viewer example that displays all full-width and half-width characters.
The code is as follows:
<script>
// `show` is presumably a frame or iframe named "show" on the same page
function showUni(min, max) {
  show.document.open();
  show.document.writeln("<style>body{font-size:9pt;word-break:break-all;}</style>");
  show.document.writeln(min + " - " + max + "<br><br>");
  var i = 0;
  for (i = min; i <= max; i++) {
    show.document.write("&#" + i + ";");
  }
  show.document.close();
}
</script>
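The viewer above writes one reference per code point into a frame. The same string can be built without touching the DOM. A sketch, where buildEntityRange is a hypothetical name and the range U+FF01 to U+FF5E (the full-width forms of ASCII "!" through "~") is only an example:

```javascript
// Hypothetical helper: build "&#min;...&#max;" for an inclusive code point range.
function buildEntityRange(min, max) {
  const parts = [];
  for (let i = min; i <= max; i++) {
    parts.push("&#" + i + ";");
  }
  return parts.join("");
}

// Full-width forms of ASCII '!' through '~' occupy U+FF01..U+FF5E.
const fullWidth = buildEntityRange(0xFF01, 0xFF5E);
```

The resulting string can be assigned to an element's innerHTML or written with document.write, as the viewer does.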
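The following example converts GB2312 to UTF-8 using a lookup table (gb2312.txt). Now that the iconv function is available, this approach no longer has much practical value.
The code is as follows: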
<?php
function gb2utf8($gb) {
    if (!trim($gb)) return $gb;
    $filename = "gb2312.txt";
    $tmp = file($filename);
    $codetable = array();
    // Each table line is assumed to look like "0xGGGG 0xUUUU":
    // the GB2312 code, then the Unicode code point
    foreach ($tmp as $value)
        $codetable[hexdec(substr($value, 2, 4))] = substr($value, 9, 4);
    $utf8 = "";
    while (strlen($gb) > 0) {
        if (ord($gb[0]) > 127) {
            // Two-byte GB2312 character: look up its Unicode code point
            $ch = substr($gb, 0, 2);
            $gb = substr($gb, 2);
            $utf8 .= u2utf8(hexdec($codetable[hexdec(bin2hex($ch)) - 0x8080]));
        } else {
            // ASCII is identical in GB2312 and UTF-8
            $ch = substr($gb, 0, 1);
            $gb = substr($gb, 1);
            $utf8 .= $ch;
        }
    }
    return $utf8;
}

// Unicode code point -> UTF-8
function u2utf8($c) {
    $str = "";
    if ($c < 0x80) {
        $str .= chr($c);
    } else if ($c < 0x800) {
        $str .= chr(0xC0 | $c >> 6);
        $str .= chr(0x80 | $c & 0x3F);
    } else if ($c < 0x10000) {
        $str .= chr(0xE0 | $c >> 12);
        $str .= chr(0x80 | $c >> 6 & 0x3F);
        $str .= chr(0x80 | $c & 0x3F);
    } else if ($c < 0x200000) {
        $str .= chr(0xF0 | $c >> 18);
        $str .= chr(0x80 | $c >> 12 & 0x3F);
        $str .= chr(0x80 | $c >> 6 & 0x3F);
        $str .= chr(0x80 | $c & 0x3F);
    }
    return $str;
}
?>