Home >Backend Development >PHP Tutorial >Use PHP to implement encoding conversion between gb2312 and unicode_PHP tutorial

Use PHP to implement encoding conversion between gb2312 and unicode_PHP tutorial

WBOY
WBOYOriginal
2016-07-21 16:12:02826browse

Coding conversion between gb2312 and unicode
The following example is to convert gb2312 to "full" form
The iconv function after php4.3.1 is very easy to use, you just need to write a uft8 to unicode yourself The conversion function
lookup table (gb2312.txt) can also be used

Copy code The code is as follows:

$text = "Script Home";
preg_match_all("/[x80-xff]?./",$text,$ar);
foreach($ar[0] as $v)
echo "&#".utf8_unicode(iconv("GB2312","UTF-8",$v)).";";
?>
// utf8 -> unicode
function utf8_unicode($c) {
switch(strlen($c)) {
case 1:
return ord($c);
case 2:
$n = (ord($c[0]) & 0x3f) << 6;
$n += ord($c[1]) & 0x3f;
return $n;
case 3:
$n = (ord($c[0]) & 0x1f) << 12;
$n += (ord($c[1]) & 0x3f) << 6;
$n += ord($c[2]) & 0x3f;
return $n;
case 4:
$n = (ord($c[0]) & 0x0f ) << 18;
$n += (ord($c[1]) & 0x3f) << 12;
$n += (ord($c[2]) & 0x3f ) << 6;
$n += ord($c[3]) & 0x3f;
return $n;
}
}
?>

The following example uses PHP to convert the "full" encoding to gb2312.
Copy the code The code is as follows:

$str = "TTL all-weather auto focus";
$str = preg_replace("|&#([0-9]{1,5});|", " ".u2utf82gb(\1)."", $str);
$str = "$str="$str";";
eval($str);
echo $str;
function u2utf82gb($c){
$str="";
if ($c < 0x80) {
$str.=$c;
} else if ($c < 0x800) {
$str.=chr(0xC0 | $c>>6);
$str.=chr(0x80 | $c & 0x3F);
} else if ($c < 0x10000) {
$str.=chr(0xE0 | $c>>12);
$str.=chr(0x80 | $c>>6 & 0x3F);
$str.= chr(0x80 | $c & 0x3F);
} else if ($c < 0x200000) {
$str.=chr(0xF0 | $c>>18);
$str.= chr(0x80 | $c>>12 & 0x3F);
$str.=chr(0x80 | $c>>6 & 0x3F);
$str.=chr(0x80 | $c & 0x3F );
}
return iconv('UTF-8', 'GB2312', $str);
}
?>

or
Copy code The code is as follows:

function unescape($str) {
$str = rawurldecode($str);
preg_match_all("/(?:%u.{4})|.{4};|d+;|.+/U",$str,$r);
$ar = $ r[0];
print_r($ar);
foreach($ar as $k=>$v) {
if(substr($v,0,2) == "%u ")
$ar[$k] = iconv("UCS-2","GB2312",pack("H4",substr($v,-4)));
elseif(substr($v ,0,3) == "")
$ar[$k] = iconv("UCS-2","GB2312",pack("H4",substr($v,3,-1 )));
elseif(substr($v,0,2) == "") {
echo substr($v,2,-1)."
";
$ar[$k] = iconv("UCS-2","GB2312",pack("n",substr($v,2,-1)));
}
}
return join("",$ar);
}
$str = "TTL all-weather automatic focus";
echo unescape($str); //out TTL all-weather automatic focus

Use javascript to convert
Copy code The code is as follows:



Unicode

文本原型:




转换代码:




正向转换





下面是一个显示所有全角半角的字体的查看例子
复制代码 代码如下:


<script> <br>function showUni(min,max){ <br>show.document.open(); <br>show.document.writeln("<style>body{font-size:9pt;word-break:break-all;}</style>"); <br>show.document.writeln(min + " - " + max + "<br><br>"); <br>var i=0; <br>for(i=min;i<=max;i++){ <BR>show.document.write("&#" + i + ";"); <BR>} <BR>show.document.close(); <BR>} <BR></script>








自定义: -





下面是一个查表(gb2312),转换gb2312到utf8的例子, 现在有iconv函数,这个已经没有太大的意义了,
复制代码 代码如下:

function gb2utf8($gb){
if(!trim($gb)) return $gb;
$filename="gb2312.txt";
$tmp=file($filename);
$codetable=array();
while(list($key,$value)=each($tmp))
$codetable[hexdec(substr($value,0,6))]=substr($value,7,6);
$utf8="";
while($gb) {
if (ord(substr($gb,0,1))>127) {
$this=substr($gb,0,2);
$gb=substr($gb,2,strlen($gb)-2);
$utf8.=u2utf8(hexdec($codetable[hexdec(bin2hex($this))-0x8080]));
}else{
$this=substr($gb,0,1);
$gb=substr($gb,1,strlen($gb)-1);
$utf8.=u2utf8($this);
}
}
return $utf8;
}
function u2utf8($c){
$str="";
if ($c < 0x80) {
$str.=$c;
} else if ($c < 0x800) {
$str.=chr(0xC0 | $c>>6);
$str.=chr(0x80 | $c & 0x3F);
} else if ($c < 0x10000) {
$str.=chr(0xE0 | $c>>12);
$str.=chr(0x80 | $c>>6 & 0x3F);
$str.=chr(0x80 | $c & 0x3F);
} else if ($c < 0x200000) {
$str.=chr(0xF0 | $c>>18);
$str.=chr(0x80 | $c>>12 & 0x3F);
$str.=chr(0x80 | $c>>6 & 0x3F);
$str.=chr(0x80 | $c & 0x3F);
}
return $str;
}
?>

www.bkjia.comtruehttp://www.bkjia.com/PHPjc/313788.htmlTechArticlegb2312 和 unicode 间的编码转换 下面的例子是将 gb2312 转换为 ""这种形式 php4.3.1以后的iconv函数很好用的,只是需要自己写一个uft8到unicode的转...
Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn