Complete handling of UTF-8 to GB2312 conversion

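  Topic: convert a UTF-8 encoded string to GB2312; characters that have no GB2312 equivalent are converted to the &#DEC; form (a decimal numeric character reference). For example, 회 => &#54924;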

  Languages: PHP, JavaScript

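  Scenario: the browser encodes a string (which may contain characters outside GB2312) with JavaScript's encodeURI function and sends it to the server in a GET request. All pages are encoded in GB2312, so the PHP script on the server must convert the request data to a GB2312 representation.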
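
  To make the input concrete, here is a minimal sketch of what reaches the server. It assumes the browser called encodeURI("회"), which produces percent-encoded UTF-8 bytes. PHP normally decodes $_GET values on its own; rawurldecode() is used here only to show the same step by hand, and the variable names are illustrative.

```php
<?php
// Query value as produced by encodeURI("회") in the browser.
$encoded = "%ED%9A%8C";

// Decode the percent-escapes back into raw UTF-8 bytes.
$utf8 = rawurldecode($encoded);

echo bin2hex($utf8), "\n"; // ed9a8c
echo strlen($utf8), "\n";  // 3 (one character, three bytes)
```

  The three bytes ED 9A 8C are the string the conversion routine has to deal with.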

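  Background:

  1. iconv on its own can only convert characters that exist in GB2312; characters from other scripts (Korean, for example) cannot be converted.
  2. There is no ready-made function that does the whole job.
  3. bindec(): converts a binary string made of "0" and "1" characters to a decimal number.
  4. decbin(): converts a decimal number to a binary string, e.g. decbin(224) = "11100000".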
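
  The helper functions listed above can be tried out in isolation. The sketch below is illustrative only: the iconv call is guarded and its result is printed rather than asserted, because how iconv reports an unconvertible character depends on the iconv library PHP was built against (that part is an assumption about the environment).

```php
<?php
// decbin() and bindec() are inverses for non-negative integers.
$bits  = decbin(224);         // "11100000"
$value = bindec("11100000");  // 224

// decbin() drops leading zeros, so pad before slicing out bit fields.
$padded = str_pad(decbin(5), 8, "0", STR_PAD_LEFT); // "00000101"

// iconv() on a character GB2312 lacks (UTF-8 bytes of 회).
// Many builds return false and raise a notice; this is environment-dependent.
if (function_exists('iconv')) {
    $result = @iconv("UTF-8", "GB2312", "\xED\x9A\x8C");
    var_dump($result);
}
```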

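  Approach: UTF-8 encodes a character in one, two, three or more bytes, and Chinese, Japanese and Korean characters mostly take three bytes. The number of bytes in each character is determined from the value of its first byte.

  1. First byte below 128: ASCII.
  2. 128 to 191: not a valid UTF-8 lead byte; output it as &#ord();
  3. 192 to 223: two-byte UTF-8 sequence.
  4. 224 to 239: three-byte sequence.
  5. 240 to 247: four-byte sequence.
  6. ...
  7. For three-byte sequences, try converting to GB2312 with iconv.
  8. For multi-byte characters that are not in GB2312, convert the UTF-8 bytes to the Unicode code point and take its decimal value.
  9. Bit operations can be used for this, or the bindec() function.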
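
  The lead-byte ranges in the list above can be sketched as a small lookup. The helper name utf8SequenceLength is hypothetical and not part of the article's program; the five- and six-byte ranges follow the original UTF-8 design and are not used by modern UTF-8.

```php
<?php
// Hypothetical helper: number of bytes in the UTF-8 sequence that starts
// with $byte (0-255), or 0 if $byte cannot start a sequence.
function utf8SequenceLength(int $byte): int
{
    if ($byte < 128) return 1; // 0xxxxxxx: ASCII
    if ($byte < 192) return 0; // 10xxxxxx: continuation byte, not a lead byte
    if ($byte < 224) return 2; // 110xxxxx
    if ($byte < 240) return 3; // 1110xxxx
    if ($byte < 248) return 4; // 11110xxx
    if ($byte < 252) return 5; // 111110xx (obsolete form)
    if ($byte < 254) return 6; // 1111110x (obsolete form)
    return 0;                  // 0xFE and 0xFF never appear in UTF-8
}

echo utf8SequenceLength(0xED), "\n"; // 3, the lead byte of 회
```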

  Program:

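 // Convert a UTF-8 string to GB2312. Characters that GB2312 cannot
 // represent are emitted as decimal character references (&#DEC;).
 function GetGB2312String($name)
 {
  $tostr = "";
  for($i=0;$i<strlen($name);$i++)
  {
   $curbin = ord(substr($name,$i,1));
   if($curbin < 128)
   {
    // ASCII byte: copy unchanged
    $tostr .= substr($name,$i,1);
   }elseif($curbin < 192){
    // Not a valid UTF-8 lead byte: emit the byte value itself
    $str = substr($name,$i,1);
    $tostr .= "&#".ord($str).";";
   }elseif($curbin < 224){
    // Two-byte sequence
    $str = substr($name,$i,2);
    $tostr .= "&#".GetUnicodeChar($str).";";
    $i += 1;
   }elseif($curbin < 240){
    // Three-byte sequence: try GB2312 first, fall back to &#DEC;
    $str = substr($name,$i,3);
    // @ suppresses the notice iconv raises when the character is not in GB2312
    $gstr= @iconv("UTF-8","GB2312",$str);
    if(!$gstr)
    {
     $tostr .= "&#".GetUnicodeChar($str).";";
    }else{
     $tostr .= $gstr;
    }

    $i += 2;
   }elseif($curbin < 248){
    // Four-byte sequence
    $str = substr($name,$i,4);
    $tostr .= "&#".GetUnicodeChar($str).";";

    $i += 3;
   }elseif($curbin < 252){
    // Five-byte sequence (original UTF-8 design only)
    $str = substr($name,$i,5);
    $tostr .= "&#".GetUnicodeChar($str).";";

    $i += 4;
   }else{
    // Six-byte sequence (original UTF-8 design only)
    $str = substr($name,$i,6);
    $tostr .= "&#".GetUnicodeChar($str).";";

    $i += 5;
   }
  }

  return $tostr;
 }

 // Decode one complete UTF-8 sequence to its Unicode code point (decimal).
 function GetUnicodeChar($str)
 {
  $temp = "";
  for($i=0;$i<strlen($str);$i++)
  {
   // decbin() drops leading zeros, so pad to a full 8 bits
   $x = str_pad(decbin(ord(substr($str,$i,1))),8,"0",STR_PAD_LEFT);
   if($i == 0)
   {
    // Lead byte: skip the length prefix (n ones followed by a zero)
    $s = strlen($str)+1;
    $temp .= substr($x,$s,8-$s);
   }else{
    // Continuation byte: keep the low six bits
    $temp .= substr($x,2,6);
   }
  }

  return bindec($temp);
 }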
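
  The approach mentions bit operations as an alternative to bindec(). A minimal sketch of that variant follows; the function name codePointBitwise is hypothetical, and it assumes $seq holds exactly one complete, non-empty UTF-8 sequence (no validation is done).

```php
<?php
// Hypothetical bitwise counterpart of GetUnicodeChar().
// Assumes $seq is one complete UTF-8 sequence of 1 to 6 bytes.
function codePointBitwise(string $seq): int
{
    $len  = strlen($seq);
    $lead = ord($seq[0]);
    if ($len === 1) {
        return $lead; // ASCII: the byte is the code point
    }
    // An n-byte lead byte carries 7 - n payload bits after its length prefix.
    $cp = $lead & ((1 << (7 - $len)) - 1);
    for ($i = 1; $i < $len; $i++) {
        // Each continuation byte contributes its low six bits.
        $cp = ($cp << 6) | (ord($seq[$i]) & 0x3F);
    }
    return $cp;
}

echo "&#" . codePointBitwise("\xED\x9A\x8C") . ";\n"; // &#54924;
```

  This avoids building an intermediate string of "0" and "1" characters, at the cost of being slightly less readable than the decbin()/bindec() version.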

  Appendix: UTF-8 byte layout by code point range (the x bits carry the code point)

U-00000000 - U-0000007F:  0xxxxxxx 
U-00000080 - U-000007FF:  110xxxxx 10xxxxxx 
U-00000800 - U-0000FFFF:  1110xxxx 10xxxxxx 10xxxxxx 
U-00010000 - U-001FFFFF:  11110xxx 10xxxxxx 10xxxxxx 10xxxxxx 
U-00200000 - U-03FFFFFF:  111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 
U-04000000 - U-7FFFFFFF:  1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
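
  As a worked example of the table, the sketch below goes in the opposite direction and builds the UTF-8 bytes for a code point. The function name encodeUtf8 is hypothetical, and it covers only the first four rows of the table, which is all that modern UTF-8 uses.

```php
<?php
// Hypothetical helper: encode a code point (0 to 0x1FFFFF) as UTF-8 bytes,
// following the first four rows of the layout table.
function encodeUtf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);                                   // 0xxxxxxx
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))                      // 110xxxxx
             . chr(0x80 | ($cp & 0x3F));                   // 10xxxxxx
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))                     // 1110xxxx
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))                         // 11110xxx
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(encodeUtf8(54924)), "\n"; // ed9a8c, the bytes of 회
```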


