
PHP automatically recognizes character sets and completes transcoding

WBOY
2016-07-14 10:08:46


I normally work in UTF-8, but if the other party's blog uses GB2312, the data it POSTs arrives garbled (unless the other party converts the encoding before POSTing). When you cannot guarantee that the other party uses UTF-8, you need to check the encoding and convert it yourself.
I wrote a function to do this. The principle is simple: in GB2312/GBK a Chinese character occupies two bytes, each within a known value range, while in UTF-8 a Chinese character occupies three bytes, each also within a known value range. In either encoding, English (ASCII) characters have byte values below 128 and occupy a single byte (full-width forms excepted).
If you are checking a file, you can also look directly for the UTF-8 BOM at the start of the file. For more on this, see the encoding conversion feature of the TP toolbox; I wrote more detailed comments in the AppCodingSwitch class.
Without further ado, here is the function. It checks and transcodes strings; it does not cover file inspection and transcoding.
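To make the byte ranges concrete, the following sketch prints the bytes of one Chinese character in both encodings. The character 中 and the helper name `describeBytes` are chosen only for illustration and do not come from the article; the string is written with hex escapes so the result does not depend on how the script file itself is saved. It assumes the iconv extension is available.

```php
<?php
// Hypothetical helper (not from the article): hex dump plus byte count.
function describeBytes(string $bytes): string {
    return strtoupper(bin2hex($bytes)) . ' (' . strlen($bytes) . ' bytes)';
}

$utf8 = "\xE4\xB8\xAD";                   // the character 中 encoded as UTF-8
$gb   = iconv('UTF-8', 'GB2312', $utf8);  // the same character as GB2312

echo describeBytes($utf8), "\n";  // E4B8AD (3 bytes)
echo describeBytes($gb), "\n";    // D6D0 (2 bytes)
echo describeBytes('A'), "\n";    // 41 (1 bytes)
```

Every byte of the Chinese character is 128 or above in both encodings, while the ASCII letter stays below 128, which is what lets a byte-by-byte scan tell them apart.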
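A BOM check can be sketched as follows. The names `UTF8_BOM`, `hasUtf8Bom`, and `stripUtf8Bom` are hypothetical and are not taken from the TP toolbox or the AppCodingSwitch class; the sketch only shows the general idea of comparing the first three bytes against EF BB BF.

```php
<?php
// Hypothetical helpers (not from the article or the TP toolbox).
const UTF8_BOM = "\xEF\xBB\xBF";

// True when the data starts with the three-byte UTF-8 BOM.
function hasUtf8Bom(string $data): bool {
    return strncmp($data, UTF8_BOM, 3) === 0;
}

// Return the data without its leading BOM, if one is present.
function stripUtf8Bom(string $data): string {
    return hasUtf8Bom($data) ? substr($data, 3) : $data;
}

$withBom = UTF8_BOM . 'hello';
var_dump(hasUtf8Bom($withBom));     // bool(true)
var_dump(hasUtf8Bom('hello'));      // bool(false)
echo stripUtf8Bom($withBom), "\n";  // hello
```

For a file, the same check could be applied to the result of `file_get_contents()`. Note that many UTF-8 files carry no BOM, so a missing BOM does not by itself show that a file is in another encoding.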
[php]
function safeEncoding($string, $outEncoding = 'UTF-8') {
    $encoding = 'UTF-8'; // default when the string contains only ASCII
    $length = strlen($string);
    for ($i = 0; $i < $length; $i++) {
        $byte = ord($string[$i]);
        if ($byte < 128) {
            continue; // ASCII is a single byte in both encodings
        }
        // UTF-8: a lead byte of 0xE0 or above followed by two
        // continuation bytes of the form 10xxxxxx
        if ($byte >= 0xE0 && $i + 2 < $length
            && (ord($string[$i + 1]) & 0xC0) == 0x80
            && (ord($string[$i + 2]) & 0xC0) == 0x80) {
            $encoding = 'UTF-8';
            break;
        }
        // GB2312: two consecutive bytes with the high bit set
        if ($i + 1 < $length && (ord($string[$i + 1]) & 0x80) == 0x80) {
            $encoding = 'GB2312';
            break;
        }
    }
    if (strtoupper($encoding) == strtoupper($outEncoding)) {
        return $string;
    }
    return iconv($encoding, $outEncoding, $string);
}
[/php]
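As a cross-check on the byte-mask heuristic, one alternative (a swapped-in technique, not the author's method) is to validate the whole string as UTF-8 and convert from a GB encoding only when validation fails. The sketch below assumes the input is either UTF-8 or GBK, which is a superset of GB2312; the names `isValidUtf8` and `toUtf8` are hypothetical. It relies on `preg_match` with the `u` modifier, which fails on byte sequences that are not valid UTF-8, and on the iconv extension.

```php
<?php
// Hypothetical helper: true when $s is a well-formed UTF-8 byte sequence.
function isValidUtf8(string $s): bool {
    // With the "u" modifier, preg_match returns false for invalid UTF-8.
    return preg_match('//u', $s) === 1;
}

// Hypothetical helper: return $s as UTF-8, assuming that anything which
// is not valid UTF-8 is encoded in $fallback (GBK by default).
function toUtf8(string $s, string $fallback = 'GBK'): string {
    if (isValidUtf8($s)) {
        return $s;
    }
    $converted = iconv($fallback, 'UTF-8', $s);
    if ($converted === false) {
        throw new RuntimeException("Input is neither UTF-8 nor $fallback");
    }
    return $converted;
}

$gb = "\xD6\xD0\xCE\xC4";         // the text 中文 encoded as GB2312
echo bin2hex(toUtf8($gb)), "\n";  // e4b8ade69687
```

Validating the entire string avoids deciding on the first few high bytes alone, at the cost of scanning all of the input. Some GBK byte sequences also happen to be valid UTF-8, so this remains a heuristic as well.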