Automatically detecting a string's character set and transcoding it in PHP

WBOY (original)
2016-07-21 15:00:04, 736 views

I generally use UTF-8, but if the other party's blog uses GB2312, the data it POSTs arrives garbled (unless the other party converts the encoding before POSTing). When you cannot be sure that the other party uses UTF-8, you need to check the encoding and convert it yourself.

I wrote a function to do this. The principle is simple: in GB2312/GBK a Chinese character takes two bytes, each within a known value range, while in UTF-8 a Chinese character takes three bytes, each also within a known value range. In either encoding, English (ASCII) characters are below 128 and occupy a single byte (full-width forms excepted).
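As a small illustration of the byte counts described above, the sketch below prints the byte values of one Chinese character in both encodings. The helper name byteValues is hypothetical and only serves the example; the character is written as raw bytes so the result does not depend on the encoding of the script file itself, and the iconv extension is assumed to be available.

```php
<?php
// Hypothetical helper: list each byte of a string as a decimal value
function byteValues($s) {
    return array_map('ord', str_split($s));
}

// "中" (U+4E2D) as raw UTF-8 bytes
$utf8 = "\xE4\xB8\xAD";
// The same character converted to GB2312
$gb = iconv('UTF-8', 'GB2312', $utf8);

print_r(byteValues($utf8)); // three bytes, all 128 or above
print_r(byteValues($gb));   // two bytes, both 128 or above
print_r(byteValues('A'));   // one byte, below 128
```

Because both encodings use only bytes of 128 and above for Chinese characters, the byte count and the bit pattern of the leading byte are what the check has to rely on.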

If you are checking the encoding of a file, you can also check directly for the UTF-8 BOM. For more on this, see the encoding conversion function of the TP toolbox; I wrote more detailed comments in the AppCodingSwitch class.
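A BOM check of this kind might look like the sketch below. The function name hasUtf8Bom is hypothetical and is not taken from the TP toolbox or the AppCodingSwitch class; it only compares the first three bytes of a file with the UTF-8 BOM (EF BB BF).

```php
<?php
// Hypothetical helper: true when the file starts with the UTF-8 BOM (EF BB BF)
function hasUtf8Bom($path) {
    // Read at most the first three bytes of the file
    $head = file_get_contents($path, false, null, 0, 3);
    return $head === "\xEF\xBB\xBF";
}

// Usage: write a temporary file with a BOM and check it
$tmp = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($tmp, "\xEF\xBB\xBF" . "hello");
var_dump(hasUtf8Bom($tmp));
unlink($tmp);
```

Note that many UTF-8 files are saved without a BOM, so a missing BOM does not prove that a file is in another encoding; the check is only conclusive when the BOM is present.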

Without further ado, here is the function. It checks and transcodes strings; checking and transcoding files is not covered here.

The code is as follows:

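function safeEncoding($string, $outEncoding = 'UTF-8') {
    $encoding = 'UTF-8';
    $length = strlen($string);
    for ($i = 0; $i < $length; $i++) {
        $byte = ord($string[$i]);
        // ASCII (below 128) is the same in both encodings, so skip it
        if ($byte < 128)
            continue;

        if (($byte & 224) == 224 && $i + 2 < $length) {
            // The first byte is passed: it looks like the lead byte of a 3-byte UTF-8 character.
            // Both following bytes must then be continuation bytes (10xxxxxx).
            if ((ord($string[$i + 1]) & 192) == 128 && (ord($string[$i + 2]) & 192) == 128) {
                $encoding = 'UTF-8';
                break;
            }
        }
        if (($byte & 192) == 192 && $i + 1 < $length) {
            // The first byte is passed; a second byte of 128 or above suggests a GB2312 double-byte character
            if ((ord($string[$i + 1]) & 128) == 128) {
                $encoding = 'GB2312';
                break;
            }
        }
    }

    // Convert only when the detected encoding differs from the requested one
    if (strtoupper($encoding) == strtoupper($outEncoding))
        return $string;
    else
        return iconv($encoding, $outEncoding, $string);
}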
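The function above relies on a byte-pattern heuristic, so it can guess wrong on short or unusual input. As an alternative sketch, not the author's method, the string can first be validated as UTF-8 and converted only when validation fails. The helper name toUtf8 and its $fallbackEncoding parameter are hypothetical; the sketch assumes the mbstring extension is loaded and that any input that is not valid UTF-8 is GBK (a superset of GB2312).

```php
<?php
// Hypothetical helper, not the article's safeEncoding():
// validate as UTF-8 first, convert from the fallback encoding only on failure.
function toUtf8($string, $fallbackEncoding = 'GBK') {
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string; // already valid UTF-8 (pure ASCII included)
    }
    // Assumption: anything that is not valid UTF-8 is in $fallbackEncoding
    return mb_convert_encoding($string, 'UTF-8', $fallbackEncoding);
}

// Usage: "中文" encoded as GBK, given as raw bytes
$gbk = "\xD6\xD0\xCE\xC4";
echo bin2hex(toUtf8($gbk)), "\n";
```

The trade-off is that this version never detects the source encoding; it only separates valid UTF-8 from everything else, which is enough when the only other encoding expected is GB2312/GBK.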




