PHP: automatically detecting and converting text encoding
How can PHP automatically detect the encoding of a piece of text and convert it to a target encoding? This article presents one method: decide whether the input is UTF-8, then convert it to the target encoding accordingly.
The details are as follows:
When processing pages in PHP, we usually convert character sets with functions such as iconv or mb_convert_encoding. This has a precondition: we must know the input and output encodings in advance, otherwise the conversion cannot be performed correctly.
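To make that precondition concrete, here is a minimal sketch of a conversion where both encodings are known up front. It assumes the mbstring and iconv extensions are loaded; the sample bytes and variable names are illustrative and not part of the original article.

```php
<?php
// Assumption: we already know the input is UTF-8 and the target is GBK.
// "中文" written as explicit UTF-8 bytes, so the file's own encoding is irrelevant.
$utf8 = "\xE4\xB8\xAD\xE6\x96\x87";

// mb_convert_encoding(string, to_encoding, from_encoding)
$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

// iconv(from_encoding, to_encoding, string): note the different argument order
$back = iconv('GBK', 'UTF-8', $gbk);

echo bin2hex($gbk), "\n";                 // GBK bytes of the same two characters
var_dump($back === $utf8);                // the round trip restores the input
```

If the source encoding passed to either function is wrong, the call can still return a string, only a garbled one, which is why detecting the encoding first matters.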
Most conversions are between GBK and UTF-8, but what should you do when you don't know the encoding of the text to be converted? A Google search turns up a function called safeEncoding, which can distinguish UTF-8 from GBK. It is fairly accurate, but it is awkward to use in more complex environments. Below, I rely on the differences between GBK and UTF-8 encoding instead: a regular expression checks whether the text is valid UTF-8, and mb_convert_encoding performs the conversion. In China the most common encodings are GBK and UTF-8, so this function converts automatically between these two.
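    /**
     * Detect whether the text is UTF-8 and convert it to the target encoding.
     *
     * @param string $string   text to convert
     * @param string $encoding target encoding
     * @return string          converted text
     */
    function detect_encoding($string, $encoding = 'gbk')
    {
        // Matches only if the whole string is well-formed UTF-8.
        $is_utf8 = preg_match('%^(?:
              [\x09\x0A\x0D\x20-\x7E]            # ASCII
            | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
            | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
            | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
            | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
            | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
            | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
            | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
        )*$%xs', $string);

        if ($is_utf8 && $encoding == 'utf8') {
            // Already UTF-8 and UTF-8 is wanted: nothing to do.
            return $string;
        } elseif ($is_utf8) {
            return mb_convert_encoding($string, $encoding, "UTF-8");
        } else {
            // Not UTF-8: let mbstring pick among the common Chinese encodings.
            return mb_convert_encoding($string, $encoding, 'gbk,gb2312,big5');
        }
    }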
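As a shorter variant of the same idea, the UTF-8 check can be delegated to mbstring's built-in mb_check_encoding instead of a hand-written regular expression. This is a sketch rather than the article's method: the helper name convert_auto is hypothetical, and it assumes that any input that is not valid UTF-8 is GBK.

```php
<?php
// Hypothetical helper (not from the article): detect UTF-8 with
// mb_check_encoding() and fall back to GBK for everything else.
function convert_auto(string $text, string $target = 'GBK'): string
{
    // True for well-formed UTF-8; pure ASCII also passes this check.
    $source = mb_check_encoding($text, 'UTF-8') ? 'UTF-8' : 'GBK';

    return mb_convert_encoding($text, $target, $source);
}

$utf8 = "\xE4\xB8\xAD\xE6\x96\x87"; // "中文" as UTF-8 bytes
$gbk  = "\xD6\xD0\xCE\xC4";         // "中文" as GBK bytes

var_dump(convert_auto($utf8, 'GBK') === $gbk);   // UTF-8 input converted to GBK
var_dump(convert_auto($gbk, 'UTF-8') === $utf8); // GBK input converted to UTF-8
```

Either approach is a heuristic: a GBK byte sequence can occasionally also be well-formed UTF-8, in which case it will be misidentified, so it is worth validating against real data before relying on it.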