The principle is simple. In GB2312/GBK a Chinese character occupies two bytes, and each of those bytes falls within a known value range. In UTF-8 a Chinese character typically occupies three bytes, and each of those bytes also falls within a known value range. In either encoding, English (ASCII) characters have byte values below 128 and occupy a single byte (full-width forms excepted).
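To make the byte counts concrete, here is a small sketch that dumps the bytes of the character "中" in both encodings. The helper name `hexBytes` is made up for this illustration, and the example assumes the iconv extension is available (it is enabled by default in PHP).

```php
<?php
// "中" (U+4E2D) written as raw bytes so the demo does not depend on this file's own encoding.
$utf8 = "\xE4\xB8\xAD";                // UTF-8: three bytes
$gbk  = iconv('UTF-8', 'GBK', $utf8);  // GBK: two bytes

// Hypothetical helper: format a byte string as space-separated uppercase hex.
function hexBytes(string $s): string
{
    return strtoupper(implode(' ', str_split(bin2hex($s), 2)));
}

echo hexBytes($utf8), "\n"; // E4 B8 AD
echo hexBytes($gbk), "\n";  // D6 D0
echo hexBytes('A'), "\n";   // 41 (ASCII: one byte, below 128, in both encodings)
```

Every byte of the Chinese character is 128 or above in both encodings, which is why the check has to look at the bit patterns and value ranges rather than at a single byte.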
The function below checks the encoding of a string and, if necessary, transcodes it. If you are checking a file rather than a string, you can also look directly for the UTF-8 BOM at the start of the file.
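The BOM check for files could look roughly like the sketch below. The UTF-8 BOM is the three bytes EF BB BF at the very start of the data. The function names `hasUtf8Bom` and `fileHasUtf8Bom` are hypothetical and not part of the original article.

```php
<?php
// Hypothetical helper: true if the data starts with the UTF-8 BOM (EF BB BF).
function hasUtf8Bom(string $data): bool
{
    return strncmp($data, "\xEF\xBB\xBF", 3) === 0;
}

// Hypothetical helper: read only the first three bytes of a file and test them.
function fileHasUtf8Bom(string $path): bool
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false; // file could not be opened
    }
    $head = fread($handle, 3);
    fclose($handle);
    return $head !== false && hasUtf8Bom($head);
}
```

Keep in mind that many UTF-8 files are saved without a BOM, so a missing BOM does not prove that a file is not UTF-8. It only lets you skip the byte-range check when the BOM is present.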
The code is as follows:
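<?php
// Detect whether $string is UTF-8 or GBK/GB2312 and convert it to $outEncoding.
function safeEncoding($string, $outEncoding = 'UTF-8')
{
    $encoding = 'UTF-8'; // default: pure ASCII is already valid UTF-8
    $length = strlen($string);
    for ($i = 0; $i < $length; $i++) {
        $byte = ord($string[$i]);
        if ($byte < 128) {
            continue; // ASCII occupies one byte in both encodings
        }
        // UTF-8 three-byte character: 1110xxxx 10xxxxxx 10xxxxxx
        if (($byte & 0xF0) == 0xE0 && $i + 2 < $length
            && (ord($string[$i + 1]) & 0xC0) == 0x80
            && (ord($string[$i + 2]) & 0xC0) == 0x80) {
            $encoding = 'UTF-8';
            break;
        }
        // Otherwise assume a GBK/GB2312 two-byte character:
        // lead byte 0x81-0xFE, trail byte 0x40-0xFE except 0x7F
        if ($byte >= 0x81 && $byte <= 0xFE && $i + 1 < $length) {
            $trail = ord($string[$i + 1]);
            if ($trail >= 0x40 && $trail <= 0xFE && $trail != 0x7F) {
                $encoding = 'GBK';
                break;
            }
        }
    }
    if (strtoupper($encoding) == strtoupper($outEncoding)) {
        return $string;
    }
    return iconv($encoding, $outEncoding, $string);
}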
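The byte-range check above is a heuristic: it decides from the first multi-byte character it recognizes. As a possible alternative, and not the article's own method, the sketch below validates the whole string with `mb_check_encoding` and falls back to GBK only when the string is not valid UTF-8. It assumes the mbstring and iconv extensions are enabled. The name `toUtf8Sketch` and the `$fallback` parameter are invented for this example.

```php
<?php
// Hypothetical alternative to safeEncoding(): validate the whole string as UTF-8
// and fall back to GBK only when that validation fails.
function toUtf8Sketch(string $string, string $fallback = 'GBK'): string
{
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string; // already valid UTF-8 (pure ASCII included)
    }
    $converted = iconv($fallback, 'UTF-8', $string);
    // iconv() returns false on bytes that are invalid in $fallback; keep the input then.
    return $converted === false ? $string : $converted;
}

$gbk = "\xD6\xD0\xCE\xC4";               // "中文" encoded as GBK
echo bin2hex(toUtf8Sketch($gbk)), "\n";  // e4b8ade69687
```

Validating the whole string costs a full pass over the data, but it avoids misclassifying a GBK string whose first bytes happen to look like a UTF-8 three-byte sequence.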