When using PHP's mb_detect_encoding function to identify a string's encoding, many people have run into incorrect results, such as GB2312 being confused with UTF-8, or UTF-8 with GBK (here mainly CP936). The usual explanation online is that mb_detect_encoding misjudges because the string is too short.
For example:
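// Guess the encoding of $keytitle from a list of candidates.
$encode = mb_detect_encoding($keytitle, array('ASCII', 'UTF-8', 'GB2312', 'GBK', 'BIG5'));
// If the guess is UTF-8, convert the string to GBK.
if ($encode == 'UTF-8') {
    $keytitle = iconv('UTF-8', 'GBK', $keytitle);
}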
This code detects whether the string is encoded as UTF-8 and, if so, converts it to GBK. But when $keytitle holds the URL-decoded bytes of "%D0%BE%C6%AC", which is GBK-encoded text, the detection result is UTF-8, so the conversion is applied to a string that was never UTF-8. This is not actually a bug in mb_detect_encoding: the same bytes also happen to form a valid UTF-8 sequence. You should not rely too heavily on mb_detect_encoding when writing programs, because the shorter the string, the more likely the detection result is wrong.
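To make the ambiguity concrete, here is a minimal sketch that checks the example bytes against both encodings with mb_check_encoding. It assumes the mbstring extension is loaded and that the example value is URL-decoded before detection; the variable names are illustrative and not from the original code.

```php
<?php
// Assumption: the example value is URL-decoded into raw bytes first.
$keytitle = urldecode('%D0%BE%C6%AC'); // bytes D0 BE C6 AC

// The same four bytes are well-formed under both encodings, so a
// validity check alone cannot tell which one the author intended.
$validAsUtf8 = mb_check_encoding($keytitle, 'UTF-8'); // two 2-byte sequences
$validAsGbk  = mb_check_encoding($keytitle, 'CP936'); // two double-byte characters

// Reading the bytes as GBK (CP936) and re-encoding them as UTF-8
// gives the text the bytes stand for under that interpretation.
$asGbk = mb_convert_encoding($keytitle, 'UTF-8', 'CP936');

var_dump($validAsUtf8, $validAsGbk);
echo $asGbk, "\n";
```

Because both checks succeed, the detector has to pick one of several valid candidates, which is why the candidate order matters so much for short strings.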
How to solve it? My solution is:
$encode = mb_detect_encoding($keytitle, array('ASCII', 'GB2312', 'GBK', 'UTF-8'));
mb_detect_encoding takes three parameters: the input string to check, the list of candidate encodings in detection order (the first one that matches wins and the rest are ignored), and an optional strict-mode flag.
Adjusting the detection order so that the most likely encoding comes first reduces the chance of an incorrect conversion.
In general, put GB2312 first. When the list contains both GBK and UTF-8, put whichever is more common in your data ahead of the other.
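The reordering idea can be wrapped in a small helper. This is only a sketch under stated assumptions: to_gbk is a hypothetical function name, not part of PHP or of the original code, and it passes true as the third argument to turn on strict mode, which the snippet above does not do. How mb_detect_encoding chooses between several equally valid candidates may differ between PHP versions, so treat the order as a hint rather than a guarantee.

```php
<?php
/**
 * Hypothetical helper (not from the original article): return $text as GBK,
 * guessing its current encoding from a candidate list ordered by likelihood.
 */
function to_gbk(string $text, array $candidates = ['ASCII', 'GB2312', 'GBK', 'UTF-8']): string
{
    // Third argument true = strict mode.
    $from = mb_detect_encoding($text, $candidates, true);

    // No candidate matched: leave the bytes alone rather than guess.
    if ($from === false) {
        return $text;
    }

    // Only UTF-8 input needs converting. ASCII and GB2312 are subsets of
    // GBK, so text detected as any of those is already valid GBK.
    if ($from === 'UTF-8') {
        return mb_convert_encoding($text, 'CP936', 'UTF-8');
    }

    return $text;
}

// Usage: pure ASCII passes through unchanged.
echo to_gbk('hello'), "\n";
```

Callers whose data is mostly UTF-8 could pass their own list, for example with 'UTF-8' moved ahead of 'GB2312'. Bytes that are valid in more than one encoding will still follow whichever candidate the detector prefers.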