PHP code to determine whether a string is encoded as UTF-8

WBOY (original)
2016-07-13 10:48:50

This article presents PHP code for determining whether a string is encoded as UTF-8. Feel free to use it for reference.

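We used to use mb_detect_encoding() to detect character encoding. A related check converts the string from UTF-8 to GB2312 and back with mb_convert_encoding(), then compares the result with the original.

The code is as follows

// Determine which encoding the string uses
if ($tag === mb_convert_encoding(mb_convert_encoding($tag, "GB2312", "UTF-8"), "UTF-8", "GB2312")) {
    // The round trip is lossless, so $tag is already UTF-8 and nothing needs to be done
} else {
    // Otherwise treat it as GB2312 and convert it to UTF-8
    $tag = mb_convert_encoding($tag, 'UTF-8', 'GB2312');
}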
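
The round-trip check above can be wrapped in a function and tried on sample data. This is only a minimal sketch: the helper name to_utf8_via_round_trip() is hypothetical, and it assumes the input is either UTF-8 or GB2312 and that the mbstring extension is loaded.

```php
<?php
// Hypothetical helper (not from the original article): returns $tag as UTF-8,
// assuming the input is either UTF-8 or GB2312.
function to_utf8_via_round_trip(string $tag): string
{
    // Interpret $tag as UTF-8, convert it to GB2312, then back to UTF-8.
    $roundTrip = mb_convert_encoding(
        mb_convert_encoding($tag, 'GB2312', 'UTF-8'),
        'UTF-8',
        'GB2312'
    );

    if ($tag === $roundTrip) {
        // The round trip was lossless, so treat $tag as UTF-8 already.
        return $tag;
    }

    // Otherwise assume the bytes are GB2312 and convert them.
    return mb_convert_encoding($tag, 'UTF-8', 'GB2312');
}

$utf8   = "\xE4\xB8\xAD\xE6\x96\x87"; // "中文" encoded as UTF-8
$gb2312 = "\xD6\xD0\xCE\xC4";         // "中文" encoded as GB2312

var_dump(to_utf8_via_round_trip($utf8) === $utf8);   // UTF-8 input is left alone
var_dump(to_utf8_via_round_trip($gb2312) === $utf8); // GB2312 input is converted
```

One limitation of this approach: a UTF-8 string containing characters that GB2312 cannot represent (an emoji, for instance) does not survive the round trip, so it would be misclassified as GB2312.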

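For example, with $keytitle = "%D0%BE%C6%AC"; (the bytes are shown here in URL-encoded form), mb_detect_encoding() reports UTF-8. This is not really a bug in the function: the four bytes D0 BE C6 AC also happen to form a well-formed UTF-8 sequence, so the input is genuinely ambiguous. You should not rely too heavily on mb_detect_encoding() when writing programs. When the string is short, the detection result is likely to be wrong.
How to solve it? My solution is: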
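
To see why the short example string is ambiguous, the following sketch decodes it and checks it against UTF-8. It assumes the value is meant to be URL-decoded first (the article does not say so explicitly) and that mbstring is available.

```php
<?php
// The example value from the text, decoded from its URL-encoded form.
// Assumption: the article intends the raw bytes, not the literal "%D0..." text.
$keytitle = urldecode('%D0%BE%C6%AC');

// Four raw bytes: D0 BE C6 AC.
echo bin2hex($keytitle), "\n";

// The same bytes are also well-formed UTF-8 (two 2-byte sequences) ...
var_dump(mb_check_encoding($keytitle, 'UTF-8'));

// ... so a detector that only considers UTF-8 can legitimately answer "UTF-8".
var_dump(mb_detect_encoding($keytitle, ['UTF-8'], true));
```

Because the bytes are valid in more than one encoding, no detector can decide from the bytes alone. The only lever left is which candidate encodings are offered, and in what order.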

Adjust the order of encoding detection so that the most likely encoding comes first, which reduces the chance of an incorrect conversion.
The code is as follows

$encode = mb_detect_encoding($keytitle, array('ASCII', 'GB2312', 'GBK', 'UTF-8'));
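
A minimal sketch of how the ordered detection might be used before converting to UTF-8. The wrapper detect_encoding() and the variable names are hypothetical, strict mode is switched on as an assumption, and the exact name returned for a multi-byte input can vary: mbstring may report its own canonical name for an alias, and the detection heuristics differ between PHP versions.

```php
<?php
/**
 * Hypothetical wrapper (not from the original article): try the candidate
 * encodings in the given order, most likely first.
 *
 * @return string|false Encoding name reported by mbstring, or false if none fits.
 */
function detect_encoding(string $text, array $order)
{
    // Third argument true = strict mode: only well-formed input counts as a match.
    return mb_detect_encoding($text, $order, true);
}

$order = ['ASCII', 'GB2312', 'GBK', 'UTF-8'];

// Pure ASCII text matches the first candidate.
var_dump(detect_encoding('hello', $order));

// The ambiguous example from the text, URL-decoded (an assumption).
$keytitle = urldecode('%D0%BE%C6%AC');
$encode   = detect_encoding($keytitle, $order);

// Convert only when an encoding other than UTF-8 was reported.
$converted = ($encode !== false && $encode !== 'UTF-8')
    ? mb_convert_encoding($keytitle, 'UTF-8', $encode)
    : $keytitle;

var_dump($encode, mb_check_encoding($converted, 'UTF-8'));
```

Putting the GB encodings before UTF-8 only helps when the data really is more likely to be GB2312 or GBK. For input that is mostly UTF-8, the order should be reversed.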

The parameters of mb_detect_encoding() are: the input string to be checked, the order in which the encodings are tried (once one matches, the later ones are ignored), and the strict mode flag.
If the method above still does not solve the problem, here is another solution I found.

Example 1

The code is as follows

// Heuristic check: returns true if $word appears to contain UTF-8 encoded
// Chinese text (3-byte sequences with a lead byte of 0xE4-0xE9), false otherwise.
// It does not validate arbitrary UTF-8; a pure ASCII string returns false.
function is_utf8($word)
{
    // One 3-byte character: lead byte 0xE4-0xE9, then two continuation bytes 0x80-0xBF
    $char = "[" . chr(228) . "-" . chr(233) . "][" . chr(128) . "-" . chr(191) . "]{2}";

    // Match one such character at the start, one at the end, or two or more in a row anywhere
    return preg_match("/^(" . $char . ")/", $word) === 1
        || preg_match("/(" . $char . ")$/", $word) === 1
        || preg_match("/(" . $char . "){2,}/", $word) === 1;
} // function is_utf8
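
The is_utf8() function above only looks for a particular range of 3-byte sequences, so it is a heuristic for Chinese text rather than a general validity test. If the question is simply "are these bytes well-formed UTF-8?", PHP's built-in functions can answer it. The sketch below is not the article's method; the helper names is_valid_utf8() and is_valid_utf8_pcre() are hypothetical.

```php
<?php
// Hypothetical helper: strict well-formedness check using mbstring.
function is_valid_utf8(string $s): bool
{
    return mb_check_encoding($s, 'UTF-8');
}

// Hypothetical alternative without mbstring: with the /u modifier, PCRE
// refuses to process a subject that is not well-formed UTF-8.
function is_valid_utf8_pcre(string $s): bool
{
    // preg_match() returns 1 for a valid subject (the empty pattern always
    // matches) and false when the subject is malformed UTF-8.
    return preg_match('//u', $s) === 1;
}

$samples = [
    'ascii'  => 'hello',
    'utf8'   => "\xE4\xB8\xAD\xE6\x96\x87", // "中文" encoded as UTF-8
    'gb2312' => "\xD6\xD0\xCE\xC4",         // "中文" encoded as GB2312
];

foreach ($samples as $label => $bytes) {
    var_dump($label, is_valid_utf8($bytes), is_valid_utf8_pcre($bytes));
}
```

Both helpers accept plain ASCII, since ASCII is a subset of UTF-8, whereas is_utf8() returns false for it. Note that well-formedness alone cannot settle ambiguous input such as the D0 BE C6 AC example, which is valid in more than one encoding.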