
Common methods in PHP for detecting whether a string is UTF-8 encoded

WBOY
2016-07-13 10:13:32


This article summarizes common ways to detect in PHP whether a string is UTF-8 encoded. The implementations are as follows:

There are many ways to check a string's encoding, such as using ord() to read the value of each byte and testing it against the UTF-8 byte ranges, or calling the mb_detect_encoding() function. Here are four common methods for reference.

Example 1

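The code is as follows:
/**
 * Check whether a string is UTF-8 encoded by walking its bytes.
 * Only the byte structure is checked; overlong 3- and 4-byte forms
 * and surrogate code points are not rejected.
 * @param string $str The string to check
 * @return boolean
 */
function is_utf8($str) {
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $c = ord($str[$i]);
        if ($c >= 128) {                            // non-ASCII: must start a multi-byte sequence
            if ($c > 244) return false;             // 0xF5-0xFF never appear in UTF-8
            elseif ($c > 239) $bytes = 4;           // lead byte 0xF0-0xF4
            elseif ($c > 223) $bytes = 3;           // lead byte 0xE0-0xEF
            elseif ($c > 193) $bytes = 2;           // lead byte 0xC2-0xDF
            else return false;                      // stray continuation byte, or overlong lead 0xC0/0xC1
            if (($i + $bytes) > $len) return false; // sequence runs past the end of the string
            while ($bytes > 1) {                    // remaining bytes must be 0x80-0xBF
                $i++;
                $b = ord($str[$i]);
                if ($b < 128 || $b > 191) return false;
                $bytes--;
            }
        }
    }
    return true;
}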
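
The lead-byte test that a byte-walking check like Example 1 relies on can be sketched on its own. This is a minimal illustration, assuming PHP 7 or later and the lead-byte ranges permitted by the UTF-8 standard (RFC 3629); the helper name utf8_sequence_length() is hypothetical and is not part of the original examples.

```php
<?php
// Hypothetical helper for illustration only.
// Returns the total length in bytes of the UTF-8 sequence that starts with
// $byte (an integer 0-255), or 0 if $byte cannot start a sequence.
function utf8_sequence_length(int $byte): int
{
    if ($byte < 0x80) return 1;                    // 0xxxxxxx: ASCII
    if ($byte >= 0xC2 && $byte <= 0xDF) return 2;  // 110xxxxx (0xC0/0xC1 would be overlong)
    if ($byte >= 0xE0 && $byte <= 0xEF) return 3;  // 1110xxxx
    if ($byte >= 0xF0 && $byte <= 0xF4) return 4;  // 11110xxx (up to U+10FFFF)
    return 0; // continuation byte (0x80-0xBF) or a byte never used in UTF-8
}

// "A", U+00E9, U+4E2D and U+1F600: one character of each sequence length.
$sample = "A\u{00E9}\u{4E2D}\u{1F600}";

for ($i = 0, $len = strlen($sample); $i < $len; ) {
    $lead = ord($sample[$i]);
    $n = utf8_sequence_length($lead);
    printf("0x%02X starts a %d-byte sequence\n", $lead, $n);
    $i += max($n, 1); // always advance, even if the lead byte is invalid
}
```

A full validator still has to confirm that each lead byte is followed by the right number of continuation bytes in the range 0x80-0xBF, which is what the inner loop of Example 1 does.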

Example 2
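The code is as follows:
function is_utf8($string) {
    // Follows the widely circulated W3C UTF-8 validation pattern.
    // preg_match() returns 1 on a match, 0 on no match and false on error.
    return preg_match('%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
        |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
    )*$%xs', $string) === 1;
}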

Its accuracy is roughly the same as that of mb_detect_encoding(): the two give right and wrong answers on the same kinds of input.
Encoding detection can never be 100% accurate, but this approach is good enough for most needs.
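
One reason detection cannot be fully accurate is that the same bytes can be valid in more than one encoding, so a validity check says nothing certain about what the author intended. The sketch below illustrates this under stated assumptions: it uses the PCRE /u modifier as a stand-in validity check (preg_match() returns false when the subject is not valid UTF-8), and the helper name looks_like_utf8() is hypothetical.

```php
<?php
// Hypothetical helper: true when $bytes is well-formed UTF-8.
// With the /u modifier PCRE validates the subject first, and preg_match()
// returns false (not 0) when the bytes are not valid UTF-8.
function looks_like_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

$cafeLatin1 = "caf\xE9";  // "cafe" with e-acute, stored as ISO-8859-1
$ambiguous  = "\xC3\xA9"; // e-acute in UTF-8, or two characters in ISO-8859-1
$plainAscii = "hello";    // identical bytes in ASCII, ISO-8859-1 and UTF-8

var_dump(looks_like_utf8($cafeLatin1)); // bool(false): 0xE9 is not followed by continuation bytes
var_dump(looks_like_utf8($ambiguous));  // bool(true), even if the text was written as ISO-8859-1
var_dump(looks_like_utf8($plainAscii)); // bool(true), although the bytes carry no encoding hint
```

In other words, a false result reliably rules UTF-8 out, while a true result only means the bytes are compatible with UTF-8.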
Example 3

The code is as follows:
function mb_is_utf8($string)
{
    // The third argument enables strict mode, so invalid byte sequences fail detection
    return mb_detect_encoding($string, 'UTF-8', true) === 'UTF-8';
}
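
The effect of strict mode in mb_detect_encoding() can be seen with a short experiment. This is a sketch that assumes the mbstring extension is loaded; it also calls mb_check_encoding(), a related mbstring function that validates a string against a single named encoding instead of choosing among candidates. The variable names are illustrative.

```php
<?php
// Assumes the mbstring extension is loaded.
$utf8Text   = "\u{4E2D}\u{6587}";  // two CJK characters encoded as UTF-8
$latin1Text = "\xE1\xE9\xF3\xFA";  // four accented vowels encoded as ISO-8859-1

// Strict detection: returns the encoding name, or false when the bytes
// are not valid in any of the candidate encodings.
var_dump(mb_detect_encoding($utf8Text, 'UTF-8', true));   // string(5) "UTF-8"
var_dump(mb_detect_encoding($latin1Text, 'UTF-8', true)); // bool(false)

// Validation against one encoding, with a plain boolean result.
var_dump(mb_check_encoding($utf8Text, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($latin1Text, 'UTF-8')); // bool(false)
```

When the only question is whether a string is valid UTF-8, the boolean check is arguably the more direct fit, since no comparison against the returned encoding name is needed.
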
Example 4

The code is as follows:
// Heuristic check: returns true if $word starts with, ends with, or contains two or more
// consecutive 3-byte sequences whose lead byte is 0xE4-0xE9 (chr(228) to chr(233)),
// the range used by common Chinese characters. It does not validate the whole string,
// and it returns false for pure ASCII text.
function is_utf8($word) {
    // One 3-byte sequence: lead byte 0xE4-0xE9, then two continuation bytes 0x80-0xBF
    $seq = '[\xE4-\xE9][\x80-\xBF]{2}';
    return preg_match('/^' . $seq . '/', $word) === 1
        || preg_match('/' . $seq . '$/', $word) === 1
        || preg_match('/(?:' . $seq . '){2,}/', $word) === 1;
} // function is_utf8
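
The range chr(228) to chr(233) in Example 4 corresponds to lead bytes 0xE4 to 0xE9, which cover code points U+4000 to U+9FFF and therefore most common Chinese characters. The sketch below, assuming PHP 7 or later, shows which sample characters fall inside that range; the helper name in_e4_e9_range() is hypothetical.

```php
<?php
// Hypothetical helper: true when the first byte of $char is in 0xE4-0xE9,
// the lead-byte range that Example 4 looks for.
function in_e4_e9_range(string $char): bool
{
    $lead = ord($char[0]);
    return $lead >= 0xE4 && $lead <= 0xE9;
}

$samples = [
    'U+4E00' => "\u{4E00}", // first CJK Unified Ideograph, bytes E4 B8 80
    'U+9FFF' => "\u{9FFF}", // last code point with lead byte E9, bytes E9 BF BF
    'U+3042' => "\u{3042}", // Hiragana letter A, bytes E3 81 82
    'U+00E9' => "\u{00E9}", // Latin e-acute, bytes C3 A9
];

foreach ($samples as $label => $char) {
    printf(
        "%s: bytes %s, in range: %s\n",
        $label,
        strtoupper(bin2hex($char)),
        in_e4_e9_range($char) ? 'yes' : 'no'
    );
}
```

Text made up only of kana, accented Latin letters, or ASCII would therefore not be recognized by a check built on this range, even though it is valid UTF-8.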

I hope this article is helpful for your PHP programming.
