When substr is used to truncate a UTF-8 Chinese string, garbled characters often appear. Why does this happen? This article explains the reason.
Look at this piece of code (character encoding is UTF-8):
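<?php
// Note: the original article used a Chinese UTF-8 string here; this English
// translation of it will not reproduce the byte and character counts shown below.
$str = 'Everyone knows that strlen and mb_strlen are functions to find the length of a string';
// strlen() counts bytes; mb_strlen() counts characters in the given encoding.
echo strlen($str) . '<br />' . mb_strlen($str, 'utf-8');
?>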
Running this code on the original Chinese string prints the following:
66
34
See the difference? strlen counts bytes: in UTF-8, each Chinese character takes three bytes and each English letter takes one. mb_strlen counts characters, so every character, Chinese or English, counts as one. substr also works on bytes, so when it cuts a UTF-8 Chinese string, the cut can land in the middle of a three-byte character and leave a partial character that displays as garbled text. That is the reason!
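The difference between byte-based and character-based cutting can be seen in a minimal sketch like the one below. The sample string is a hypothetical example, not the one from the article, and the code assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical sample string (not from the article): four Chinese characters,
// each encoded as three bytes in UTF-8.
$sample = '你好世界';

$byteLength = strlen($sample);             // counts bytes
$charLength = mb_strlen($sample, 'UTF-8'); // counts characters

// substr() cuts by byte offset: 4 bytes is one whole character plus
// the first byte of the next one, which is not valid UTF-8.
$byteCut = substr($sample, 0, 4);

// mb_substr() cuts by character offset and keeps each character whole.
$charCut = mb_substr($sample, 0, 2, 'UTF-8');

echo $byteLength, "\n";
echo $charLength, "\n";
var_dump(mb_check_encoding($byteCut, 'UTF-8')); // is the byte cut valid UTF-8?
var_dump(mb_check_encoding($charCut, 'UTF-8')); // is the character cut valid UTF-8?
echo $charCut, "\n";
```

Here mb_check_encoding is used only to show whether each cut is still well-formed UTF-8.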
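The following function truncates a UTF-8 string without splitting Chinese characters. Its $cutlength argument is a display width rather than a byte count: each multi-byte character or uppercase letter counts as 1, and every other single-byte character counts as 0.5.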
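function cutstr($sourcestr, $cutlength) {
    $returnstr = '';
    $i = 0; // byte offset into $sourcestr
    $n = 0; // display width accumulated so far
    $str_length = strlen($sourcestr); // length in bytes
    while (($n < $cutlength) && ($i < $str_length)) {
        $temp_str = substr($sourcestr, $i, 1);
        $ascnum = ord($temp_str); // value of the lead byte
        if ($ascnum >= 240) {
            // Lead byte of a 4-byte sequence: copy all four bytes, width 1
            $returnstr = $returnstr . substr($sourcestr, $i, 4);
            $i = $i + 4;
            $n++;
        } elseif ($ascnum >= 224) {
            // Lead byte of a 3-byte sequence (most Chinese characters), width 1
            $returnstr = $returnstr . substr($sourcestr, $i, 3);
            $i = $i + 3;
            $n++;
        } elseif ($ascnum >= 192) {
            // Lead byte of a 2-byte sequence, width 1
            $returnstr = $returnstr . substr($sourcestr, $i, 2);
            $i = $i + 2;
            $n++;
        } elseif (($ascnum >= 65) && ($ascnum <= 90)) {
            // Uppercase ASCII letter, width 1
            $returnstr = $returnstr . substr($sourcestr, $i, 1);
            $i = $i + 1;
            $n++;
        } else {
            // Any other single byte (lowercase, digits, punctuation), width 0.5
            $returnstr = $returnstr . substr($sourcestr, $i, 1);
            $i = $i + 1;
            $n = $n + 0.5;
        }
    }
    // Append an ellipsis only if part of the string was actually cut off
    if ($i < $str_length) {
        $returnstr = $returnstr . "...";
    }
    return $returnstr;
}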
Usage example:
<?php
$str = 'The validity period is up to three months, beyond the validity period The system will automatically delete this message';
//echo strlen($str);
//echo '<br />' . mb_strlen($str, 'utf-8');
echo '<br />' . $str;               // the full string
echo '<br />' . cutstr($str, 24);   // truncated to a display width of 24
?>
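A width-based cut similar to the usage above can also be sketched with the built-in mb_strimwidth, which, as far as I know, counts East Asian wide characters as width 2 and ASCII characters as width 1 (the same 2:1 ratio as the 1 and 0.5 used by cutstr) and includes the trim marker in the width limit. The sample string below is a hypothetical example, and the mbstring extension is assumed.

```php
<?php
// Hypothetical sample string (not from the article): six Chinese characters.
$text = '你好世界你好';

// Limit the result, including the '...' marker, to a display width of 8.
$trimmed = mb_strimwidth($text, 0, 8, '...', 'UTF-8');

echo $trimmed, "\n";
echo mb_strwidth($trimmed, 'UTF-8'), "\n";      // display width of the result
var_dump(mb_check_encoding($trimmed, 'UTF-8')); // is the result still valid UTF-8?
```

Because the widths are on a different scale, a cutstr limit of 24 corresponds only roughly to an mb_strimwidth limit of 48, and the two functions will not always cut at the same place.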