PHP: determining string length with the strlen() and mb_strlen() functions
strlen()
PHP strlen() function
Definition and usage
The strlen() function returns the length of a string, measured in bytes.
Syntax
strlen(string)
Parameter: string
Description: Required. Specifies the string whose length is to be measured.
The code is as follows
<?php
$str = '中文a字1符';
echo strlen($str);              // Output: 14
echo '<br />';
echo mb_strlen($str, 'UTF-8');  // Output: 6
?>
Result analysis: strlen() counts bytes, and a Chinese character takes 3 bytes in UTF-8, so the length of '中文a字1符' is 3*4+2=14.
mb_strlen() counts characters. With the encoding set to UTF-8, each Chinese character counts as 1, so the length of '中文a字1符' is 6.
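To see where the 14 comes from, the sketch below splits the string into characters and reports how many bytes each one occupies. The helper name byteCounts() is hypothetical (it is not a PHP built-in), and the script is assumed to be saved as UTF-8.

```php
<?php
// Assumes this file is saved as UTF-8.
// byteCounts() is a hypothetical helper, not a PHP built-in.
function byteCounts(string $str): array
{
    // The /u flag makes PCRE split on UTF-8 characters instead of bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    $counts = [];
    foreach ($chars as $ch) {
        $counts[] = strlen($ch); // strlen() counts bytes
    }
    return $counts;
}

$counts = byteCounts('中文a字1符');
echo implode('+', $counts), ' = ', array_sum($counts), "\n"; // 3+3+1+3+1+3 = 14
```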
mb_strlen() function
Note that mb_strlen() is not a PHP core function; it is provided by the mbstring extension. Before using it, make sure the extension is loaded. On Windows, this means the line "extension=php_mbstring.dll" must exist in php.ini and must not be commented out (PHP 7.2 and later also accept "extension=mbstring"). Otherwise, calling the function fails with an undefined-function error.
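A script can also check for the extension at run time instead of failing. The following is a minimal sketch, assuming valid UTF-8 input; charLength() is a hypothetical wrapper name, and the fallback counts every byte that is not a UTF-8 continuation byte.

```php
<?php
// charLength() is a hypothetical wrapper; input is assumed to be valid UTF-8.
function charLength(string $str): int
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }
    // Fallback: count every byte that is not a continuation byte (10xxxxxx).
    // Each UTF-8 character contributes exactly one such byte.
    return (int) preg_match_all('/[^\x80-\xBF]/', $str);
}

echo extension_loaded('mbstring') ? "mbstring is loaded\n" : "mbstring is missing\n";
echo charLength('中文a字1符'), "\n"; // 6
```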
The code is as follows
<?php
$str = '中文a字1符';
// Average the byte length and the character length
echo (strlen($str) + mb_strlen($str, 'UTF-8')) / 2;  // Output: 10
?>
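For '中文a字1符', strlen($str) is 14 and mb_strlen($str, 'UTF-8') is 6, so the display width (the number of placeholder columns) is (14 + 6) / 2 = 10. The average works because a 3-byte Chinese character contributes (3 + 1) / 2 = 2 columns and an ASCII character contributes (1 + 1) / 2 = 1 column.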
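The averaging formula is only exact when every non-ASCII character takes 3 bytes. The sketch below compares it with mb_strwidth(), an mbstring function that counts fullwidth characters as 2 and halfwidth characters as 1. It assumes the mbstring extension is loaded and the file is saved as UTF-8; averagedWidth() is a hypothetical name for the formula above.

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
// averagedWidth() is a hypothetical name for the averaging formula.
function averagedWidth(string $str)
{
    return (strlen($str) + mb_strlen($str, 'UTF-8')) / 2;
}

$mixed = '中文a字1符';
echo averagedWidth($mixed), "\n";         // 10
echo mb_strwidth($mixed, 'UTF-8'), "\n";  // 10

// U+00E9 (e with acute accent) takes 2 bytes in UTF-8,
// so the averaged formula gives (2 + 1) / 2 = 1.5.
$accented = "\u{E9}";
echo averagedWidth($accented), "\n";        // 1.5
echo mb_strwidth($accented, 'UTF-8'), "\n"; // 1
```

For text that may contain 2-byte or 4-byte characters, mb_strwidth() is likely the safer choice.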
The difference between the two
The code is as follows
<?php
// The file must be saved as UTF-8 for this test
$str = '中文a字1符';
echo strlen($str) . '<br>';               // 14
echo mb_strlen($str, 'utf8') . '<br>';    // 6
echo mb_strlen($str, 'gbk') . '<br>';     // 8
echo mb_strlen($str, 'gb2312') . '<br>';  // 10
?>
Result analysis: strlen() counts bytes, and a Chinese character takes 3 bytes in UTF-8, so the length of '中文a字1符' is 3*4+2=14. mb_strlen() with the encoding set to UTF-8 counts each Chinese character as 1, so the length is 6. The gbk and gb2312 results come from reading the UTF-8 bytes as if they were in a different encoding, so they do not match the real character count.
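The encoding passed to mb_strlen() should match the encoding the string is actually stored in. As a small sketch (assuming the mbstring extension is loaded and the file is saved as UTF-8), the same text can be re-encoded with mb_convert_encoding() and measured again:

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
$utf8 = '中文a字1符';
$gbk  = mb_convert_encoding($utf8, 'GBK', 'UTF-8'); // same text, GBK bytes

echo strlen($utf8), "\n";             // 14: 4 * 3 bytes + 2 ASCII bytes
echo strlen($gbk), "\n";              // 10: 4 * 2 bytes + 2 ASCII bytes
echo mb_strlen($utf8, 'UTF-8'), "\n"; // 6
echo mb_strlen($gbk, 'GBK'), "\n";    // 6
```

The byte length depends on the encoding, while the character count stays at 6 as long as the declared encoding is the real one.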
Although the functions above handle simple cases of mixed Chinese and English text, they are not sufficient in practice. Some more practical solutions follow.
The following function returns the length of a mixed Chinese and English string, counting 1 Chinese character as 1 unit and 2 English characters as 1 unit. Adjust the weights as needed.
The code is as follows
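/**
 * Get the length of a mixed Chinese/English string.
 *
 * @param string $str     The string to measure
 * @param string $charset Encoding of $str
 * @return float Length, where 1 Chinese character = 1 unit and 2 English characters = 1 unit
 */
function strLength($str, $charset = 'utf-8')
{
    // Convert to GB2312 so that every Chinese character takes exactly 2 bytes
    if ($charset == 'utf-8') {
        $str = iconv('utf-8', 'gb2312', $str);
    }
    $num = strlen($str);
    $cnNum = 0;
    for ($i = 0; $i < $num; $i++) {
        // A byte above 127 is the first byte of a 2-byte Chinese character
        if (ord(substr($str, $i, 1)) > 127) {
            $cnNum++;
            $i++; // skip the second byte
        }
    }
    $enNum = $num - ($cnNum * 2);
    $number = ($enNum / 2) + $cnNum;
    return ceil($number);
}

// Test: every call prints 15. This file is assumed to be saved as GB2312;
// for a file saved as UTF-8, pass 'utf-8' instead.
$str1 = '测试测试测试测试测试测试测试测';
$str2 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
$str3 = 'aa测试aa测试aa测试aa测试aaaaaa';
echo strLength($str1, 'gb2312');
echo strLength($str2, 'gb2312');
echo strLength($str3, 'gb2312');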
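The same weighting can be computed directly on UTF-8 text without converting to GB2312 first. This is a rough sketch rather than a drop-in replacement: weightedLength() is a hypothetical name, the input is assumed to be valid UTF-8, and every multi-byte character (not only Chinese ones) counts as 1 unit.

```php
<?php
// weightedLength() is a hypothetical UTF-8-only variant of strLength().
// Assumes valid UTF-8 input. Any multi-byte character counts as 1 unit,
// any single-byte (ASCII) character counts as half a unit.
function weightedLength(string $str): int
{
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    $wide = 0;
    $narrow = 0;
    foreach ($chars as $ch) {
        if (strlen($ch) > 1) {
            $wide++;
        } else {
            $narrow++;
        }
    }
    return (int) ceil($wide + $narrow / 2);
}

echo weightedLength(str_repeat('测试', 7) . '测'), "\n";        // 15
echo weightedLength(str_repeat('a', 30)), "\n";                 // 15
echo weightedLength(str_repeat('aa测试', 4) . 'aaaaaa'), "\n";  // 15
```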
String truncation functions
UTF-8 encoding: in UTF-8, one Chinese character occupies 3 bytes
The code is as follows
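function msubstr($str, $start, $len) {
    $tmpstr = "";
    // $start and $len are byte offsets; never read past the end of the string
    $end = min($start + $len, strlen($str));
    for ($i = 0; $i < $end; $i++) {
        if (ord(substr($str, $i, 1)) > 127) {
            // Non-ASCII lead byte: assume a 3-byte character such as a Chinese one
            if ($i >= $start) {
                $tmpstr .= substr($str, $i, 3);
            }
            $i += 2;
        } else {
            if ($i >= $start) {
                $tmpstr .= substr($str, $i, 1);
            }
        }
    }
    return $tmpstr;
}
echo msubstr("一二三天下致公english", 0, 10); // 一二三天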
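The function above assumes that every non-ASCII character is 3 bytes long, which does not hold for 2-byte or 4-byte UTF-8 characters. A possible variant reads the sequence length from the lead byte. This is a sketch only: utf8Substr() is a hypothetical name, the input is assumed to be valid UTF-8, and $start and $len are non-negative and counted in characters rather than bytes.

```php
<?php
// utf8Substr() is a hypothetical helper; it assumes valid UTF-8 input and
// takes non-negative $start and $len in characters, not bytes.
function utf8Substr(string $str, int $start, int $len): string
{
    $out = '';
    $index = 0;             // index of the current character
    $total = strlen($str);  // length in bytes
    for ($i = 0; $i < $total && $index < $start + $len; $index++) {
        $byte = ord($str[$i]);
        // The lead byte tells how many bytes the character occupies.
        if ($byte < 0x80) {
            $size = 1;
        } elseif ($byte < 0xE0) {
            $size = 2;
        } elseif ($byte < 0xF0) {
            $size = 3;
        } else {
            $size = 4;
        }
        if ($index >= $start) {
            $out .= substr($str, $i, $size);
        }
        $i += $size;
    }
    return $out;
}

echo utf8Substr('一二三天下致公english', 0, 10), "\n"; // 一二三天下致公eng
echo utf8Substr('一二三天下致公english', 2, 4), "\n";  // 三天下致
```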
GB2312 encoding: in GB2312, one Chinese character occupies 2 bytes
The code is as follows
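<?php
function msubstr($str, $start, $len) {
    $tmpstr = "";
    $strlen = $start + $len;
    // Shorten the result by 2 bytes when the string contains a run of digits or whitespace
    if (preg_match('/[\d\s]{2,}/', $str)) {
        $strlen = $strlen - 2;
    }
    // Never read past the end of the string
    $strlen = min($strlen, strlen($str));
    for ($i = 0; $i < $strlen; $i++) {
        if (ord(substr($str, $i, 1)) > 0xa0) {
            // Lead byte of a 2-byte GB2312 character
            if ($i >= $start) {
                $tmpstr .= substr($str, $i, 2);
            }
            $i++;
        } else {
            if ($i >= $start) {
                $tmpstr .= substr($str, $i, 1);
            }
        }
    }
    return $tmpstr;
}
?>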
A function with better compatibility
The code is as follows
function cc_msubstr($str, $start, $length, $charset = "utf-8", $suffix = true) {
    // Prefer the multibyte-aware built-ins when they are available
    if (function_exists("mb_substr")) {
        return mb_substr($str, $start, $length, $charset);
    } elseif (function_exists('iconv_substr')) {
        return iconv_substr($str, $start, $length, $charset);
    }
    // Fallback: match one character at a time with a per-encoding pattern.
    // $suffix is only applied on this path.
    $re['utf-8']  = "/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/";
    $re['gb2312'] = "/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/";
    $re['gbk']    = "/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/";
    $re['big5']   = "/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/";
    preg_match_all($re[$charset], $str, $match);
    $slice = join("", array_slice($match[0], $start, $length));
    if ($suffix) {
        return $slice . "…";
    }
    return $slice;
}
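The regular-expression fallback can be tried on its own to see how it splits a string into characters. The sketch below uses only the UTF-8 pattern; regexSlice() is a hypothetical name, and the file is assumed to be saved as UTF-8.

```php
<?php
// Assumes this file is saved as UTF-8. regexSlice() is a hypothetical name.
function regexSlice(string $str, int $start, int $length): string
{
    // One alternative per UTF-8 sequence length: 1, 2, 3 and 4 bytes.
    $re = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/';
    preg_match_all($re, $str, $match);
    // $match[0] now holds one array element per character.
    return implode('', array_slice($match[0], $start, $length));
}

echo regexSlice('一二三天下致公english', 2, 4), "\n"; // 三天下致
echo regexSlice('一二三天下致公english', 5, 4), "\n"; // 致公en
```

Because the pattern is applied without the /u flag, PCRE works on raw bytes, which is what lets the byte ranges describe each character's layout.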