PHP strlen() function
Definition and usage
The strlen() function returns the length of a string, measured in bytes.
Parameters: string
Description: Required. Specifies the string to check.
The code is as follows
<?php
$str = '中文a字1符';
echo strlen($str);              // 14 (bytes)
echo '<br />';
echo mb_strlen($str, 'UTF-8');  // 6 (characters)
?>
Result analysis: strlen() counts bytes. Each Chinese character takes 3 bytes in UTF-8, so the length of '中文a字1符' is 3*4+2=14.
mb_strlen() counts characters. When the encoding is given as UTF-8, each Chinese character counts as 1, so the length of '中文a字1符' is 6.
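To make the byte-versus-character distinction concrete, here is a minimal sketch that counts UTF-8 characters by hand. The helper name utf8CharCount is hypothetical (it is not part of PHP), and the sketch assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper: counts UTF-8 characters without the mbstring extension.
// Assumes $str is valid UTF-8.
function utf8CharCount(string $str): int
{
    $count = 0;
    $bytes = strlen($str);
    for ($i = 0; $i < $bytes; $i++) {
        // Continuation bytes have the bit pattern 10xxxxxx;
        // every other byte starts a new character.
        if ((ord($str[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

$str = '中文a字1符';
echo strlen($str), "\n";        // 14
echo utf8CharCount($str), "\n"; // 6
```

Each Chinese character contributes one lead byte and two continuation bytes, which is why the byte count is 14 while the character count is 6.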
mb_strlen() function
Note that mb_strlen() is not a PHP core function. It belongs to the mbstring extension, so before using it make sure the extension is loaded. On Windows, check that the line "extension=php_mbstring.dll" (or "extension=mbstring" on PHP 7.2 and later) exists in php.ini and is not commented out. Otherwise the call fails with an undefined function error.
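One way to guard against a missing extension is to test for the function at run time. The sketch below is only an illustration: safeStrlen is a hypothetical name, and the fallback assumes the input is valid UTF-8.

```php
<?php
// Hypothetical wrapper: uses mb_strlen() when the mbstring extension is
// loaded, otherwise falls back to a byte-level count.
function safeStrlen(string $str): int
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }
    // Fallback (assumes valid UTF-8): remove continuation bytes 0x80-0xBF,
    // leaving exactly one byte per character.
    return strlen(preg_replace('/[\x80-\xBF]/', '', $str));
}

echo extension_loaded('mbstring') ? "mbstring loaded\n" : "mbstring missing\n";
echo safeStrlen('中文a字1符'), "\n"; // 6
```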
The code is as follows
<?php
$str = '中文a字1符';
// Compute the display width
echo (strlen($str) + mb_strlen($str, 'UTF-8')) / 2;
// Output: 10
?>
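For '中文a字1符', strlen($str) is 14 and mb_strlen($str, 'UTF-8') is 6, so the display width of the string (each Chinese character occupying 2 columns and each ASCII character 1) works out to 10. This works because, for n Chinese characters and m ASCII characters, strlen() returns 3n+m and mb_strlen() returns n+m, and half of their sum is 2n+m.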
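The same arithmetic can be wrapped in a small function. This is a sketch under stated assumptions: displayWidth is a hypothetical name, mbstring must be loaded, and the string must be UTF-8 containing only ASCII and 3-byte characters (the formula does not hold for 2-byte or 4-byte characters).

```php
<?php
// Hypothetical helper: display width of a UTF-8 string made of ASCII and
// 3-byte characters only. Assumes the mbstring extension is loaded.
function displayWidth(string $str): int
{
    $bytes = strlen($str);             // 3n + m
    $chars = mb_strlen($str, 'UTF-8'); // n + m
    return intdiv($bytes + $chars, 2); // (4n + 2m) / 2 = 2n + m
}

echo displayWidth('中文a字1符'), "\n"; // 10
echo displayWidth('abc'), "\n";        // 3
```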
The difference between the two
The code is as follows
<?php
// For this test, the file must be saved as UTF-8
$str = '中文a字1符';
echo strlen($str) . '<br>';               // 14
echo mb_strlen($str, 'utf8') . '<br>';    // 6
echo mb_strlen($str, 'gbk') . '<br>';     // 8
echo mb_strlen($str, 'gb2312') . '<br>';  // 10
?>
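The counts only make sense when the encoding passed to mb_strlen() matches the actual bytes of the string. As a sketch, assuming mbstring is loaded and this file is saved as UTF-8, converting the string to GBK first gives a consistent result:

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is loaded.
$utf8 = '中文a字1符';
$gbk  = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

echo strlen($utf8), "\n";             // 14: 4 characters x 3 bytes + 2 ASCII bytes
echo strlen($gbk), "\n";              // 10: 4 characters x 2 bytes + 2 ASCII bytes
echo mb_strlen($utf8, 'UTF-8'), "\n"; // 6
echo mb_strlen($gbk, 'GBK'), "\n";    // 6
```

The byte length depends on the encoding, while the character count is 6 in both cases once the declared encoding matches the data.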
The functions above handle simple cases of mixed Chinese and English text, but they are not enough for practical use. Some better solutions follow.
The following function returns the length of a mixed Chinese and English string, where 1 Chinese character counts as 1 unit and 2 English characters count as 1 unit. You can adapt it as needed.
The code is as follows
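<?php
/**
 * Get the length of a mixed Chinese and English string.
 *
 * @param string $str     The string
 * @param string $charset Encoding of $str ('utf-8' or 'gb2312')
 * @return int Length, where 1 Chinese character = 1 unit and 2 English characters = 1 unit
 */
function strLength($str, $charset = 'utf-8')
{
    // Work on GB2312 bytes, where a Chinese character is exactly 2 bytes
    if ($charset == 'utf-8') {
        $str = iconv('utf-8', 'gb2312', $str);
    }
    $num = strlen($str);
    $cnNum = 0;
    for ($i = 0; $i < $num; $i++) {
        // A byte above 127 is the lead byte of a 2-byte character
        if (ord(substr($str, $i, 1)) > 127) {
            $cnNum++;
            $i++; // skip the trail byte
        }
    }
    $enNum = $num - ($cnNum * 2);
    $number = ($enNum / 2) + $cnNum;
    return (int) ceil($number);
}

// Test: every call outputs 15. This assumes the file is saved as GB2312;
// in a UTF-8 file, omit the second argument.
$str1 = '测试测试测试测试测试测试测试测';
$str2 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
$str3 = 'aa测试aa测试aa测试aa测试aaaaaa';
echo strLength($str1, 'gb2312');
echo strLength($str2, 'gb2312');
echo strLength($str3, 'gb2312');
?>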
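When mbstring is available, a similar count can be sketched with the built-in mb_strwidth(), which counts full-width characters (such as Chinese) as 2 and half-width characters (such as ASCII) as 1. The name unitLength is hypothetical, and this is an alternative technique rather than a drop-in copy of strLength(): results may differ for characters that mb_strwidth() does not treat as full-width.

```php
<?php
// Hypothetical alternative to strLength(); assumes mbstring is loaded and
// this file is saved as UTF-8.
function unitLength(string $str, string $charset = 'UTF-8'): int
{
    // Width is 2 per Chinese character and 1 per ASCII character, so
    // half the width gives "1 Chinese = 1 unit, 2 English = 1 unit".
    return (int) ceil(mb_strwidth($str, $charset) / 2);
}

$str1 = str_repeat('测试', 7) . '测';          // 15 Chinese characters
$str2 = str_repeat('a', 30);                   // 30 ASCII characters
$str3 = str_repeat('aa测试', 4) . 'aaaaaa';    // 14 ASCII + 8 Chinese characters
echo unitLength($str1), "\n"; // 15
echo unitLength($str2), "\n"; // 15
echo unitLength($str3), "\n"; // 15
```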
String truncation functions
UTF-8 encoding: in UTF-8, one Chinese character occupies 3 bytes
The code is as follows
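<?php
function msubstr($str, $start, $len)
{
    // $start and $len are byte offsets; $start must fall on a character boundary
    $tmpstr = "";
    $strlen = $start + $len;
    for ($i = $start; $i < $strlen; $i++) {
        if (ord(substr($str, $i, 1)) > 127) {
            // Non-ASCII lead byte: copy the whole 3-byte character
            $tmpstr .= substr($str, $i, 3);
            $i += 2;
        } else {
            $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}
echo msubstr("一二三天下致公english", 0, 10); // 一二三天
?>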
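For comparison, mbstring offers two built-in ways to cut a UTF-8 string safely. The sketch below assumes mbstring is loaded and the file is saved as UTF-8: mb_substr() measures in characters, while mb_strcut() measures in bytes but avoids splitting a character.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is loaded.
$str = '一二三天下致公english';

// mb_substr(): $start and $length are measured in characters.
$firstFive = mb_substr($str, 0, 5, 'UTF-8');
echo $firstFive, "\n"; // 一二三天下

// mb_strcut(): $start and $length are measured in bytes, but the cut is
// adjusted to a character boundary so the result stays valid UTF-8.
$cut = mb_strcut($str, 0, 10, 'UTF-8');
echo $cut, "\n";
echo strlen($cut), "\n"; // never more than 10
```

mb_strcut() is the closer match when the limit is a storage size in bytes, and mb_substr() when the limit is a number of visible characters.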
GB2312 encoding: in GB2312, one Chinese character occupies 2 bytes
The code is as follows
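<?php
function msubstr($str, $start, $len)
{
    // $start and $len are byte offsets; $start must fall on a character boundary
    $tmpstr = "";
    $strlen = $start + $len;
    // Heuristic: cut 2 bytes earlier when the string contains a run of digits or whitespace
    if (preg_match('/[\d\s]{2,}/', $str)) {
        $strlen = $strlen - 2;
    }
    for ($i = $start; $i < $strlen; $i++) {
        if (ord(substr($str, $i, 1)) > 0xa0) {
            // Lead byte of a 2-byte GB2312 character: copy both bytes
            $tmpstr .= substr($str, $i, 2);
            $i++;
        } else {
            $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}
?>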
A more compatible function
The code is as follows
<?php
function cc_msubstr($str, $start, $length, $charset = "utf-8", $suffix = true)
{
    $charset = strtolower($charset);
    if (function_exists("mb_substr")) {
        $slice = mb_substr($str, $start, $length, $charset);
    } elseif (function_exists('iconv_substr')) {
        $slice = iconv_substr($str, $start, $length, $charset);
    } else {
        // Fallback: split the string into characters with an encoding-specific pattern
        $re['utf-8']  = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/';
        $re['gb2312'] = '/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/';
        $re['gbk']    = '/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/';
        $re['big5']   = '/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/';
        preg_match_all($re[$charset], $str, $match);
        $slice = implode("", array_slice($match[0], $start, $length));
    }
    // Append the ellipsis when requested
    return $suffix ? $slice . "…" : $slice;
}
?>
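A common refinement is to append the ellipsis only when the string was actually shortened. The following is a minimal sketch of that idea, not part of the function above: truncateWithEllipsis is a hypothetical name, and it assumes mbstring is loaded and the file is saved as UTF-8.

```php
<?php
// Hypothetical helper; assumes the mbstring extension is loaded.
function truncateWithEllipsis(string $str, int $max, string $charset = 'UTF-8'): string
{
    if (mb_strlen($str, $charset) <= $max) {
        return $str; // nothing was cut, so no ellipsis is added
    }
    return mb_substr($str, 0, $max, $charset) . '…';
}

echo truncateWithEllipsis('一二三天下致公english', 5), "\n"; // 一二三天下…
echo truncateWithEllipsis('短文本', 5), "\n";                // 短文本
```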