Example of getting the character length of a UTF-8 string in PHP
When I was writing the form validation class of the framework tonight, I needed to determine whether the length of a certain string was within a specified range. Naturally, I thought of the strlen function in PHP.
Testing with Chinese:
The code is as follows:
$str = 'Hello, world! ';
PHP's built-in strlen function cannot handle Chinese strings correctly: it returns the number of bytes the string occupies, not the number of characters. For GB2312-encoded Chinese, strlen returns twice the number of Chinese characters; for UTF-8 encoded Chinese, it returns three times the number (in UTF-8, a common Chinese character occupies 3 bytes).
The following example is taken from the well-known WordPress project and is very accurate. Note that this function only applies to strings encoded in UTF-8.
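To make the byte-versus-character difference concrete, here is a minimal sketch. It assumes the mbstring extension is loaded (mb_strlen is an addition for comparison, not part of the original article), and it writes the test strings as byte escapes so the result does not depend on how the source file is saved.

```php
<?php
// The two-character string "你好" written as raw bytes in each encoding.
$utf8 = "\xE4\xBD\xA0\xE5\xA5\xBD"; // UTF-8: 3 bytes per character
$gbk  = "\xC4\xE3\xBA\xC3";         // GBK/GB2312: 2 bytes per character

$utf8Bytes = strlen($utf8);             // strlen counts bytes
$gbkBytes  = strlen($gbk);              // strlen counts bytes
$utf8Chars = mb_strlen($utf8, 'UTF-8'); // mb_strlen counts characters

echo "UTF-8 bytes: $utf8Bytes, GBK bytes: $gbkBytes, characters: $utf8Chars\n";
```

For this string, strlen reports 6 bytes for the UTF-8 form and 4 bytes for the GBK form, while the string contains only 2 characters.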
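A minimal sketch of one common way to count the characters of a UTF-8 string is to match every character with a regular expression and count the matches. The function name utf8_char_count is hypothetical, and this is not necessarily the WordPress code mentioned above; it only illustrates the general idea, assuming PCRE was built with UTF-8 support.

```php
<?php
/**
 * Count the characters in a UTF-8 string.
 * Hypothetical helper for illustration, not the WordPress original.
 */
function utf8_char_count(string $string): int
{
    // With the "u" modifier, "." matches one UTF-8 character rather than one byte;
    // the "s" modifier lets "." match newlines as well.
    $count = preg_match_all('/./us', $string);

    // preg_match_all returns false when the subject is not valid UTF-8.
    if ($count === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    return $count;
}
```

Called on the UTF-8 bytes of "你好" ("\xE4\xBD\xA0\xE5\xA5\xBD"), this returns 2, where strlen would return 6. Throwing on invalid input is a design choice made for this sketch, so that a string in another encoding is not silently counted as empty.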
However, the code above cannot handle GBK/GB2312-encoded Chinese strings, because it assumes UTF-8 input: each GBK/GB2312 Chinese character is treated as two characters, so the calculated number of Chinese characters doubles. So I came up with this idea:
The code is as follows:
$tmp = @iconv('gbk', 'utf-8', $str);
This is compatible with both GBK/GB2312 and UTF-8 encodings and passed testing with a small amount of data, but I have not yet confirmed that it is correct in every case.
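One possible way to make the idea more predictable is to check for valid UTF-8 before attempting any conversion, since some UTF-8 byte sequences may also be readable as GBK and would then be miscounted. The sketch below is an assumption-laden variant, not the author's code: the name mixed_char_count is hypothetical, it requires both the mbstring and iconv extensions, and it assumes that any input that is not valid UTF-8 is GBK/GB2312.

```php
<?php
/**
 * Count characters in a string that is either UTF-8 or GBK/GB2312.
 * Hypothetical helper; the encoding detection is a heuristic.
 */
function mixed_char_count(string $str): int
{
    // Check for valid UTF-8 first so UTF-8 input is never reinterpreted as GBK.
    if (!mb_check_encoding($str, 'UTF-8')) {
        // Assumption: input that is not valid UTF-8 is GBK/GB2312.
        $converted = @iconv('GBK', 'UTF-8', $str);
        if ($converted === false) {
            throw new InvalidArgumentException('Input is neither valid UTF-8 nor GBK');
        }
        $str = $converted;
    }

    // At this point $str is UTF-8, so count its characters.
    return mb_strlen($str, 'UTF-8');
}
```

With this sketch, both "\xE4\xBD\xA0\xE5\xA5\xBD" (UTF-8) and "\xC4\xE3\xBA\xC3" (GBK) count as 2 characters. It remains a heuristic: a short GBK string whose bytes happen to form valid UTF-8 would still be counted as UTF-8.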