Why does mb_strlen return 3 with the gbk encoding?
php > $s="你好";
php > echo mb_strlen($s,"utf8");
2
UTF-8 returns 2, which I understand.
php > echo mb_strlen($s,"gb2312");
4
This returns 4, which I also understand.
php > echo mb_strlen($s,"gbk");
3
Why does this return 3? This is the part I don't understand.
Because $s is UTF-8 encoded, and you are measuring its length as GBK without first converting it to GBK.
The UTF-8 bytes of 你好, read as GBK, decode to 浣犲ソ, so the length is 3.
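To make the byte-level picture concrete, here is a minimal sketch. It assumes $s holds the two-character string 你好, that the script file is saved as UTF-8, and that the mbstring extension is loaded; the variable name $garbled is only for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is available.
$s = "你好";

// The raw bytes stored in $s: 3 bytes per character in UTF-8, 6 in total.
echo bin2hex($s), "\n";   // e4bda0e5a5bd
echo strlen($s), "\n";    // 6

// Counting the same 6 bytes as GBK pairs them up two at a time:
// e4bd | a0e5 | a5bd, which is 3 "characters".
echo mb_strlen($s, 'GBK'), "\n";   // 3

// To see the garbled text, reinterpret the bytes as GBK and
// convert them to UTF-8 for display.
$garbled = mb_convert_encoding($s, 'UTF-8', 'GBK');
echo $garbled, "\n";
```

The second argument of mb_strlen only tells the function how to interpret the bytes; it never converts them.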
This is what you should do:
<code>// Convert from UTF-8 to the target encoding first, then count characters.
$a = mb_strlen(iconv('utf-8', 'gbk', $s), 'gbk');
$b = mb_strlen(iconv('utf-8', 'gb2312', $s), 'gb2312');
</code>
In other words, the GB2312 result of 4 is wrong for the same reason.
mb_strlen returns the number of characters, so 2 is the only correct result. I'm not sure how you came to understand the results 4 and 3.
When you write $s = "你好", $s holds a UTF-8 encoded string (the encoding follows that of your source file). If you decode those bytes as GBK or GB2312, you can get garbled text, so 4 and 3 are the lengths of the garbled text.
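One way to guard against this mismatch is to validate the bytes against the encoding you believe they are in before counting. The sketch below is only an illustration: charCount is a hypothetical helper, not part of PHP or of the original answers, and it assumes a UTF-8 source file with mbstring loaded.

```php
<?php
// Hypothetical helper: count characters only after checking that the bytes
// are valid in the encoding they are claimed to be in.
function charCount(string $bytes, string $actualEncoding): int
{
    if (!mb_check_encoding($bytes, $actualEncoding)) {
        throw new InvalidArgumentException("Bytes are not valid $actualEncoding");
    }
    return mb_strlen($bytes, $actualEncoding);
}

// Assumption: this file is saved as UTF-8, so $s holds UTF-8 bytes.
$s = "你好";
echo charCount($s, 'UTF-8'), "\n";   // 2

// Convert the bytes to GBK first, then count them as GBK.
$gbk = mb_convert_encoding($s, 'GBK', 'UTF-8');
echo strlen($gbk), "\n";             // 4 (2 bytes per character in GBK)
echo charCount($gbk, 'GBK'), "\n";   // 2
```

The validity check is not a complete safeguard: a byte sequence can be well-formed in more than one encoding, as the UTF-8 bytes of 你好 happen to be in GBK, so knowing the real encoding of your data still matters.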