mb_substr_PHP tutorial

WBOY (original)
2016-07-13 10:51:07



Question
mb_substr
Solution
$string = "中华人民共和国";

$mystring = mb_substr($string, 0, 6, 'UTF-8');

echo $mystring;


I read in a book that under UTF-8 encoding one Chinese character occupies 3 bytes, while under GB2312/GBK encoding one Chinese character occupies 2 bytes.

So shouldn't the code above output "中华"?


Reference answer
It outputs "中华人民共和".
Reference answer
That 6 is the number of characters, not the number of bytes.
Please read the manual carefully-.-
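
To make the character/byte distinction concrete, here is a minimal sketch. It assumes the mbstring extension is loaded, that the file is saved as UTF-8, and that the string in the question was the Chinese literal 中华人民共和国 (an assumption inferred from the "中华" the asker expected).

```php
<?php
// Assumes mbstring is loaded and this file is saved as UTF-8.
$string = "中华人民共和国"; // 7 characters, 21 bytes in UTF-8

$byChars = mb_substr($string, 0, 6, 'UTF-8'); // first 6 characters
$byBytes = substr($string, 0, 6);             // first 6 bytes

echo $byChars, "\n";                    // 中华人民共和
echo $byBytes, "\n";                    // 中华
echo strlen($string), "\n";             // 21
echo mb_strlen($string, 'UTF-8'), "\n"; // 7
```

The byte-oriented substr() gives the "中华" the asker expected, while mb_substr() counts whole characters.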
Reference answer
Remove the 'UTF-8' argument and have a look...
Reference answer
Originally posted by An on 2008-11-7 13:28:
"Remove the 'UTF-8' argument and see..."
If the encoding parameter is not passed in, as far as I remember the mbstring.internal_encoding setting from php.ini is used by default. If mbstring.internal_encoding is not set, Latin-1 (ISO-8859-1) should be used.
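
A rough sketch of how the omitted argument behaves, assuming mbstring is loaded and the file is saved as UTF-8. The actual default depends on the PHP version and configuration: as far as I know, from PHP 5.6 onward mbstring.internal_encoding is deprecated and default_charset (UTF-8 by default) is consulted instead, so the sketch sets the internal encoding explicitly rather than relying on a default. The string literal is an illustrative assumption.

```php
<?php
// Assumes mbstring is loaded and this file is saved as UTF-8.
$string = "中华人民共和国";

$previous = mb_internal_encoding(); // whatever this installation defaults to

mb_internal_encoding('UTF-8');
$asUtf8 = mb_substr($string, 0, 2);   // no encoding argument: 2 characters

mb_internal_encoding('ISO-8859-1');
$asLatin1 = mb_substr($string, 0, 2); // 2 "characters" are now 2 bytes

mb_internal_encoding($previous);      // restore the original setting

echo $asUtf8, "\n";           // 中华
echo strlen($asUtf8), "\n";   // 6
echo strlen($asLatin1), "\n"; // 2
```

Passing the encoding explicitly, as the original code does, avoids depending on the server configuration.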
Reference answer
Originally posted by An on 2008-11-7 13:28:
"Remove the 'UTF-8' argument and see..."
In that case, a Chinese character takes up two bytes.
For example: echo $mystring = mb_substr($string, 0, 4); // Result: 中华
echo $mystring = mb_substr($string, 0, 3); // or 5; same result reported
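
Two bytes per character is what one would expect if the string is stored as GBK/GB2312 bytes while mb_substr falls back to a single-byte encoding, so that every byte counts as one "character". The sketch below reproduces that situation under stated assumptions: mbstring with GBK support, a UTF-8 source file, and an illustrative string that is converted to GBK by hand.

```php
<?php
// Assumes mbstring is loaded with GBK support and this file is saved as UTF-8.
$gbk = mb_convert_encoding("中华人民共和国", 'GBK', 'UTF-8'); // 14 bytes

// Under a single-byte encoding, the length argument is effectively a byte count.
$four  = mb_substr($gbk, 0, 4, 'ISO-8859-1'); // 4 bytes = 2 whole characters
$three = mb_substr($gbk, 0, 3, 'ISO-8859-1'); // 3 bytes: the second character is cut in half

echo mb_convert_encoding($four, 'UTF-8', 'GBK'), "\n"; // 中华
echo strlen($three), "\n";                             // 3
echo mb_check_encoding($three, 'GBK') ? "valid GBK" : "broken GBK", "\n";
```

An odd length leaves a dangling half character, and how that is displayed depends on the browser and the page encoding.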
Reference answer
Originally posted by An on 2008-11-7 13:28:
"Remove the 'UTF-8' argument and see..."
Sorry, my mistake: my page is GB2312 - -

After changing the page to UTF-8 and removing 'UTF-8', it is three bytes per Chinese character. Thank you for your help.
Reference answer
Originally posted by a man on 2008-11-7 16:31:
"Sorry, my mistake: my page is GB2312 - -
After changing the page to UTF-8 and removing 'UTF-8', it is three bytes per Chinese character. Thank you for your help."
An encoding can contain characters of several different byte lengths.
For example, GBK contains both double-byte and single-byte characters.
UTF-8 characters could be 1 to 6 bytes long in the original design (the current standard, RFC 3629, allows 1 to 4 bytes).
Because you cannot tell in advance whether a string contains multi-byte or single-byte characters, mb_substr counts characters rather than bytes.

GBK environment:

$string = "Test ab test cd word 0 string";

$mystring = mb_substr($string, 0, 3, 'GBK');

echo $mystring;


This returns 3 characters. The length argument is a number of characters, not a number of bytes.
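
The variable byte widths described above can be checked with a short sketch. It assumes mbstring with GBK support, a UTF-8 source file, and PHP 7 or later for the "\u{...}" escapes. The sample characters and the mixed string "ab中华" are illustrative choices, not taken from the thread.

```php
<?php
// Assumes mbstring is loaded with GBK support and this file is saved as UTF-8.
// Samples: a, é, 中, 😀 (the \u{...} escapes need PHP 7+).
$samples = ["a", "\u{E9}", "中", "\u{1F600}"];

foreach ($samples as $char) {
    // strlen() counts bytes; mb_strlen() counts characters.
    printf("%s: %d byte(s), %d character(s)\n",
        $char, strlen($char), mb_strlen($char, 'UTF-8'));
}
// UTF-8 byte counts: 1, 2, 3 and 4 respectively; each is 1 character.

// In GBK, ASCII letters take 1 byte and Chinese characters take 2.
$mixed = mb_convert_encoding("ab中华", 'GBK', 'UTF-8');
echo strlen($mixed), "\n";           // 6 (1 + 1 + 2 + 2)
echo mb_strlen($mixed, 'GBK'), "\n"; // 4

// Taking 3 characters returns "ab中" regardless of how many bytes that is.
$firstThree = mb_substr($mixed, 0, 3, 'GBK');
echo mb_convert_encoding($firstThree, 'UTF-8', 'GBK'), "\n"; // ab中
```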

Sigh~, what can I say? I see you have posted so many threads about multi-byte character encoding,
and in the end you still don't understand the relationship between characters, bytes and character encodings -.-

Reference answer
Crash -. -
Reference answer
Thank you, I feel enlightened...
