
PHP can easily truncate mixed Chinese and English strings with just 2 lines of code!

WBOY
2016-07-13 10:29:19

When it comes to counting and truncating strings that mix Chinese and English, the first things that come to mind are ASCII codes, hexadecimal byte values, regular expression matching, and counting in a loop.
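For comparison, the loop-counting approach might look roughly like the sketch below. The function name manual_width and the rule "single-byte characters count as 1, everything else counts as 2" are assumptions made for illustration; they are not from the original article, and the rule is cruder than what the mb extension does.

```php
<?php
// Hypothetical hand-rolled width counter, shown only for contrast.
// Assumption: the input is UTF-8; 1-byte characters count as 1, all others as 2.
function manual_width(string $str): int
{
    // The "u" flag makes PCRE treat the string as UTF-8, so an empty pattern
    // splits it into whole characters instead of single bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return 0; // invalid UTF-8: give up in this sketch
    }

    $width = 0;
    foreach ($chars as $char) {
        $width += strlen($char) === 1 ? 1 : 2;
    }
    return $width;
}

echo manual_width('aa啊'), "\n"; // 4
```

Even this short version needs a regular expression, a loop, and a byte-length check per character, and it still does not truncate anything.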

Today I will share PHP's mb (mbstring) extension with you, which makes this kind of string handling easy.


Let me introduce the functions used:

mb_strwidth($str, $encoding) returns the display width of the string; an English letter counts as 1 and a Chinese character counts as 2

$str The string to measure

$encoding The encoding to use, such as utf8 or gbk

mb_strimwidth($str, $start, $width, $tail, $encoding) truncates the string to a given width

$str The string to truncate

$start The character position to start from, usually 0 (the start of the string)

$width The maximum width of the result, including the width of $tail

$tail The string appended after the truncated string, commonly ...

$encoding The encoding to use


I will give you an example below:

<?php
/**
 * UTF-8 encoding:
 * 1 Chinese character takes 3 bytes.
 * What we want is for 1 Chinese character to count as 2,
 * because on screen 2 English letters take up about as much room as 1 Chinese character.
 */

// Test string
$str = 'aaaa啊啊aaaa啊啊啊aaa';
echo strlen($str); // strlen alone reports 26 bytes (11 letters + 5 Chinese characters x 3 bytes)

// Specify the encoding explicitly; otherwise PHP's internal encoding is used,
// which you can check with mb_internal_encoding().
// With UTF-8, mb_strwidth reports a width of 21 (11 + 5 x 2).
echo mb_strwidth($str, 'utf8');

// Truncate only when the width is greater than 10
if (mb_strwidth($str, 'utf8') > 10) {
    // Start at 0, keep a width of 10, append '...', using UTF-8.
    // Note that the appended '...' also counts toward the width.
    $str = mb_strimwidth($str, 0, 10, '...', 'utf8');
}

// Final output: aaaa啊...  4 a's count as 4, 1 啊 counts as 2, 3 dots count as 3: 4+2+3=9
// Simple, isn't it? Some will ask why the width is 9 and not 10.
// The character after 啊 is another 啊, which counts as 2; 9+2=11 exceeds the limit, so it is dropped and the width is 9.
echo $str;
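The check and the truncation above can be wrapped into a small reusable function. This is a minimal sketch: the name truncate_width, its parameters, and the default marker are assumptions chosen for illustration, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper (not part of mbstring): shorten $text so that its
// display width, including the marker, does not exceed $maxWidth.
function truncate_width(string $text, int $maxWidth, string $marker = '...'): string
{
    // Strings that already fit are returned unchanged.
    if (mb_strwidth($text, 'UTF-8') <= $maxWidth) {
        return $text;
    }
    // The marker's own width is counted inside $maxWidth.
    return mb_strimwidth($text, 0, $maxWidth, $marker, 'UTF-8');
}

echo truncate_width('aaaa啊啊aaaa啊啊啊aaa', 10), "\n"; // aaaa啊...
echo truncate_width('aa啊', 10), "\n";                   // aa啊
```

Keeping the encoding in one place means callers cannot forget it and fall back to the internal encoding by accident.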


Let me introduce a few other functions:

mb_strlen($str, $encoding) returns the length of the string in characters

$str The string to measure

$encoding The encoding to use

mb_substr($str, $start, $length, $encoding) extracts part of the string

$str The string to extract from

$start The character position to start from

$length The number of characters to extract

$encoding The encoding to use

These two functions are very similar to strlen() and substr(). The difference is that you can set the encoding, so they count characters instead of bytes.


Example below:

<?php
/**
 * UTF-8 encoding:
 * 1 Chinese character takes 3 bytes.
 */
$str = 'aa12啊aa';
echo strlen($str); // Plain strlen outputs a length of 9

// Outputs a length of 7. Why 7?
// Once the encoding is set, every character counts as 1, Chinese or English.
// a a 1 2 啊 a a
// 1+1+1+1+1+1+1 = 7
// Exactly 7 characters.
echo mb_strlen($str, 'utf8');

// mb_substr works the same way.
// Now I only want 5 characters.
echo mb_substr($str, 0, 5, 'utf8'); // Outputs aa12啊
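To see the three ways of measuring side by side, and why the byte-based substr() is risky on such strings, here is a short sketch using the same test string. It assumes the source file and the string are UTF-8; the variable names are only for illustration.

```php
<?php
$str = 'aa12啊aa';

echo strlen($str), "\n";               // 9 (bytes)
echo mb_strlen($str, 'UTF-8'), "\n";   // 7 (characters)
echo mb_strwidth($str, 'UTF-8'), "\n"; // 8 (display width: 6 + 2)

// substr() counts bytes, so 5 bytes is 'aa12' plus only the first byte of 啊.
$byBytes = substr($str, 0, 5);
var_dump(mb_check_encoding($byBytes, 'UTF-8')); // bool(false): broken character

// mb_substr() counts characters, so the result is still valid UTF-8.
$byChars = mb_substr($str, 0, 5, 'UTF-8');
var_dump(mb_check_encoding($byChars, 'UTF-8')); // bool(true)
```

A string cut in the middle of a character usually shows up as a garbled symbol at the end of the output, which is the typical sign that a byte-based function was used on multi-byte text.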


The mb extension has many other useful functions, which I won't list here.

If you are interested, see the official manual:

http://www.php.net/manual/zh/ref.mbstring.php

Okay, that’s all for today.
