
Detailed explanation of PHP operation strings

韦小宝 (Original)
2018-03-08 16:04:15

PHP has eight data types, and string is one of the most commonly used. We often need to perform operations on PHP strings, and this article looks at how to truncate them safely, especially when they contain multi-byte characters such as Chinese.

1. Using the mb_strimwidth string truncation function, and understanding UTF-8 encoding

<?php
header("Content-type:text/html;charset=utf-8");

// mb_strwidth() returns display width: ASCII characters count 1, Chinese characters count 2
echo mb_strwidth("6", "UTF-8") . '<br />'; // 1
echo mb_strwidth("A", "UTF-8") . '<br />'; // 1
echo mb_strwidth("a", "UTF-8") . '<br />'; // 1
echo mb_strwidth("月", "UTF-8") . '<br />'; // 2

echo mb_strwidth("6月9日OUR系统升级通知", "UTF-8") . '<br />'; // 21

// No encoding passed: the result depends on mbstring's internal encoding,
// and was garbled in the original run (a character was cut in half)
echo mb_strimwidth("6月9日OUR系统升级通知", 0, 10, '...') . '<br />'; // garbled
echo mb_strimwidth("6月9日OUR系统升级通知", 0, 10, '...', "UTF-8") . '<br />'; // 6月9日O...
echo mb_strimwidth("6月9日OUR系统升级通知", 0, 10, '......', "UTF-8") . '<br />'; // 6月9......
echo mb_strimwidth("6月9日OUR系统升级通知", 0, 10, '', "UTF-8") . '<br />'; // 6月9日OUR

?>

From the PHP manual:

mb_strimwidth
(PHP 4 >= 4.0.6, PHP 5)

mb_strimwidth — Get truncated string with specified width

Description
string mb_strimwidth ( string $str , int $start , int $width [, string $trimmarker [, string $encoding ]] )
Truncates string $str to the specified width.

Takeaways

1. This function requires the mbstring (multi-byte string) extension to be loaded.

2. The $trimmarker parameter affects the result: its width counts toward the specified width.

3. The function measures the width a string occupies in a fixed-width font: Chinese characters count as two columns, and other characters (such as ASCII letters and digits) count as one.

4. It is best to pass the encoding parameter explicitly.
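The points above can be collected into a small wrapper. This is only a minimal sketch: the name truncate_width is hypothetical (it does not come from the manual or the original article), and it assumes the source file is saved as UTF-8 and that mbstring is loaded.

```php
<?php
// Hypothetical helper: truncate $text to at most $width display columns.
// The marker is appended only when something was actually cut off,
// and its own width counts toward $width.
function truncate_width(string $text, int $width, string $marker = '...'): string
{
    // Passing 'UTF-8' explicitly avoids depending on mbstring's internal encoding.
    return mb_strimwidth($text, 0, $width, $marker, 'UTF-8');
}

$title = "6月9日OUR系统升级通知";
echo truncate_width($title, 10), "\n";     // 6月9日O...
echo truncate_width($title, 10, ''), "\n"; // 6月9日OUR
echo truncate_width("short", 10), "\n";    // short (fits, so no marker)
```

Fixing the encoding in one place keeps every call site consistent, which is the reason takeaway 4 recommends always passing it.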

2.

mb_substr($string, $start, $length, 'GBK'),
mb_strlen($string, 'GBK'),
mb_strwidth($string, 'GBK')
all accept an encoding argument, so they work with GBK-encoded strings.
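To see how these functions behave on GBK data, the sketch below converts a UTF-8 literal to GBK first. It assumes the source file is saved as UTF-8 and that mbstring is loaded; the variable names are only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal below is UTF-8.
$utf8 = "123中文测试";

// Convert to GBK so that byte counts match a GBK-encoded page.
$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

echo strlen($gbk), "\n";             // 11 bytes: 3 ASCII + 4 Chinese x 2
echo mb_strlen($gbk, 'GBK'), "\n";   // 7 characters
echo mb_strwidth($gbk, 'GBK'), "\n"; // 11 columns: 3 x 1 + 4 x 2

// Offsets are characters when the encoding is given.
$head = mb_substr($gbk, 0, 4, 'GBK');
echo mb_convert_encoding($head, 'UTF-8', 'GBK'), "\n"; // 123中
```

For GBK text the display width happens to equal the byte count, because every Chinese character takes two bytes and two columns. That is not true for UTF-8, where the same characters take three bytes each.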

3.

Examples of the same functions applied to a GBK-encoded test string:
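$test = "123中文测试";

// The results below are those reported in the original article. They appear to
// assume a GBK-encoded source file (11 bytes) and a single-byte mbstring
// internal encoding, so results may differ in other setups.

// Character count
mb_strlen($test, 'GBK');   // 7
mb_strlen($test, 'UTF-8'); // 7

mb_strlen($test);          // 11 (no encoding passed, each byte is counted)

// Display width (for GBK text this equals the byte count)
mb_strwidth($test, 'GBK');   // 11
mb_strwidth($test, 'UTF-8'); // 4
mb_strwidth($test);          // 11

// No encoding passed: offsets behave like byte offsets
mb_substr($test, 0, 4); // garbled (cuts 中 in half)
mb_substr($test, 0, 5); // 123中


// Encoding passed: offsets are character offsets

mb_substr($test, 0, 4, 'GBK'); // 123中
mb_substr($test, 0, 5, 'GBK'); // 123中文

mb_substr($test, 0, 4, 'gb2312'); // 123中
mb_substr($test, 0, 4, 'UTF-8');  // garbled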

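// Truncating strings that mix Chinese and English:
// Method 1: append one character at a time until the width limit is reached

function str_cut($str, $len)
{
    // iconv() takes (from, to, string). The GBK-to-GBK conversion is kept from
    // the original; //TRANSLIT approximates characters that cannot be converted.
    $str = iconv('GBK', 'GBK//TRANSLIT', $str);
    if (mb_strwidth($str, 'GBK') <= $len) {
        return $str;
    }
    $return = '';
    $count = mb_strlen($str, 'GBK');
    for ($i = 0; $i < $count; $i++) {
        $tmp = mb_substr($str, $i, 1, 'GBK');
        if (mb_strwidth($return . $tmp, 'GBK') > $len) {
            break;
        }
        $return .= $tmp;
    }
    return $return;
}

// Method 2: grow a prefix and step back one character once it is too wide
// (renamed to str_cut2 so both versions can be declared in the same file)

function str_cut2($str, $len)
{
    $str = iconv('GBK', 'GBK//TRANSLIT', $str);
    if (mb_strwidth($str, 'GBK') <= $len) {
        return $str;
    }
    $return = '';
    $count = mb_strlen($str, 'GBK');
    for ($i = 0; $i < $count; $i++) {
        $return = mb_substr($str, 0, $i, 'GBK');
        if (mb_strwidth($return, 'GBK') > $len) {
            // The prefix of $i characters is too wide, so drop the last one
            $return = mb_substr($str, 0, $i - 1, 'GBK');
            break;
        }
    }
    return $return;
}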

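/*
Detecting Chinese characters depends on the encoding: GBK uses two bytes per
Chinese character and UTF-8 uses three, so Chinese text can be identified by range.

Encoding ranges
1. GBK (GB2312/GB18030)
\x00-\xff  GBK double-byte encoding range
\x20-\x7f  ASCII
\xa1-\xff  Chinese
\x80-\xff  Chinese
2. UTF-8 (Unicode)
\u4e00-\u9fa5  (Chinese)
\x3130-\x318F  (Korean)
\xAC00-\xD7A3  (Korean)
\u0800-\u4e00  (Japanese)
PS: Korean characters are those above [\u9fa5]
*/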
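// II. Code examples

// Substring of a GBK string (PHP); $len counts characters
function gb_substr($str, $len)
{
    $count = 0;
    for ($i = 0; $i < strlen($str); $i++) {
        if ($count == $len) break;
        // A byte in \x80-\xff starts a two-byte character, so skip its second byte
        if (preg_match("/[\x80-\xff]/", substr($str, $i, 1))) ++$i;
        ++$count;
    }
    return substr($str, 0, $i);
}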

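// Variant where $len is a byte limit: the cut is extended by one byte when it
// would otherwise fall in the middle of a two-byte character
function substrGb($str, $len)
{
    $ret = '';
    $i = 0;
    while ($i < $len) {
        $ch = substr($str, $i, 1);
        if (ord($ch) > 0x80) {
            // Lead byte of a two-byte character: skip its second byte as well
            $i++;
        }
        $i++;
    }
    $ret = substr($str, 0, $i);
    return $ret;
}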

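// Substring of a UTF-8 string (PHP)
// Note: a multi-byte character advances $count by 2 and an ASCII character by 1,
// so $position and $length are measured in display-width units, not characters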
function utf8_substr($str, $position, $length)
{
    $start_position = strlen($str);
    $start_byte = 0;
    $end_position = strlen($str);
    $count = 0;
    for ($i = 0; $i < strlen($str); $i++) {
        if ($count >= $position && $start_position > $i) {
            $start_position = $i;
            $start_byte = $count;
        }
        if (($count - $start_byte) >= $length) {
            $end_position = $i;
            break;
        }
        $value = ord($str[$i]);
        if ($value > 127) {
            $count++;
            if ($value >= 192 && $value <= 223) $i++;
            elseif ($value >= 224 && $value <= 239) $i = $i + 2;
            elseif ($value >= 240 && $value <= 247) $i = $i + 3;
            else die('Not a UTF-8 compatible string');
        }
        $count++;
    }
    return (substr($str, $start_position, $end_position - $start_position));
}

// int ord ( string $string ): returns the ASCII (byte) value of the first character
// string chr ( int $ascii ): returns the character for the given ASCII code
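As a shorter alternative to the hand-written byte loops above, a character-based UTF-8 substring can also be sketched with PCRE alone. The function name utf8_chars_substr is hypothetical and not part of the original article. Unlike utf8_substr above, it counts whole characters rather than width units, and it assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper: character-based substring for UTF-8 text using only
// PCRE, for environments where mbstring is not available.
function utf8_chars_substr(string $str, int $start, int $length): string
{
    // With the /u modifier "." matches one whole UTF-8 character;
    // the /s modifier lets it match newlines as well.
    if (preg_match_all('/./us', $str, $matches) === false) {
        return ''; // input was not valid UTF-8
    }
    // $matches[0] is the list of characters; slice it and join it back up.
    return implode('', array_slice($matches[0], $start, $length));
}

echo utf8_chars_substr("123中文测试", 0, 4), "\n"; // 123中
echo utf8_chars_substr("123中文测试", 3, 2), "\n"; // 中文
```

Splitting the whole string into an array costs memory on long inputs, so this suits short strings such as titles. Where mbstring is available, mb_substr with an explicit encoding remains the simpler choice.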

That’s all for PHP string operations. You can test it to find out!
