
(Practical) PHP functions for calculating the length of Chinese strings and extracting substrings from them

黄舟 (Original)
2017-02-06 15:14:56

PHP has dedicated functions, mb_substr and mb_strlen, for taking substrings of Chinese text and counting its characters. However, they belong to the mbstring extension rather than the PHP core, so they are often not enabled. On your own server, you only need to enable the extension in php.ini. On shared hosting where the server does not enable it, you have to write replacement functions yourself.
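To see why the byte-oriented strlen() is not enough, and to check whether mbstring is available on a given host, a minimal sketch like the following may help. It assumes the source file is saved as UTF-8; the variable names are illustrative and not part of the original article.

```php
<?php
// Assumption: this file is saved as UTF-8, so each Chinese character is 3 bytes.
$text = '中文ab';

// strlen() counts bytes: 3 + 3 + 1 + 1 = 8.
$bytes = strlen($text);

// With the /u modifier PCRE treats the subject as UTF-8, so "." matches one
// whole character. preg_match_all() returns the number of matches: 4.
$chars = preg_match_all('/./us', $text, $m);

// function_exists() tells us whether the mbstring functions can be called.
$hasMbstring = function_exists('mb_strlen');

echo "bytes: $bytes, characters: $chars\n";
echo $hasMbstring ? "mbstring is enabled\n" : "mbstring is not enabled\n";
```

The regular-expression count is the same idea the fallback branches in this article rely on when mbstring is missing.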

The functions below are easy to use, but note that they only work on UTF-8 strings.

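<?php
header('Content-type: text/html; charset=utf-8');

/**
 * Counts the characters in a string that may contain Chinese text.
 *
 * @param string $str  The string to measure (must be UTF-8).
 * @param int    $type 0 (default): a Chinese character counts as one;
 *                     1: a Chinese character counts as two.
 * @return int
 */
function abslength($str, $type = 0)
{
    // Compare against '' rather than using empty(), which would also reject '0'.
    if ($str === null || $str === '') {
        return 0;
    }
    if (function_exists('mb_strlen')) {
        $chars = mb_strlen($str, 'utf-8');
    } else {
        // /u treats the string as UTF-8; /s lets "." match newlines too.
        $chars = preg_match_all('/./us', $str, $ar);
    }
    if ($type == 1) {
        // Count every non-ASCII character a second time.
        $chars += preg_match_all('/[^\x00-\x7f]/u', $str, $wide);
    }
    return $chars;
}

$str = '我们都是中国人啊,ye!';
$len = abslength($str);
var_dump($len); // int(12)
$len = abslength($str, 1);
echo '<br />' . $len; // 20: the 8 Chinese characters count twice, the 4 ASCII characters once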
/**
 * Extracts part of a UTF-8 string; the parameters follow substr().
 *
 * @param string   $str    The string to cut.
 * @param int      $start  Start position; a negative value counts from the end.
 * @param int|null $length Number of characters to return; null means "to the end".
 * @return string|false
 */
function utf8_substr($str, $start = 0, $length = null)
{
    // Compare against '' rather than using empty(), which would also reject '0'.
    if ($str === null || $str === '') {
        return false;
    }
    if (function_exists('mb_substr')) {
        // Since PHP 5.4.8 a null length means "up to the end of the string".
        return mb_substr($str, $start, $length, 'utf-8');
    }
    // Fallback: split into characters (/u = UTF-8, /s = "." matches newlines).
    preg_match_all('/./us', $str, $ar);
    return join('', array_slice($ar[0], $start, $length));
}

$str2 = 'wo要截取zhongwen';
echo '<br />';
echo utf8_substr($str2, 0, -4); // wo要截取zhon
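
Because both helpers above assume UTF-8 input, it can be worth checking the encoding before calling them: with the /u modifier, the preg functions fail instead of matching when the subject is not valid UTF-8. The following is a minimal sketch of such a guard; is_valid_utf8 is a hypothetical helper name and is not part of the original article.

```php
<?php
/**
 * Hypothetical helper: returns true when $str is valid UTF-8.
 * preg_match() with an empty /u pattern returns 1 on valid UTF-8
 * and false when the subject contains invalid byte sequences.
 */
function is_valid_utf8($str)
{
    return preg_match('//u', $str) === 1;
}

// Assumption: this source file is saved as UTF-8.
$utf8 = '中文';
// The same two characters encoded as GBK, written as raw bytes.
$gbk = "\xd6\xd0\xce\xc4";

var_dump(is_valid_utf8($utf8)); // bool(true)
var_dump(is_valid_utf8($gbk));  // bool(false)
```

If the check fails, the string would need to be converted to UTF-8 first, or handled with an encoding-aware pattern.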

A Chinese substring function that supports gb2312, gbk, utf-8, and big5

<?php
/**
 * Substring for Chinese text; supports gb2312, gbk, utf-8, and big5.
 *
 * @param string $str     The string to cut.
 * @param int    $start   Start position (non-negative).
 * @param int    $length  Number of characters to keep (non-negative).
 * @param string $charset Encoding: utf-8 | gb2312 | gbk | big5.
 * @param bool   $suffix  Whether to append an ellipsis when text is cut off.
 * @return string
 */
function csubstr($str, $start, $length, $charset = "utf-8", $suffix = true)
{
    if (function_exists("mb_substr")) {
        $total = mb_strlen($str, $charset);
        $slice = mb_substr($str, $start, $length, $charset);
    } else {
        // One alternative per character width in each encoding.
        $re['utf-8'] = "/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/";
        $re['gb2312'] = "/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/";
        $re['gbk'] = "/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/";
        // The second trail-byte range needs its own brackets: [\xa1-\xfe].
        $re['big5'] = "/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/";
        preg_match_all($re[$charset], $str, $match);
        $total = count($match[0]);
        $slice = join("", array_slice($match[0], $start, $length));
    }
    // Append the ellipsis only when characters remain after the slice.
    if ($suffix && $start + $length < $total) {
        return $slice . "…";
    }
    return $slice;
}
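
To illustrate how the byte-range patterns in the fallback branch split a non-UTF-8 string into characters, here is a small sketch that applies the gbk pattern to a GBK-encoded sample. The sample string and variable names are illustrative assumptions; the bytes are written as hex escapes so the example does not depend on the encoding of the source file.

```php
<?php
// The gbk pattern: one ASCII byte, or a lead byte (0x81-0xfe)
// followed by a trail byte (0x40-0xfe).
$pattern = '/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/';

// "a中b文" encoded as GBK: 中 is D6 D0, 文 is CE C4.
$gbk = "a\xd6\xd0b\xce\xc4";

preg_match_all($pattern, $gbk, $match);

// Print each matched character as hex to make the byte grouping visible.
$hex = array_map('bin2hex', $match[0]);
echo implode(' ', $hex), "\n"; // 61 d6d0 62 cec4

// Taking the first two characters is then a plain array_slice().
$firstTwo = join('', array_slice($match[0], 0, 2));
echo bin2hex($firstTwo), "\n"; // 61d6d0
```

Each two-byte Chinese character stays intact, which is what keeps the slice from cutting a character in half.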
