Home  >  Article  >  Backend Development  >  Sharing of high-performance Chinese string interception functions in PHP_PHP tutorial

Sharing of high-performance Chinese string interception functions in PHP_PHP tutorial

WBOY
WBOYOriginal
2016-07-20 11:10:27733browse

This uses a string interception function in phpwind. It supports character interception in multiple encodings such as gbk, gbk2312, utf-8, etc. It also supports the processing of Chinese strings very efficiently.

The code is as follows
 代码如下 复制代码


function substrs($content,$length,$add='Y'){
if (strlen($content)>$length) {
if ($GLOBALS['db_charset']!='utf-8') {
$retstr = '';
for ($i=0;$i<$length-2;$i++) {
$retstr .= ord($content[$i]) > 127 ? $content[$i].$content[++$i] : $content[$i];
}
return $retstr.($add=='Y' ? ' ..' : '');
}
return utf8_trim(substr($content,0,$length)).($add=='Y' ? ' ..' : '');
}
return $content;
}
function utf8_trim($str) {
$hex = '';
$len = strlen($str)-1;
for ($i=$len;$i>=0;$i-=1) {
$ch = ord($str[$i]);
$hex .= " $ch";
if (($ch & 128)==0 || ($ch & 192)==192) {
return substr($str,0,$i);
}
}
return $str.$hex;
}

function cutstr($string, $length, $dot = ' ...') {
global $charset;
if(strlen($string) <= $length) {
return $string;
}
$string = str_replace(array('&', '"', '<', '>'), array('&', '"', '<', '>'), $string);
$strcut = '';
if(strtolower($charset) == 'utf-8') {
$n = $tn = $noc = 0;
while($n < strlen($string)) {
$t = ord($string[$n]);
if($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
$tn = 1; $n++; $noc++;
} elseif(194 <= $t && $t <= 223) {
$tn = 2; $n += 2; $noc += 2;
} elseif(224 <= $t && $t <= 239) {
$tn = 3; $n += 3; $noc += 2;
} elseif(240 <= $t && $t <= 247) {
$tn = 4; $n += 4; $noc += 2;
} elseif(248 <= $t && $t <= 251) {
$tn = 5; $n += 5; $noc += 2;
} elseif($t == 252 || $t == 253) {
$tn = 6; $n += 6; $noc += 2;
} else {
$n++;
}
if($noc >= $length) {
break;
}
}
if($noc > $length) {
$n -= $tn;
}
$strcut = substr($string, 0, $n);
} else {
for($i = 0; $i < $length; $i++) {
$strcut .= ord($string[$i]) > 127 ? $string[$i].$string[++$i] : $string[$i];
}
}
$strcut = str_replace(array('&', '"', '<', '>'), array('&', '"', '<', '>'), $strcut);
return $strcut.$dot;
}

Copy code

function substrs($content,$length,$add='Y'){
if (strlen($content)>$length) {
if ($GLOBALS ['db_charset']!='utf-8') {
$retstr = '';
for ($i=0;$i<$length-2;$i++) {
$retstr .= ord($content[$i]) > 127 ? $content[$i].$content[++$i] : $content[$i];
}
return $retstr.($add=='Y' ? ' ..' : '');
}
return utf8_trim(substr($content,0,$length)).($add= ='Y' ? ' ..' : '');
}
return $content;
}
function utf8_trim($str) {
$hex = '';
$len = strlen($str)-1;
for ($i=$len;$i>=0;$i-=1) {
$ch = ord($str[$i]);
$hex .= " $ch";
if (($ch & 128)==0 || ($ch & 192)==192) {
return substr($str,0,$i);
}
}
return $str.$hex;
}
function cutstr($string, $length, $dot = ' ...') {
global $charset;
if(strlen($string) <= $length) {
return $ string;
}
$string = str_replace(array('&', '"', '<', '>'), array('&', '"', '< ', '>'), $string);
$strcut = '';
if(strtolower($charset) == 'utf-8') {
$n = $ tn = $noc = 0;
while($n < strlen($string)) {
$t = ord($string[$n]);
if($t = = 9 || $t == 10 || (32 <= $t && $t <= 126)) {
$tn = 1; $n++; $noc++;
} elseif( 194 <= $t && $t <= 223) {
$tn = 2; $n += 2; $noc += 2;
} elseif(224 <= $t && $t <= 239) {
$tn = 3; $n += 3; $noc += 2;
} elseif(240 <= $t && $t <= 247) {
$tn = 4; $n += 4; $noc += 2;
} elseif(248 <= $t && $t <= 251) {
$tn = 5; $n += 5; $noc += 2;
} elseif($t == 252 || $t == 253) {
$tn = 6; $n += 6 ; $noc += 2;
} else {
$n++;
}
if($noc >= $length) {
break;
}
}
if($noc > $length) {
$n -= $tn;
}
$strcut = substr($string, 0 , $n);
} else {
for($i = 0; $i < $length; $i++) {
$strcut .= ord($string[$i] ) > 127 ? $string[$i].$string[++$i] : $string[$i];
}
}
$strcut = str_replace(array(' &', '"', '<', '>'), array('&', '"', '<', '>'), $strcut);
return $strcut .$dot;
}

http://www.bkjia.com/PHPjc/444721.htmlwww.bkjia.com
true
http: //www.bkjia.com/PHPjc/444721.html
TechArticleThis is a string interception function in phpwind, which supports gbk, gbk2312, utf-8, etc. This coded character interception also supports the processing of Chinese strings very efficiently. The code is as follows...
Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn