PHP substring functions compatible with GBK, UTF-8, and other encodings
PHP's built-in substr function works on bytes, so it only truncates plain English (single-byte) text safely. If the string contains Chinese characters, the cut can land in the middle of a multibyte character and produce garbled output. Below are two string truncation functions: the first handles several encodings, including GBK and UTF-8, and the second handles UTF-8 only.
Example 1
function CsubStrPro($str, $start, $length, $charset = "utf-8", $suffix = false)
{
    if (function_exists("mb_substr")) {
        // Prefer the mbstring extension when it is available.
        $slice = mb_substr($str, $start, $length, $charset);
    } else {
        // Fallback: one byte pattern per charset, each matching a single character.
        $re['utf-8']  = "/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/";
        $re['gb2312'] = "/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/";
        $re['gbk']    = "/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/";
        $re['big5']   = "/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/";
        // Split the string into characters, then slice the character array.
        preg_match_all($re[$charset], $str, $match);
        $slice = join("", array_slice($match[0], $start, $length));
    }
    // Append the ellipsis on both code paths when requested.
    return $suffix ? $slice . "…" : $slice;
}
Example 2
function subString_UTF8($str, $start, $lenth)
{
    $len = strlen($str);
    $r = array(); // collected characters
    $n = 0;       // characters skipped so far
    $m = 0;       // characters collected so far
    for ($i = 0; $i < $len; $i++) {
        // Binary form of the current byte, left-padded to 8 bits.
        $a = str_pad(decbin(ord($str[$i])), 8, '0', STR_PAD_LEFT);
        // Sequence length from the lead byte: 0xxxxxxx, 110xxxxx, 1110xxxx, 11110xxx.
        if (substr($a, 0, 1) === '0') {
            $bytes = 1;
        } elseif (substr($a, 0, 3) === '110') {
            $bytes = 2;
        } elseif (substr($a, 0, 4) === '1110') {
            $bytes = 3;
        } elseif (substr($a, 0, 5) === '11110') {
            $bytes = 4;
        } else {
            $bytes = 0; // stray continuation byte or invalid lead byte
        }
        if ($n < $start) {
            $n++; // still before the requested start position
        } else {
            $r[] = $bytes > 0 ? substr($str, $i, $bytes) : '';
            if (++$m >= $lenth) {
                break;
            }
        }
        // Jump over the continuation bytes of a multibyte character.
        if ($bytes > 1) {
            $i += $bytes - 1;
        }
    }
    return $r;
} // End subString_UTF8
Because this function returns an array, combine it with the join function to get a displayable string:
join('', subString_UTF8($str, $start, $lenth));
When displaying the result on a page, you can also append "..." after this statement.