
Fixing garbled Chinese characters when truncating strings in PHP Smarty (GB2312/UTF-8)

WBOY
2016-07-21 15:23:50

Displaying web pages almost always involves truncating strings. Smarty's truncate modifier comes in handy here, but it is only suitable for English text. With Chinese text, truncate can produce garbled characters, and for mixed Chinese and English strings, cutting the same number of characters yields different display widths, which looks uneven and unattractive. This is because one Chinese character is roughly as wide as two English characters. In addition, truncate does not handle GB2312, UTF-8, and other encodings at the same time.
An improved version, smartTruncate, follows. File name: modifier.smartTruncate.php

The code is as follows:

<?php
/**
 * Return true if $string is valid UTF-8.
 * Results are cached per string, keyed by its MD5 hash.
 */
function smartDetectUTF8($string)
{
    static $result = array();
    if (!array_key_exists($key = md5($string), $result))
    {
        $utf8 = '/^(?:
              [\x09\x0A\x0D\x20-\x7E]            # ASCII
            | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
            | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
            | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
            | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
            | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
            | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
            | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
        )+$/xs';
        $result[$key] = (bool) preg_match($utf8, $string);
    }
    return $result[$key];
}
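/**
 * Display length of $string: a multi-byte character counts 1.0, ASCII 0.5.
 * Assumes every non-ASCII character takes 3 bytes in UTF-8
 * and 2 bytes otherwise (GB2312).
 */
function smartStrlen($string)
{
    $result = 0;
    $number = smartDetectUTF8($string) ? 3 : 2;
    for ($i = 0; $i < strlen($string); $i += $bytes)
    {
        $bytes = ord(substr($string, $i, 1)) > 127 ? $number : 1;
        $result += $bytes > 1 ? 1.0 : 0.5;
    }
    return $result;
}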
/**
 * Substring measured in display units (multi-byte character 1.0, ASCII 0.5).
 * A negative $start counts from the end of the string.
 */
function smartSubstr($string, $start, $length = null)
{
    $result = '';
    $number = smartDetectUTF8($string) ? 3 : 2;
    if ($start < 0)
    {
        $start = max(smartStrlen($string) + $start, 0);
    }
    // Skip forward until $start display units have been consumed.
    for ($i = 0; $i < strlen($string); $i += $bytes)
    {
        if ($start <= 0)
        {
            break;
        }
        $bytes = ord(substr($string, $i, 1)) > 127 ? $number : 1;
        $start -= $bytes > 1 ? 1.0 : 0.5;
    }
    if (is_null($length))
    {
        $result = substr($string, $i);
    }
    else
    {
        // Copy whole characters until $length display units are used up.
        for ($j = $i; $j < strlen($string); $j += $bytes)
        {
            if ($length <= 0)
            {
                break;
            }
            if (($bytes = ord(substr($string, $j, 1)) > 127 ? $number : 1) > 1)
            {
                // Not enough room left for a full-width character.
                if ($length < 1.0)
                {
                    break;
                }
                $result .= substr($string, $j, $bytes);
                $length -= 1.0;
            }
            else
            {
                $result .= substr($string, $j, 1);
                $length -= 0.5;
            }
        }
    }
    return $result;
}
/**
 * Smarty modifier: truncate $string to $length display units, appending $etc.
 */
function smarty_modifier_smartTruncate($string, $length = 80, $etc = '...',
    $break_words = false, $middle = false)
{
    if ($length == 0)
        return '';
    if (smartStrlen($string) > $length) {
        $length -= smartStrlen($etc);
        if (!$break_words && !$middle) {
            // Drop a trailing partial word so the cut falls on whitespace.
            $string = preg_replace('/\s+?(\S+)?$/', '', smartSubstr($string, 0, $length + 1));
        }
        if (!$middle) {
            return smartSubstr($string, 0, $length) . $etc;
        } else {
            return smartSubstr($string, 0, $length / 2) . $etc . smartSubstr($string, -$length / 2);
        }
    } else {
        return $string;
    }
}
?>


The code above implements all of the functionality of the original truncate and is compatible with both GB2312 and UTF-8 encodings. When measuring length, a Chinese character counts as 1.0 and an English character as 0.5, so truncated substrings no longer look uneven.
There is nothing special about how to use the plug-in. Here is a simple test:
{$content|smartTruncate:5:".."} ($content is equal to "A China B China C People's D People's Republic of China F and G country H")
Display: A Chinese B Chinese C.. (The length of Chinese symbols is counted as 1.0, the length of English symbols is counted as 0.5, and the length of omitted symbols is considered)
Whether you use GB2312 or UTF-8 encoding, you will find that the result is correct, which is one of the reasons I put the word "smart" in the plug-in name.
