
PHP functions to calculate the length of a Chinese string and extract a Chinese substring

WBOY
2016-07-25 09:04:17

1. Calculating the length of a UTF-8 Chinese string and extracting a substring

  1. header('Content-type:text/html;charset=utf-8');
  2. /**
  3. * Function to count the length of Chinese strings
  4. * @param $str The string to calculate the length
  5. * @param $type Calculation length type, 0 (default) means one Chinese character is counted as one character, 1 means one Chinese character is counted as two characters
  6. * @http://bbs.it-home.org
  7. *
  8. */
  9. function abslength($str)
  10. {
  11. if( empty($str)){
  12. return 0;
  13. }
  14. if(function_exists('mb_strlen')){
  15. return mb_strlen($str,'utf-8');
  16. }
  17. else {
  18. preg_match_all("/./ u", $str, $ar);
  19. return count($ar[0]);
  20. }
  21. }
  22. $str = 'Script Academy welcomes everyone, ye! ';
  23. $len = abslength($str);
  24. var_dump($len); //return 12
  25. $len = abslength($str,'1');
  26. echo '
    '.$len ; //return 22
  27. /*
  28. utf-8 encoding to intercept Chinese strings, the parameters can refer to the substr function
  29. @param $str The string to be intercepted
  30. @param $start The starting position to be intercepted, negative numbers are inverse To intercept
  31. @param $end The length to be intercepted
  32. */
  33. function utf8_substr($str,$start=0) {
  34. if(empty($str)){
  35. return false;
  36. }
  37. if (function_exists(' mb_substr')){
  38. if(func_num_args() >= 3) {
  39. $end = func_get_arg(2);
  40. return mb_substr($str,$start,$end,'utf-8');
  41. }
  42. else {
  43. mb_internal_encoding("UTF-8");
  44. return mb_substr($str,$start);
  45. }
  46. }
  47. else {
  48. $null = "";
  49. preg_match_all("/./u", $str, $ ar);
  50. if(func_num_args() >= 3) {
  51. $end = func_get_arg(2);
  52. return join($null, array_slice($ar[0],$start,$end));
  53. }
  54. else {
  55. return join($null, array_slice($ar[0],$start));
  56. }
  57. }
  58. }
  59. $str2 = 'wo wants to intercept zhongwen';
  60. echo '
    ';
  61. echo utf8_substr($str2,0,-4); //return wo want to intercept zhon
  62. ?>

2. A Chinese substring function that supports gb2312, gbk, utf-8, and big5

    <?php
    /*
     * Chinese substring extraction; supports gb2312, gbk, utf-8, big5
     * bbs.it-home.org
     * @param string $str     The string to extract from
     * @param int    $start   Start position
     * @param int    $length  Number of characters to extract
     * @param string $charset Encoding: utf-8|gb2312|gbk|big5
     * @param bool   $suffix  Whether to append "…" to a truncated result
     */
    // Written as a plain function so it runs on its own; restore "public" inside a class.
    // $start has no default because the required $length follows it.
    function csubstr($str, $start, $length, $charset = "utf-8", $suffix = true)
    {
        if (function_exists("mb_substr")) {
            // Short enough already: return unchanged, without a suffix.
            if (mb_strlen($str, $charset) <= $length) return $str;
            $slice = mb_substr($str, $start, $length, $charset);
        } else {
            // Byte patterns: each match is one ASCII byte or one multi-byte character.
            $re['utf-8'] = "/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/";
            $re['gb2312'] = "/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/";
            $re['gbk'] = "/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/";
            $re['big5'] = "/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/";
            preg_match_all($re[$charset], $str, $match);
            if (count($match[0]) <= $length) return $str;
            $slice = join("", array_slice($match[0], $start, $length));
        }
        if ($suffix) return $slice . "…";
        return $slice;
    }
    ?>

