PHP English string truncation code (to ensure word integrity)

WBOY
2016-07-25 08:53:04
/**
 * Complete-word truncation: cut a string by byte offsets without
 * splitting a multibyte UTF-8 character.
 * Edited and organized by bbs.it-home.org
 * Note: this is a static method and must be placed inside a class.
 *
 * @param string   $str    Input string (UTF-8)
 * @param int      $start  Byte offset; a negative value counts from the end
 * @param int|null $length Number of bytes to take; null means up to the end
 *
 * @return string
 */
public static function usubstr($str, $start, $length = null)
{
    $strlen = strlen($str);
    // Normalize a negative start to an offset from the beginning, as substr() does.
    if ($start < 0) {
        $start = max(0, $strlen + $start);
    }
    // Normalize the length: null means "to the end", negative means "stop before the end".
    if ($length === null) {
        $length = max(0, $strlen - $start);
    } elseif ($length < 0) {
        $length = max(0, $strlen - $start + $length);
    }
    // Truncate by bytes first.
    $res = (string) substr($str, $start, $length);
    // Take up to 6 bytes immediately after the cut (empty if the cut reaches the end).
    $next_start = min($strlen, $start + $length);
    $next_segm = (string) substr($str, $next_start, 6);
    // Take up to 6 bytes immediately before the start.
    $prev_start = max(0, $start - 6);
    $prev_segm = (string) substr($str, $prev_start, $start - $prev_start);
    // If the cut fell inside a character, the following bytes are UTF-8
    // continuation bytes (0x80-0xBF); append them to complete the character.
    if (preg_match('@^([\x80-\xBF]{0,5})[\xC0-\xFD]?@', $next_segm, $bytes)) {
        if (!empty($bytes[1])) {
            $res .= $bytes[1];
        }
    }
    // If the result begins with a continuation byte, the start fell inside a
    // character; prepend its lead byte and any continuation bytes before the start.
    if ($res !== '') {
        $ord0 = ord($res[0]);
        if (128 <= $ord0 && 191 >= $ord0) {
            if (preg_match('@[\xC0-\xFD][\x80-\xBF]{0,5}$@', $prev_segm, $bytes)) {
                $res = $bytes[0] . $res;
            }
        }
    }
    // Mark the result as truncated when it is shorter than the input.
    if (strlen($res) < $strlen) {
        $res = $res . '...';
    }
    return $res;
}
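The boundary check that usubstr() above relies on can be illustrated in isolation. The following is only a minimal sketch, not the original author's code: the helper names `is_continuation_byte` and `cut_on_boundaries` are hypothetical, and it assumes valid UTF-8 input, a non-negative `$start`, and a non-negative `$length`. It widens a byte cut in both directions until neither end sits on a continuation byte (a byte of the form 10xxxxxx).

```php
<?php
// Hypothetical helper: true when $byte is a UTF-8 continuation byte (10xxxxxx).
function is_continuation_byte(string $byte): bool
{
    return (ord($byte) & 0xC0) === 0x80;
}

// Hypothetical helper: cut $length bytes starting at byte offset $start, then
// widen the cut so that both ends fall on character boundaries.
// Assumes valid UTF-8, $start >= 0 and $length >= 0.
function cut_on_boundaries(string $str, int $start, int $length): string
{
    $strlen = strlen($str);
    $start = min($start, $strlen);
    $end = min($strlen, $start + $length);
    // Move the start back to the lead byte of the character it landed in.
    while ($start > 0 && $start < $strlen && is_continuation_byte($str[$start])) {
        $start--;
    }
    // Move the end forward past any continuation bytes left behind by the cut.
    while ($end < $strlen && is_continuation_byte($str[$end])) {
        $end++;
    }
    return substr($str, $start, $end - $start);
}

// "héllo" written with explicit bytes: "é" is the two-byte sequence C3 A9.
$s = "h\xC3\xA9llo";

echo cut_on_boundaries($s, 0, 2), "\n"; // hé  (plain substr() would return "h" plus half of "é")
echo cut_on_boundaries($s, 2, 2), "\n"; // él  (start moved back to the lead byte of "é")
```

Like usubstr(), this sketch extends the cut rather than shrinking it, so the result may be a few bytes longer than the requested length. It does not append the trailing '...' marker.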