How to Safely Truncate UTF-8 Strings in PHP While Preserving Word Boundaries?

Mary-Kate Olsen
2024-12-18 16:23:11

Truncating Strings with UTF-8 Characters

Problem:
Truncating a multibyte string to a given character limit without cutting a word in half is awkward in PHP, because the byte-oriented string functions (strlen(), substr()) count bytes rather than characters. The goal here is a custom method named truncate() that counts characters, cuts at a word boundary, and behaves consistently for UTF-8 input.
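
To make the byte-versus-character problem concrete, here is a minimal sketch. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded; the sample text and variable names are illustrative only.

```php
<?php
// Assumption: this file is UTF-8 encoded and ext-mbstring is available.
$text = 'naïve café';

// strlen() counts bytes; mb_strlen() counts characters.
// "ï" and "é" each occupy two bytes in UTF-8.
$byteLength = strlen($text);
$charLength = mb_strlen($text, 'UTF-8');

// substr() cuts at a byte offset, so it can split "ï" in half
// and leave an invalid UTF-8 sequence behind.
$byteCut = substr($text, 0, 3);
$byteCutIsValid = mb_check_encoding($byteCut, 'UTF-8');

// mb_substr() cuts at a character offset and keeps "ï" intact.
$charCut = mb_substr($text, 0, 3, 'UTF-8');

var_dump($byteLength, $charLength, $byteCutIsValid, $charCut);
```

Passing 'UTF-8' explicitly avoids depending on the configured internal encoding, which is usually UTF-8 but can be changed.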

Steps to Resolve:

  1. Determine the maximum number of characters to keep by subtracting the length of the termination string, measured in characters rather than bytes, from the desired maximum length.
  2. Check whether the string is longer than this limit; if it is not, return it unchanged.
  3. Find the last whitespace character within the limit to establish the word boundary.
  4. Truncate the string at that whitespace, or at the limit itself if no whitespace exists.
  5. Append the termination string to the truncated string.
  6. Return the modified string.
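
The steps above can be traced with concrete values. This is only a walk-through sketch: the sample sentence, the limit of 24, and the variable names are assumptions chosen for illustration, and the file is assumed to be UTF-8 encoded.

```php
<?php
// Assumption: this file is UTF-8 encoded and ext-mbstring is available.
$string     = 'Crème brûlée is a classic French dessert';
$chars      = 24;
$terminator = ' …';

// Step 1: budget in characters. mb_strlen(' …') is 2, whereas
// strlen(' …') would report 4 because "…" takes three bytes.
$maxChars = $chars - mb_strlen($terminator, 'UTF-8');

// Step 2: the string is longer than the budget, so it must be cut.
$needsCut = mb_strlen($string, 'UTF-8') > $maxChars;

// Step 3: find the last space inside the first $maxChars characters.
$prefix    = mb_substr($string, 0, $maxChars, 'UTF-8');
$lastSpace = mb_strrpos($prefix, ' ', 0, 'UTF-8');

// Steps 4-6: cut at the space (or at the budget if there is none),
// then append the terminator.
$cutAt  = ($lastSpace !== false) ? $lastSpace : $maxChars;
$result = mb_substr($string, 0, $cutAt, 'UTF-8') . $terminator;

echo $prefix, PHP_EOL;
echo $result, PHP_EOL;
```

The prefix ends inside the word "classic", so the cut moves back to the space before it and the result stays within the 24-character limit.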

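Solution Using mb_substr() and mb_strrpos():

PHP provides the mb_strimwidth() function, which truncates a multibyte string to a given display width and appends a trim marker. It does not, however, respect word boundaries, so it can cut a word in half. The following method instead combines mb_strlen(), mb_substr(), and mb_strrpos() to cut at the last space within the limit: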

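public function truncate($string, $chars = 50, $terminator = ' …')
{
    // Measure the terminator in characters, not bytes:
    // strlen(' …') is 4, while mb_strlen(' …') is 2.
    $maxChars = $chars - mb_strlen($terminator);
    if (mb_strlen($string) <= $maxChars) {
        return $string; // Already short enough; nothing to cut.
    }

    // Look for the last space within the first $maxChars characters.
    $lastWhitespace = mb_strrpos(mb_substr($string, 0, $maxChars), ' ');
    if ($lastWhitespace !== false) {
        // Cut at the word boundary.
        return mb_substr($string, 0, $lastWhitespace) . $terminator;
    } else {
        // No space found: fall back to a hard cut at the character limit.
        return mb_substr($string, 0, $maxChars) . $terminator;
    }
}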
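
For comparison, the sketch below runs a word-aware truncation next to mb_strimwidth() on the same input. The standalone function truncateAtWord() is a hypothetical adaptation of the method above (it is not part of PHP), written so the example runs outside a class; it passes 'UTF-8' explicitly and measures the terminator with mb_strlen(). The sample sentence is illustrative.

```php
<?php
// Hypothetical standalone variant of the article's truncate() method.
// Assumption: this file is UTF-8 encoded and ext-mbstring is available.
function truncateAtWord(string $string, int $chars = 50, string $terminator = ' …'): string
{
    $maxChars = $chars - mb_strlen($terminator, 'UTF-8');
    if (mb_strlen($string, 'UTF-8') <= $maxChars) {
        return $string;
    }

    $prefix    = mb_substr($string, 0, $maxChars, 'UTF-8');
    $lastSpace = mb_strrpos($prefix, ' ', 0, 'UTF-8');
    if ($lastSpace !== false) {
        // Back up to the last complete word.
        $prefix = mb_substr($prefix, 0, $lastSpace, 'UTF-8');
    }

    return $prefix . $terminator;
}

$text = 'Señor Muñoz enjoys jalapeño omelettes';

// Word-aware cut: stops before the partial word "jal".
$wordAware = truncateAtWord($text, 25, '...');

// Width-based cut: fills the limit exactly, even mid-word.
$widthOnly = mb_strimwidth($text, 0, 25, '...', 'UTF-8');

echo $wordAware, PHP_EOL;
echo $widthOnly, PHP_EOL;
```

Note that mb_strimwidth() measures display width rather than character count, so East Asian wide characters count as two columns; for Latin text like this sample the two measures coincide.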
