How Can I Safely Truncate Multibyte Strings in PHP While Preserving Word Boundaries?

Patricia Arquette
2024-12-04 05:41:09

Truncating Multibyte Strings in PHP

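In PHP, truncating multibyte strings takes more care than truncating single-byte ones, because byte-oriented functions such as strlen() and substr() can miscount characters or split one in the middle. This article addresses the challenge of truncating such strings to a specified number of characters, considering both multibyte character encoding and word boundaries.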

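One approach is PHP's built-in mb_strimwidth() function, which truncates a string to a specified display width and can append a trim marker. It has two limitations here: it measures width rather than character count (full-width characters such as CJK ideographs count as two columns), and it does not take word boundaries into account, so it can cut in the middle of a word.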
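A minimal sketch of both limitations, assuming the mbstring extension is loaded and the strings are UTF-8; the sample strings are illustrative only:

```php
<?php
mb_internal_encoding('UTF-8');

// Width counts display columns: each of these CJK characters occupies two.
$char_count = mb_strlen('日本語');   // 3 characters
$width      = mb_strwidth('日本語'); // 6 columns

// The trim marker counts toward the requested width (11 characters + 3 dots),
// and the cut lands wherever the width runs out, here in the middle of "brown".
$trimmed = mb_strimwidth('The quick brown fox', 0, 14, '...');
// $trimmed is 'The quick b...'
```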

Custom Implementation for Truncation

A custom implementation can be created to handle both multibyte character encoding and word boundaries:

  1. Calculate Truncation Length: Subtract the length of the terminator string, measured in characters rather than bytes, from the maximum number of characters.
  2. Validate String Length: Check if the input string is longer than the calculated truncation length; otherwise, return it unaltered.
  3. Find Word Boundary: Use mb_strrpos() to search for the last space character within the part of the string that precedes the truncation point.
  4. Cut String: If a word boundary is found, truncate the string at that point; otherwise, truncate at the calculated truncation length.
  5. Append Terminator: Add the terminator string to the truncated string.
  6. Return Truncated String: Return the truncated string with the terminator appended.
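Steps 1 and 3 are where byte-based functions usually go wrong. The following sketch walks through them on a sample string; the 20-character limit and the Cyrillic text are assumptions chosen for illustration, and the mbstring extension is assumed to be available.

```php
<?php
$terminator = ' …';

// '…' (U+2026) takes three bytes in UTF-8, so strlen() overstates the length.
$bytes  = strlen($terminator);              // 4
$length = mb_strlen($terminator, 'UTF-8');  // 2

// Step 1 with a 20-character limit.
$trunc_len = 20 - $length;                  // 18

// Step 3: look for the last space in the first $trunc_len + 1 characters, so a
// space sitting exactly at the cut point still counts as a word boundary.
$string    = 'Привет всем, это длинная строка текста';
$head      = mb_substr($string, 0, $trunc_len + 1, 'UTF-8');
$space_pos = mb_strrpos($head, ' ', 0, 'UTF-8');

// Step 4: keep everything before that space.
$kept = mb_substr($string, 0, $space_pos, 'UTF-8');
// $kept is 'Привет всем, это'
```

Searching a prefix of the string keeps the boundary lookup independent of how long the full input is.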

Example Implementation:

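function truncate($string, $chars = 50, $terminator = ' …') {
  // Calculate truncation length in characters, not bytes
  $trunc_len = max(0, $chars - mb_strlen($terminator));

  // Validate string length
  if (mb_strlen($string) <= $trunc_len) {
    return $string;
  }

  // Find word boundary: last space in the first $trunc_len + 1 characters,
  // so a space exactly at the cut point still counts
  $space_pos = mb_strrpos(mb_substr($string, 0, $trunc_len + 1), ' ');

  // Cut string; a leading space (position 0) is not a usable boundary
  if ($space_pos !== false && $space_pos > 0) {
    $truncated_string = mb_substr($string, 0, $space_pos);
  } else {
    // mb_substr() counts characters; mb_strimwidth() would count display width
    $truncated_string = mb_substr($string, 0, $trunc_len);
  }

  // Append terminator
  return $truncated_string . $terminator;
}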

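This function truncates multibyte strings by character count rather than byte count and prefers to cut at the last space before the limit. It requires the mbstring extension, relies on the internal encoding matching the input (UTF-8 by default in current PHP versions), and treats only the ASCII space character as a word boundary.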
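Text that separates words with other whitespace, such as a non-breaking space or the ideographic space U+3000, has no ASCII space to cut at. One possible variation, sketched below, matches any Unicode whitespace with a regular expression. The helper name truncate_at_whitespace() is hypothetical and not part of the function above; UTF-8 input is assumed.

```php
<?php
// Hypothetical variant: treats any Unicode whitespace as a word boundary.
function truncate_at_whitespace(string $string, int $chars = 50, string $terminator = ' …'): string
{
    $trunc_len = max(0, $chars - mb_strlen($terminator, 'UTF-8'));

    if (mb_strlen($string, 'UTF-8') <= $trunc_len) {
        return $string;
    }

    // One extra character, so whitespace at the cut point counts as a boundary.
    $head = mb_substr($string, 0, $trunc_len + 1, 'UTF-8');

    // Greedy match up to the last non-whitespace character that is followed by
    // whitespace. \p{Z} covers Unicode separators; /u enables UTF-8 mode.
    if (preg_match('/^(.*[^\s\p{Z}])[\s\p{Z}]+/us', $head, $m) === 1) {
        return $m[1] . $terminator;
    }

    // No boundary found: fall back to a plain character cut.
    return mb_substr($string, 0, $trunc_len, 'UTF-8') . $terminator;
}

$english  = truncate_at_whitespace('The quick brown fox jumps over the lazy dog', 20);
$japanese = truncate_at_whitespace("東京都\u{3000}渋谷区\u{3000}恵比寿", 8, '…');
$no_space = truncate_at_whitespace('Donaudampfschifffahrt', 10, '…');
// $english is 'The quick brown …', $no_space is 'Donaudamp…'
```

Languages written without spaces between words would still fall through to the plain character cut, so this only widens the set of boundaries that are recognized.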
