
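PHP's trim functions can produce garbled characters because they compare single bytes, not whole characters. When rtrim runs with a multibyte character as its mask, the trailing 0x81 byte of the neighbouring character is removed as well, leaving an incomplete character. The fix is to trim with a multibyte-aware function such as the mb_rtrim($tag, "、", $encoding) defined in the Solution section.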

How to fix garbled characters produced by PHP trim

Environment used in this article: Windows 7, PHP 7.1, Dell G3 computer.

First run the following code:

$tag = "互联网产品、";
$text = rtrim($tag, "、");
print_r($text);

We might expect the result to be 互联网产品, but the actual output is 互联网产 followed by a garbled character (displayed as 互联网产�). Why is that?

Background

All PHP functions with the mb_ prefix are multibyte string functions: http://php.net/manual/zh/ref....

For example:

$str = "abcd";
print_r(strlen($str)."\n"); // 4
print_r(mb_strlen($str)."\n"); // 4
$str = "周梦康";
print_r(strlen($str)."\n"); // 9
print_r(mb_strlen($str)."\n"); // 3

The mb_ functions treat one character, however many bytes it occupies, as the unit of work. The functions without the mb_ prefix work on individual bytes.
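The difference can be sketched as follows. This is a minimal illustration, assuming the script file is saved as UTF-8 and the mbstring extension is loaded; the variable names are only for the example. substr() cuts at byte offsets and can split a character, while mb_substr() cuts at character boundaries.

```php
<?php
// Assumes this file is saved as UTF-8 and ext/mbstring is available.
$str = "周梦康";

// Byte view: each of these characters occupies 3 bytes in UTF-8.
$hex = bin2hex($str); // 18 hex digits = 9 bytes

// Byte-oriented: the first 3 bytes happen to be exactly one character.
$firstBytes = substr($str, 0, 3);

// Character-oriented: the first character, whatever its byte length.
$firstChar = mb_substr($str, 0, 1, "UTF-8");

// Cutting in the middle of a character leaves an invalid UTF-8 sequence.
$broken = substr($str, 0, 4);

echo strlen($hex) / 2, "\n";                    // 9
echo $firstBytes, "\n";                         // 周
echo $firstChar, "\n";                          // 周
var_dump(mb_check_encoding($broken, "UTF-8"));  // bool(false)
```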

Principle

From the trim function documentation:

string trim ( string $str [, string $character_mask = " \t\n\r\0\x0B" ] )

This function is not a multibyte function. Even for multibyte characters such as Chinese characters, it walks the string from the head or the tail one byte at a time and checks each byte against the set of bytes taken from $character_mask. If the byte is in the set, it is removed and matching continues with the next byte. For example:

echo ltrim("bcdf","abc"); // df

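As the function string_print_char in the demo below shows:

"、" consists of the three bytes 0xe3 0x80 0x81,

"品" consists of the three bytes 0xe5 0x93 0x81.

So when rtrim runs, the byte comparison first removes the three bytes of "、" and then also removes the trailing 0x81 of "品", because 0x81 is in the mask. It stops at 0x93, which is not. The remaining bytes 0xe5 0x93 are not a complete character, which is what shows up as garbled output.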
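The byte-level effect can be checked directly in PHP. The following is a small sketch, assuming the script file is saved as UTF-8 and mbstring is loaded; it only inspects bytes and does not fix anything.

```php
<?php
// Assumes this file is saved as UTF-8 and ext/mbstring is available.
$tag = "互联网产品、";

echo bin2hex("品"), "\n"; // e59381
echo bin2hex("、"), "\n"; // e38081

// rtrim() treats "、" as the byte set {0xe3, 0x80, 0x81}.
$text = rtrim($tag, "、");

// The last two bytes left over are the head of "品" without its final 0x81.
echo bin2hex(substr($text, -2)), "\n"; // e593

// The result is no longer valid UTF-8, which is why it displays as garbage.
var_dump(mb_check_encoding($text, "UTF-8")); // bool(false)
```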


Source code exploration

After reading the PHP 7 source code, I extracted the small demo below so that we can study it together. Reading the PHP source is not as hard as it looks, and a little progress every day adds up.

//
// main.c
// trim
//
// Created by 周梦康 on 2017/10/18.
// Copyright © 2017 周梦康. All rights reserved.
//
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void string_print_char(const char *str);
void php_charmask(const unsigned char *input, size_t len, char *mask);
char *ltrim(const char *str, const char *character_mask);
char *rtrim(const char *str, const char *character_mask);

int main(int argc, char const *argv[])
{
    char *left = ltrim("bcdf", "abc");
    char *right = rtrim("互联网产品、", "、");

    printf("%s\n", left);       // df
    string_print_char("品");    // e5 93 81
    string_print_char("、");    // e3 80 81
    printf("%s\n", right);      // 互联网产 followed by the stray bytes e5 93
    free(left);
    free(right);
    return 0;
}

// Strip leading bytes of str that appear in character_mask.
char *ltrim(const char *str, const char *character_mask)
{
    char *res;
    char mask[256];
    size_t i;
    size_t trimmed = 0;
    size_t len = strlen(str);

    php_charmask((const unsigned char *)character_mask, strlen(character_mask), mask);
    for (i = 0; i < len; i++) {
        if (mask[(unsigned char)str[i]]) {
            trimmed++;
        } else {
            break;
        }
    }
    len -= trimmed;
    str += trimmed;
    res = (char *)malloc(len + 1);
    if (res == NULL) {
        exit(EXIT_FAILURE);
    }
    memcpy(res, str, len);
    res[len] = '\0';            // memcpy does not copy a terminator
    return res;
}

// Strip trailing bytes of str that appear in character_mask.
char *rtrim(const char *str, const char *character_mask)
{
    char *res;
    char mask[256];
    size_t i;
    size_t len = strlen(str);

    php_charmask((const unsigned char *)character_mask, strlen(character_mask), mask);
    if (len > 0) {
        i = len - 1;
        do {
            if (mask[(unsigned char)str[i]]) {
                len--;
            } else {
                break;
            }
        } while (i-- != 0);
    }
    res = (char *)malloc(len + 1);
    if (res == NULL) {
        exit(EXIT_FAILURE);
    }
    memcpy(res, str, len);
    res[len] = '\0';            // memcpy does not copy a terminator
    return res;
}

// Print every byte of str as two hex digits.
void string_print_char(const char *str)
{
    size_t l = strlen(str);
    for (size_t i = 0; i < l; i++) {
        printf("%02x ", (unsigned char)str[i]);
    }
    printf("\n");
}

// Mark every byte that occurs in input as 1 in the 256-entry mask.
void php_charmask(const unsigned char *input, size_t len, char *mask)
{
    const unsigned char *end;
    unsigned char c;

    memset(mask, 0, 256);
    for (end = input + len; input < end; input++) {
        c = *input;
        mask[c] = 1;
    }
}

If the demo is not clear enough from reading alone, copy it and run it yourself.

Readers whose C is rusty need not worry: I plan to write a series of short introductory articles on C aimed at PHP developers.

Solution

Now let's follow the same pattern and implement it with PHP's own multibyte functions:

// Note: PHP 8.4 and later ship a built-in mb_rtrim(); the guard avoids redeclaring it.
if (!function_exists('mb_rtrim')) {
    function mb_rtrim($string, $trim, $encoding)
    {
        // Build the mask as a list of characters, not bytes.
        $mask = [];
        $trimLength = mb_strlen($trim, $encoding);
        for ($i = 0; $i < $trimLength; $i++) {
            $item = mb_substr($trim, $i, 1, $encoding);
            $mask[] = $item;
        }
        // Walk backwards one character at a time until a character is not in the mask.
        $len = mb_strlen($string, $encoding);
        if ($len > 0) {
            $i = $len - 1;
            do {
                $item = mb_substr($string, $i, 1, $encoding);
                if (in_array($item, $mask, true)) {
                    $len--;
                } else {
                    break;
                }
            } while ($i-- != 0);
        }
        return mb_substr($string, 0, $len, $encoding);
    }
}
mb_internal_encoding("UTF-8");
$tag = "互联网产品、";
$encoding = mb_internal_encoding();
print_r(mb_rtrim($tag, "、", $encoding)); // 互联网产品

Of course, you can also use regular expressions. After working through the function above, the difference between single-byte functions and multibyte functions should be clear.
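The regular-expression route mentioned above could look roughly like this. It is a sketch, not part of the original article: regex_rtrim is a hypothetical helper name, and it assumes the input is valid UTF-8 so that the /u modifier can be used. Unlike trim's mask, it does not support ".." ranges.

```php
<?php
// Hypothetical helper (not from the original article): right-trim with PCRE in UTF-8 mode.
function regex_rtrim(string $string, string $trim): string
{
    if ($trim === '') {
        return $string; // nothing to trim; an empty character class would be invalid
    }
    // preg_quote() escapes characters that are special in the pattern,
    // and the /u modifier makes PCRE match whole UTF-8 characters, not bytes.
    $pattern = '/[' . preg_quote($trim, '/') . ']+\z/u';
    $result = preg_replace($pattern, '', $string);

    // preg_replace() returns null on failure (for example, invalid UTF-8 input).
    return $result === null ? $string : $result;
}

echo regex_rtrim("互联网产品、", "、"), "\n"; // 互联网产品
```

The character class is built from $trim, so every character in it is treated as a separate trim candidate, which mirrors how the mask works in the functions above.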

Related PHP 7 source code

PHP_FUNCTION(trim)
{
    php_do_trim(INTERNAL_FUNCTION_PARAM_PASSTHRU, 3);
}

PHP_FUNCTION(rtrim)
{
    php_do_trim(INTERNAL_FUNCTION_PARAM_PASSTHRU, 2);
}

PHP_FUNCTION(ltrim)
{
    php_do_trim(INTERNAL_FUNCTION_PARAM_PASSTHRU, 1);
}

static void php_do_trim(INTERNAL_FUNCTION_PARAMETERS, int mode)
{
    zend_string *str;
    zend_string *what = NULL;

    ZEND_PARSE_PARAMETERS_START(1, 2)
        Z_PARAM_STR(str)
        Z_PARAM_OPTIONAL
        Z_PARAM_STR(what)
    ZEND_PARSE_PARAMETERS_END();

    ZVAL_STR(return_value, php_trim(str, (what ? ZSTR_VAL(what) : NULL), (what ? ZSTR_LEN(what) : 0), mode));
}

PHPAPI zend_string *php_trim(zend_string *str, char *what, size_t what_len, int mode)
{
    const char *c = ZSTR_VAL(str);
    size_t len = ZSTR_LEN(str);
    register size_t i;
    size_t trimmed = 0;
    char mask[256];

    if (what) {
        if (what_len == 1) {
            char p = *what;
            if (mode & 1) {
                for (i = 0; i < len; i++) {
                    if (c[i] == p) {
                        trimmed++;
                    } else {
                        break;
                    }
                }
                len -= trimmed;
                c += trimmed;
            }
            if (mode & 2) {
                if (len > 0) {
                    i = len - 1;
                    do {
                        if (c[i] == p) {
                            len--;
                        } else {
                            break;
                        }
                    } while (i-- != 0);
                }
            }
        } else {
            php_charmask((unsigned char*)what, what_len, mask);

            if (mode & 1) {
                for (i = 0; i < len; i++) {
                    if (mask[(unsigned char)c[i]]) {
                        trimmed++;
                    } else {
                        break;
                    }
                }
                len -= trimmed;
                c += trimmed;
            }
            if (mode & 2) {
                if (len > 0) {
                    i = len - 1;
                    do {
                        if (mask[(unsigned char)c[i]]) {
                            len--;
                        } else {
                            break;
                        }
                    } while (i-- != 0);
                }
            }
        }
    } else {
        if (mode & 1) {
            for (i = 0; i < len; i++) {
                if ((unsigned char)c[i] <= ' ' &&
                    (c[i] == ' ' || c[i] == '\n' || c[i] == '\r' || c[i] == '\t' || c[i] == '\v' || c[i] == '\0')) {
                    trimmed++;
                } else {
                    break;
                }
            }
            len -= trimmed;
            c += trimmed;
        }
        if (mode & 2) {
            if (len > 0) {
                i = len - 1;
                do {
                    if ((unsigned char)c[i] <= ' ' &&
                        (c[i] == ' ' || c[i] == '\n' || c[i] == '\r' || c[i] == '\t' || c[i] == '\v' || c[i] == '\0')) {
                        len--;
                    } else {
                        break;
                    }
                } while (i-- != 0);
            }
        }
    }

    if (ZSTR_LEN(str) == len) {
        return zend_string_copy(str);
    } else {
        return zend_string_init(c, len, 0);
    }
}

/* {{{ php_charmask
 * Fills a 256-byte bytemask with input. You can specify a range like 'a..z',
 * it needs to be incrementing.
 * Returns: FAILURE/SUCCESS whether the input was correct (i.e. no range errors)
 */
static inline int php_charmask(unsigned char *input, size_t len, char *mask)
{
    unsigned char *end;
    unsigned char c;
    int result = SUCCESS;

    memset(mask, 0, 256);
    for (end = input+len; input < end; input++) {
        c=*input;
        if ((input+3 < end) && input[1] == '.' && input[2] == '.'
                && input[3] >= c) {
            memset(mask+c, 1, input[3] - c + 1);
            input+=3;
        } else if ((input+1 < end) && input[0] == '.' && input[1] == '.') {
            /* Error, try to be as helpful as possible:
               (a range ending/starting with '.' won't be captured here) */
            if (end-len >= input) { /* there was no 'left' char */
                php_error_docref(NULL, E_WARNING, "Invalid '..'-range, no character to the left of '..'");
                result = FAILURE;
                continue;
            }
            if (input+2 >= end) { /* there is no 'right' char */
                php_error_docref(NULL, E_WARNING, "Invalid '..'-range, no character to the right of '..'");
                result = FAILURE;
                continue;
            }
            if (input[-1] > input[2]) { /* wrong order */
                php_error_docref(NULL, E_WARNING, "Invalid '..'-range, '..'-range needs to be incrementing");
                result = FAILURE;
                continue;
            }
            /* FIXME: better error (a..b..c is the only left possibility?) */
            php_error_docref(NULL, E_WARNING, "Invalid '..'-range");
            result = FAILURE;
            continue;
        } else {
            mask[c]=1;
        }
    }
    return result;
}
/* }}} */
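The ".." range handling in php_charmask is what lets a mask such as "0..9" stand for a whole run of bytes. A short usage sketch from the PHP side follows; the sample strings and variable names are made up for illustration.

```php
<?php
// "0..9" is expanded by php_charmask() into the bytes '0' through '9'.
$name = rtrim("release-2017", "0..9");
echo $name, "\n"; // release-

// Ranges are defined on byte values, so they only make sense for single-byte characters.
$rest = ltrim("abcXYZ", "a..c");
echo $rest, "\n"; // XYZ
```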
