How to truncate Chinese strings in PHP without garbled characters

PHP is one of the most widely used languages for Web development, and truncating Chinese strings is a common requirement in PHP applications. If you truncate a Chinese string with the byte-oriented substr() function, however, the result can contain garbled characters. This article shows how to truncate Chinese strings in PHP without producing garbled output.

1. Problems with truncating Chinese strings in PHP

PHP offers three functions for extracting part of a string: substr(), mb_substr() and iconv_substr(). substr() counts in bytes, but a Chinese character occupies more than one byte (2 in GBK, 3 for common characters in UTF-8). If a cut falls in the middle of a character, the leftover bytes no longer form a valid character and the result is garbled. For example:

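$str = '我是中国人';
echo substr($str, 0, 4); // takes the first 4 bytes, not the first 4 characters

Assuming the string is UTF-8 encoded, each of these characters occupies 3 bytes, so the code above outputs "我" followed by the first byte of "是", which is displayed as a garbled character. A length that happens to be a multiple of 3, such as 6, would return "我是" intact, but substr() gives no such guarantee in general.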
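To see why byte-based slicing breaks, it can help to compare byte counts with character counts. The following is a minimal sketch, assuming the source file and the string are UTF-8 encoded and the mbstring extension is loaded; mb_check_encoding() is used here only to show whether a slice is still valid UTF-8.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is enabled.
$str = '我是中国人';

// strlen() counts bytes; mb_strlen() counts characters.
$byteLength = strlen($str);             // 15 (5 characters x 3 bytes each)
$charLength = mb_strlen($str, 'UTF-8'); // 5

// A cut on a character boundary (a multiple of 3 bytes here) stays valid ...
$aligned = substr($str, 0, 6);          // '我是'
// ... while a cut in the middle of a character leaves a broken byte sequence.
$broken = substr($str, 0, 4);

var_dump(mb_check_encoding($aligned, 'UTF-8')); // bool(true)
var_dump(mb_check_encoding($broken, 'UTF-8'));  // bool(false)
```

The byte offsets only line up by coincidence, so byte-based cuts are not a safe way to truncate multibyte text.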

mb_substr() and iconv_substr() avoid this problem because they count in characters rather than bytes. Both support UTF-8 encoded Chinese strings. Their usage is described below.

2. Truncating Chinese strings with mb_substr()

mb_substr() is provided by PHP's mbstring extension and is designed for multibyte encodings, so it handles Chinese and many other languages. Because it counts characters instead of bytes, it does not split a Chinese character as long as the encoding is specified correctly. Its signature is:

mb_substr(string $str, int $start, ?int $length = null, ?string $encoding = null): string

This function has four parameters, which are:

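  • $str: the string to truncate;
  • $start: the starting position, counted in characters from 0; a negative value counts from the end of the string;
  • $length: the maximum number of characters to return; if it is negative, that many characters are left off the end of the string, and if it is omitted or null, everything up to the end of the string is returned;
  • $encoding: the character encoding of the string, usually UTF-8; if omitted, the internal encoding (see mb_internal_encoding()) is used.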

For example, the following code uses mb_substr() to truncate a Chinese string:

$str = '我是中国人';
echo mb_substr($str, 0, 4, 'UTF-8'); // take the first 4 characters

Running the code above outputs "我是中国".
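The start and length parameters of mb_substr() also accept negative or omitted values. The sketch below illustrates these cases, assuming a UTF-8 encoded string and an enabled mbstring extension; the variable names are chosen only for illustration.

```php
<?php
// Assumes UTF-8 source encoding and the mbstring extension.
$str = '我是中国人';

// Omitted (null) length: everything from the offset to the end.
$tail = mb_substr($str, 2, null, 'UTF-8');      // '中国人'

// Negative start: count from the end of the string.
$lastTwo = mb_substr($str, -2, null, 'UTF-8');  // '国人'

// Negative length: stop that many characters before the end.
$withoutLast = mb_substr($str, 0, -1, 'UTF-8'); // '我是中国'

// A length beyond the end of the string is clamped to the string's length.
$whole = mb_substr($str, 0, 6, 'UTF-8');        // '我是中国人'

echo $tail, "\n", $lastTwo, "\n", $withoutLast, "\n", $whole, "\n";
```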

3. Truncating Chinese strings with iconv_substr()

Besides mb_substr(), iconv_substr() also truncates Chinese strings without garbling them. It is provided by the iconv extension and, like mb_substr(), counts in characters. Its fourth parameter names the encoding of the input string; the function does not convert the string to another encoding, so this parameter must match the string's actual encoding. Its signature is:

iconv_substr(string $str, int $start, ?int $length = null, ?string $charset = null): string|false

This function has four parameters, which are:

  • $str: the string to truncate;
  • $start: the starting position, counted in characters from 0;
  • $length: the maximum number of characters to return; if it is NULL, everything up to the end of the string is returned;
  • $charset: the encoding of the source string, usually UTF-8.

For example, the following code uses iconv_substr() to truncate a Chinese string:

$str = '我是中国人';
echo iconv_substr($str, 0, 4, 'UTF-8'); // take the first 4 characters

Running the code above outputs "我是中国".
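Since the two functions live in different extensions, code that must run on varied servers sometimes checks which one is available. The helper below is a hypothetical sketch (utf8_truncate() is not part of PHP) and assumes the input is UTF-8; returning an empty string when iconv_substr() fails is a design choice made here for illustration only.

```php
<?php
// Hypothetical helper (not part of PHP): truncate a UTF-8 string by characters,
// using whichever of the two extensions is available.
function utf8_truncate(string $str, int $length): string
{
    if (function_exists('mb_substr')) {
        return mb_substr($str, 0, $length, 'UTF-8');
    }
    if (function_exists('iconv_substr')) {
        // iconv_substr() may return false on failure, e.g. for invalid input.
        $result = iconv_substr($str, 0, $length, 'UTF-8');
        return $result === false ? '' : $result;
    }
    throw new RuntimeException('Neither mbstring nor iconv is available.');
}

echo utf8_truncate('我是中国人', 2), "\n"; // 我是
```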

4. Summary

Truncating Chinese strings is a common requirement in Web application development. The built-in substr() function works in bytes, so it can cut a multibyte Chinese character in half and leave garbled output. mb_substr() and iconv_substr() work in characters instead, and both support UTF-8 encoded Chinese strings, so either one can be used to truncate Chinese text safely.
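As one possible application, a page that lists article summaries might shorten long titles and mark the cut. The following is a minimal sketch under the assumption of UTF-8 input and an enabled mbstring extension; the excerpt() helper and its '...' suffix are hypothetical and not taken from any library.

```php
<?php
// Hypothetical helper: keep at most $max characters of $text and append
// $suffix only when something was actually cut off.
function excerpt(string $text, int $max, string $suffix = '...'): string
{
    if (mb_strlen($text, 'UTF-8') <= $max) {
        return $text; // short enough, return unchanged
    }
    return mb_substr($text, 0, $max, 'UTF-8') . $suffix;
}

echo excerpt('我是中国人', 3), "\n";  // 我是中...
echo excerpt('我是中国人', 10), "\n"; // 我是中国人
```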

