
How to extract substrings from Chinese text without garbled characters in PHP

PHPz
Original
2023-04-24 10:50:54

PHP is a widely used web programming language, and extracting part of a Chinese string is a common requirement in PHP development. However, the byte-oriented built-in function substr() can cut a multibyte character in half, which produces garbled output. This article shows how to extract substrings from Chinese text in PHP without garbled characters.

1. The problem with extracting Chinese substrings in PHP

PHP offers three functions for extracting substrings: substr(), mb_substr() and iconv_substr(). The substr() function counts in bytes, while a Chinese character occupies more than one byte (3 bytes for common characters in UTF-8, 2 bytes in GBK). If the cut falls in the middle of a character, the result is garbled. For example:

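$str = '我是中国人';
echo substr($str, 0, 4); // takes the first 4 bytes, not the first 4 characters

Running this code on a UTF-8 encoded string outputs "我" followed by a stray byte, usually displayed as "�". The cut falls inside the second character, so the output is garbled and the Chinese string is not extracted correctly. (A length of 6 would happen to land on a character boundary in UTF-8 and return "我是", which hides the problem.)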
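The mismatch between bytes and characters can be checked directly. The following is a minimal sketch, assuming the source file is saved as UTF-8 and the mbstring extension is loaded; the variable names are only illustrative.

```php
<?php
$str = '我是中国人';

// strlen() counts bytes; mb_strlen() counts characters in the given encoding.
$bytes = strlen($str);              // 15 bytes in UTF-8 (5 characters x 3 bytes)
$chars = mb_strlen($str, 'UTF-8');  // 5 characters

// Cutting after 4 bytes keeps "我" (bytes 0-2) plus only the first byte of "是".
$cut = substr($str, 0, 4);

// The truncated string is no longer valid UTF-8.
$valid = mb_check_encoding($cut, 'UTF-8');

var_dump($bytes, $chars, $valid);
```

Checking with mb_check_encoding() is one way to detect a broken cut after the fact; avoiding the byte-based cut in the first place is the more reliable approach.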

The mb_substr() and iconv_substr() functions solve this problem. Both count in characters rather than bytes and both support UTF-8 encoded Chinese strings. Their usage is described below.

2. Extracting Chinese substrings with mb_substr()

The mb_substr() function is provided by PHP's mbstring extension. It extracts a substring by character position rather than byte position and supports many multibyte encodings, so it can cut Chinese strings without producing garbled characters. Its signature is:

mb_substr(string $str, int $start, ?int $length = null, ?string $encoding = null)

This function has four parameters, which are:

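  • $str: the string to extract from;
  • $start: the starting position in characters, counted from 0; a negative value counts from the end of the string;
  • $length: the maximum number of characters to return; if it is omitted or null, everything up to the end of the string is returned, and if it is negative, that many characters are left off the end;
  • $encoding: the character encoding of the string, usually UTF-8; if it is omitted, the internal encoding set by mb_internal_encoding() is used.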
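How $start and $length behave with null and negative values is easy to get wrong, so here is a short sketch. It assumes a UTF-8 source file and the mbstring extension; the variable names are illustrative only.

```php
<?php
$str = '我是中国人';

// Null length: from character 2 to the end of the string.
$tail = mb_substr($str, 2, null, 'UTF-8');     // 中国人

// Negative start: begin 2 characters before the end.
$lastTwo = mb_substr($str, -2, null, 'UTF-8'); // 国人

// Negative length: start at character 1 and leave 1 character off the end.
$middle = mb_substr($str, 1, -1, 'UTF-8');     // 是中国

echo $tail, PHP_EOL, $lastTwo, PHP_EOL, $middle, PHP_EOL;
```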

For example, the following code uses mb_substr() to extract part of a Chinese string:

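$str = '我是中国人';
echo mb_substr($str, 0, 4, 'utf-8'); // takes the first 4 characters

Running this code outputs "我是中国". (The string has only 5 characters, so a length of 6 would simply return the whole string, "我是中国人".)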

3. Extracting Chinese substrings with iconv_substr()

Besides mb_substr(), the iconv_substr() function from PHP's iconv extension can also extract Chinese substrings without garbling them. Like mb_substr(), it counts in characters, and its fourth parameter names the encoding of the input string (it does not convert the string to another encoding). This encoding must match the actual encoding of the string; otherwise the function may return false or an incorrect result. Its signature is:

iconv_substr(string $str, int $start, ?int $length = null, ?string $charset = null)

This function has four parameters, which are:

  • $str: the string to extract from;
  • $start: the starting position in characters, counted from 0; a negative value counts from the end of the string;
  • $length: the maximum number of characters to return; if it is null, everything up to the end of the string is returned;
  • $charset: the character encoding of the string, usually UTF-8; if it is omitted or null, the iconv.internal_encoding setting is used.

For example, the following code uses iconv_substr() to extract part of a Chinese string:

$str = '我是中国人';
echo iconv_substr($str, 0, 4, 'utf-8'); // takes the first 4 characters

Running this code outputs "我是中国".
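For valid UTF-8 input, the two functions should give the same result. The sketch below compares them side by side, assuming a UTF-8 source file and that both the mbstring and iconv extensions are loaded; the variable names are illustrative.

```php
<?php
$str = '我是中国人';

// Both calls take 3 characters starting at character index 1.
$viaMb    = mb_substr($str, 1, 3, 'UTF-8');
$viaIconv = iconv_substr($str, 1, 3, 'UTF-8');

// iconv_substr() can return false on failure, so compare strictly.
var_dump($viaMb === $viaIconv);
echo $viaMb, PHP_EOL;
```

In practice the choice between the two mostly depends on which extension is available in the target environment.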

4. Summary

Extracting part of a Chinese string is a common requirement in web application development. The built-in substr() function works in bytes, so it can split a multibyte Chinese character and produce garbled output. Use mb_substr() or iconv_substr() instead: both work in characters, both support UTF-8 encoded Chinese strings, and both avoid garbled results as long as the encoding argument matches the actual encoding of the string.
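The two approaches can be combined in a small wrapper. The function utf8_substr() below is a hypothetical helper, not part of PHP; it is a sketch that assumes PHP 8.0 or later (for the nullable $length in iconv_substr()) and UTF-8 input, and it falls back to a Unicode-aware regular expression split if neither extension is loaded.

```php
<?php
/**
 * Hypothetical helper: character-safe substring for UTF-8 text.
 * Prefers mbstring, then iconv, then a regex-based fallback.
 */
function utf8_substr(string $str, int $start, ?int $length = null): string
{
    if (function_exists('mb_substr')) {
        return mb_substr($str, $start, $length, 'UTF-8');
    }

    if (function_exists('iconv_substr')) {
        $result = iconv_substr($str, $start, $length, 'UTF-8');
        // iconv_substr() returns false on failure; normalize to an empty string.
        return $result === false ? '' : $result;
    }

    // Last resort: split into characters with the /u (UTF-8) modifier.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return ''; // input was not valid UTF-8
    }
    return implode('', array_slice($chars, $start, $length));
}

echo utf8_substr('我是中国人', 0, 4), PHP_EOL;
echo utf8_substr('我是中国人', -2), PHP_EOL;
```

Returning an empty string on failure is only one possible design; throwing an exception may be preferable when invalid input should not pass silently.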

