Detailed explanation of PHP's Chinese conversion functions
As more websites and applications serve users in several languages, handling text across character encodings has become a routine task. Chinese text needs particular care because it can be stored in several multibyte encodings, such as UTF-8, GBK, and Big5. PHP provides a range of functions for encoding and converting Chinese text, and this article introduces them in detail.
1. Chinese encoding
The urlencode() function percent-encodes a string: each byte that is not URL-safe is written in %XX form, where XX is the hexadecimal value of that byte in the string's current encoding. For example, the UTF-8 string "中文" is converted to "%E4%B8%AD%E6%96%87" by the urlencode() function.
Example:
$str = "中文"; echo urlencode($str); // Output: %E4%B8%AD%E6%96%87
The rawurlencode() function works almost the same way as the urlencode() function. The difference is in how spaces are handled: urlencode() converts a space to a "+" sign, while rawurlencode() encodes it as %20, following RFC 3986.
Example:
$str = "中文 test"; echo rawurlencode($str); // Output: %E4%B8%AD%E6%96%87%20test
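As a small illustrative sketch (the variable names are chosen here for illustration, and the source file is assumed to be saved as UTF-8), the two encoders can be compared on the same input. The last lines also show that the percent sequences depend on the bytes of the string's encoding, not on the characters themselves:

```php
<?php
// UTF-8 source string containing a space.
$text = "中文 test";

$form = urlencode($text);    // space becomes "+"
$path = rawurlencode($text); // space becomes "%20"

echo $form, PHP_EOL; // %E4%B8%AD%E6%96%87+test
echo $path, PHP_EOL; // %E4%B8%AD%E6%96%87%20test

// The same two characters stored as GBK bytes (D6 D0 CE C4) encode
// differently, because percent-encoding works on bytes.
$gbkBytes = "\xD6\xD0\xCE\xC4";
$gbkEncoded = urlencode($gbkBytes);

echo $gbkEncoded, PHP_EOL; // %D6%D0%CE%C4
```

In practice this means the receiving side has to know which encoding the sender used before it can interpret the decoded bytes.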
The urldecode() function decodes a string that was encoded with the urlencode() function. Each %XX sequence is converted back into the corresponding byte, which restores the original Chinese characters, and each "+" sign is converted back into a space.
Example:
$str = "%E4%B8%AD%E6%96%87"; echo urldecode($str); // Output: 中文
The rawurldecode() function works the same way as the urldecode() function, except that it does not convert "+" signs into spaces; a "+" sign in the input is left unchanged.
Example:
$str = "%E4%B8%AD%E6%96%87+test"; echo rawurldecode($str); // Output: 中文+test
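The difference between the two decoders can be sketched as follows. The variable names are illustrative only, and the file is assumed to be saved as UTF-8. The second half suggests why each encoder is best paired with its matching decoder:

```php
<?php
$encoded = "%E4%B8%AD%E6%96%87+test";

$viaUrldecode = urldecode($encoded);       // "+" is turned into a space
$viaRawurldecode = rawurldecode($encoded); // "+" is kept as it is

echo $viaUrldecode, PHP_EOL;    // 中文 test
echo $viaRawurldecode, PHP_EOL; // 中文+test

// Round trips: a string containing both a space and a literal "+"
// survives when the encoder and decoder belong to the same pair.
$original = "中文 a+b";
$roundTripForm = urldecode(urlencode($original));
$roundTripRaw = rawurldecode(rawurlencode($original));

var_dump($roundTripForm === $original); // bool(true)
var_dump($roundTripRaw === $original);  // bool(true)
```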
2. Chinese conversion
The iconv() function converts a string from one encoding to another and supports commonly used encodings such as utf-8, gbk, and big5. The syntax is:
iconv($in_charset, $out_charset, $string);
where $in_charset represents the encoding format of the input string, $out_charset represents the encoding format of the output string, and $string represents the string to be converted.
For example, convert a utf-8 encoded string into a gbk encoded string:
$str = "中文"; $str = iconv("utf-8", "gbk", $str); echo $str; // Output looks garbled unless it is viewed in a GBK environment
Note: the iconv() function may fail or produce garbled output. This happens mainly when a character in the source encoding has no counterpart in the target encoding and therefore cannot be converted correctly. One possible workaround is to use a Unicode encoding such as UTF-8 as the target, since it can represent the characters of the other encodings.
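One way to make such failures visible is to check the return value, since iconv() returns false when it cannot convert the input. The following is a minimal sketch, not a definitive implementation: the helper name convert_encoding_or_null() is hypothetical, the file is assumed to be saved as UTF-8, and the available encoding names depend on the iconv implementation PHP was built against.

```php
<?php
// Hypothetical helper: returns the converted string, or null on failure.
function convert_encoding_or_null(string $text, string $from, string $to): ?string
{
    // iconv() returns false (and raises a notice) when the conversion fails;
    // "@" suppresses the notice so the caller can handle the failure itself.
    $result = @iconv($from, $to, $text);
    return $result === false ? null : $result;
}

$utf8 = "中文";

// UTF-8 to GBK, then back again to confirm nothing was lost.
$gbk = convert_encoding_or_null($utf8, "UTF-8", "GBK");
$back = $gbk === null ? null : convert_encoding_or_null($gbk, "GBK", "UTF-8");

// Printing the bytes as hex avoids garbled terminal output.
echo ($gbk === null ? "conversion failed" : bin2hex($gbk)), PHP_EOL; // d6d0cec4
var_dump($back === $utf8); // bool(true)
```

iconv() also accepts the suffixes //TRANSLIT and //IGNORE on the output encoding name to approximate or drop characters that cannot be represented, but their exact behavior varies between iconv implementations.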
The mb_convert_encoding() function can also convert between different encodings. Compared with the iconv() function, it is more flexible; for example, $from_encoding may be a list of candidate encodings instead of a single name. The syntax is:
mb_convert_encoding($string, $to_encoding, $from_encoding);
where $string represents the string to be converted, $to_encoding represents the converted encoding format, and $from_encoding represents the encoding format of the original string.
For example, convert a utf-8 encoded string to a gbk encoded string:
$str = "中文"; $str = mb_convert_encoding($str, "gbk", "utf-8"); echo $str; // Output looks garbled unless it is viewed in a GBK environment
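Because the converted string looks garbled in a UTF-8 terminal, it can be easier to inspect the bytes directly. The sketch below assumes the mbstring extension is enabled and the file is saved as UTF-8; the variable names are illustrative:

```php
<?php
$utf8 = "中文";

// Convert UTF-8 to GBK: each of these characters shrinks from 3 bytes to 2.
$gbk = mb_convert_encoding($utf8, "GBK", "UTF-8");

echo strlen($utf8), PHP_EOL; // 6
echo strlen($gbk), PHP_EOL;  // 4
echo bin2hex($gbk), PHP_EOL; // d6d0cec4

// Converting back restores the original UTF-8 string.
$back = mb_convert_encoding($gbk, "UTF-8", "GBK");
var_dump($back === $utf8); // bool(true)
```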
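The utf8_encode() function converts a string from ISO-8859-1 to UTF-8. It treats every input byte as one ISO-8859-1 character, so a string that is already UTF-8 gets encoded a second time.
Example:
$str = "中文"; $str = utf8_encode($str); echo $str; // Output: ä¸æ–‡ (garbled)
Note: because the utf8_encode() function assumes ISO-8859-1 input, garbled characters appear when it is applied to Chinese text, so use it with caution. The function is deprecated as of PHP 8.2.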
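The garbling can be reproduced without calling the deprecated function. The sketch below is an assumption-labeled stand-in: it uses mb_convert_encoding() with ISO-8859-1 as the source encoding, which performs the same byte-by-byte conversion that utf8_encode() does, and it assumes the mbstring extension is enabled and the file is saved as UTF-8.

```php
<?php
$utf8 = "中文";

// Treating the UTF-8 bytes as ISO-8859-1 and converting to UTF-8 again
// encodes every byte a second time ("double encoding").
$double = mb_convert_encoding($utf8, "UTF-8", "ISO-8859-1");

echo strlen($utf8), PHP_EOL;    // 6
echo strlen($double), PHP_EOL;  // 12
echo bin2hex($double), PHP_EOL; // c3a4c2b8c2adc3a6c296c287

// Reversing the step recovers the original bytes.
$repaired = mb_convert_encoding($double, "ISO-8859-1", "UTF-8");
var_dump($repaired === $utf8); // bool(true)
```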
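The ord() function returns the value of the first byte of a string, and the chr() function returns the one-byte string for a given value. Both work on bytes, not on characters.
Example:
$ord1 = ord("中"); // Value of the first byte of the UTF-8 encoding of "中"
$ord2 = ord(substr("中", 1)); // Value of the second byte of the UTF-8 encoding of "中"
$str = chr(0xe4) . chr(0xb8) . chr(0xad); // Build the UTF-8 encoded string byte by byte with chr()
echo $str; // Output: 中
Note: When using the chr() and ord() functions, carefully consider how the byte sequences differ between character sets.
3. Chinese length judgment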
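The strlen() function returns the length of a string in bytes, so each of these Chinese characters counts as three bytes in UTF-8.
Example:
$str = "中文"; echo strlen($str); // Output: 6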
The mb_strlen() function returns the length of a string in characters.
Example:
$str = "中文"; echo mb_strlen($str); // Output: 2
Note: When using the mb_strlen() function, you must specify the correct character set for the Chinese text. If you don't know the character set, you can use the mb_detect_encoding() function to detect it.
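The following sketch puts the byte length and character length side by side and passes the encoding explicitly instead of relying on the default. It assumes the mbstring extension is enabled and the file is saved as UTF-8; the variable names are illustrative. Note that mb_detect_encoding() is a heuristic, so the candidate list is kept short here and mb_check_encoding() is used to validate a specific guess.

```php
<?php
$utf8 = "中文";
$gbk = mb_convert_encoding($utf8, "GBK", "UTF-8");

// Byte length versus character length, with the encoding given explicitly.
$utf8Bytes = strlen($utf8);             // 6
$utf8Chars = mb_strlen($utf8, "UTF-8"); // 2
$gbkBytes = strlen($gbk);               // 4
$gbkChars = mb_strlen($gbk, "GBK");     // 2

echo "$utf8Bytes $utf8Chars $gbkBytes $gbkChars", PHP_EOL; // 6 2 4 2

// Validate a guess: the GBK bytes are not well-formed UTF-8.
$utf8IsValid = mb_check_encoding($utf8, "UTF-8"); // true
$gbkIsUtf8 = mb_check_encoding($gbk, "UTF-8");    // false

// Strict detection against a short candidate list.
$detected = mb_detect_encoding($utf8, ["ASCII", "UTF-8"], true);
echo $detected, PHP_EOL; // UTF-8
```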