
Detailed explanation of how to use the mb_detect_encoding function in PHP

WBOY
2016-05-16 20:08:19

You can use the mb_detect_encoding() function in PHP to determine the encoding of a string.

When using mb_detect_encoding to identify an encoding, many people run into misdetection, for example between GB2312 and UTF-8, or between UTF-8 and GBK (here mainly CP936). The usual explanation found online is that mb_detect_encoding misjudges strings that are too short.

The code is as follows:

$encode = mb_detect_encoding($keytitle, array("ASCII","UTF-8","GB2312","GBK","BIG5"));
if($encode == "UTF-8"){
  $keytitle = iconv("UTF-8","GBK",$keytitle);
}

The purpose of this code is to detect whether the string is encoded as UTF-8 and, if so, convert it to GBK.
However, when $keytitle = "оƬ", the detection result is UTF-8. The underlying bytes are D0 BE C6 AC, which are the GBK bytes of "芯片" and also happen to form a valid UTF-8 sequence. This is not really a bug, but it shows that you should not rely too heavily on mb_detect_encoding: when the string is short, the result can easily be wrong.
The solution is to change the order of the candidate list:

$encode = mb_detect_encoding($keytitle, array("ASCII","GB2312","GBK","UTF-8"));

mb_detect_encoding takes three parameters: the input string to check, the list of candidate encodings in detection order (in the PHP versions this article was written for, once a candidate matches, the ones after it are ignored), and the strict-mode flag. Reordering the candidate list so that the most likely encoding comes first reduces the chance of a wrong conversion.
Generally, GB2312 should come first. When both GBK and UTF-8 are candidates, put the one you expect more often first.
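To see why the order of the candidate list matters, it can help to list every encoding under which the bytes are well-formed. The following is a minimal sketch, not the article's method: valid_encodings() is a hypothetical helper, and it assumes the mbstring extension is loaded and uses CP936 as mbstring's name for GBK.

```php
<?php
// Hypothetical helper: return every candidate encoding under which
// $bytes is a well-formed byte sequence.
function valid_encodings(string $bytes, array $candidates): array
{
    $valid = [];
    foreach ($candidates as $encoding) {
        if (mb_check_encoding($bytes, $encoding)) {
            $valid[] = $encoding;
        }
    }
    return $valid;
}

// The GBK bytes of "芯片". Read as UTF-8, the same bytes decode to "оƬ".
$keytitle = "\xD0\xBE\xC6\xAC";

// More than one entry in the result means validity alone cannot decide
// the encoding, so the caller's preferred order has to break the tie.
print_r(valid_encodings($keytitle, ['ASCII', 'UTF-8', 'CP936']));
```

When several encodings are reported as valid, no detector can pick the right one from the bytes alone; the preferred order is effectively a statement about where the data came from.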

Conversion and judgment of PHP string encoding


Converting between GBK and UTF-8 is a tedious chore. For example, PHP's json_encode does not support GBK at all, because it expects UTF-8 input. Two library functions support encoding conversion. The one that usually comes to mind is iconv, which is convenient to use:

iconv('GBK', 'UTF-8//IGNORE', 'Test string'); // Convert the string from GBK encoding to UTF-8 encoding
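The json_encode limitation mentioned above can be illustrated with a short sketch. gbk_to_json() and the 'title' key are hypothetical names chosen for this example, and the code assumes the installed iconv implementation knows the name GBK.

```php
<?php
// Hypothetical helper (not from the article): wrap a GBK string in a JSON object.
function gbk_to_json(string $gbk): string
{
    // json_encode() expects UTF-8, so convert before encoding.
    $utf8 = iconv('GBK', 'UTF-8', $gbk);
    if ($utf8 === false) {
        throw new InvalidArgumentException('Input is not valid GBK');
    }
    return json_encode(['title' => $utf8], JSON_UNESCAPED_UNICODE);
}

$gbk = "\xD6\xD0\xCE\xC4"; // "中文" encoded as GBK

// Passing the raw GBK bytes fails because they are malformed UTF-8.
var_dump(json_encode(['title' => $gbk]));
var_dump(json_last_error() === JSON_ERROR_UTF8);

// Converting first gives json_encode() the UTF-8 input it needs.
echo gbk_to_json($gbk), "\n";
```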

But iconv only helps when the source encoding is known in advance. If the encoding of the string is unknown, you have to detect it first. For that, the mbstring extension can be used:

mb_detect_encoding('test string');

However, mb_detect_encoding has a weakness: it often guesses wrong. The following approach may help:


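// Convert with iconv and check whether the result equals the input; not very efficient
function is_utf8 ($str) {
  if ($str === iconv('UTF-8', 'UTF-8//IGNORE', $str)) {
    return 'UTF-8';
  }
  // Falls through and returns null when the string is not valid UTF-8
}
// Handling several candidate encodings: the first one that round-trips wins
function detect_encoding ($str) {
  foreach (array('GBK', 'UTF-8') as $v) {
    if ($str === iconv($v, $v . '//IGNORE', $str)) {
      return $v;
    }
  }
  // Falls through and returns null when no candidate matches
}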

After obtaining the string encoding information through the above method, you can use iconv or mb_convert_encoding to convert the encoding.
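Detection and conversion can be combined into one step. The sketch below is an assumption-laden variant rather than the article's code: to_utf8() is a hypothetical helper, it swaps the iconv round-trip for mb_check_encoding, and it relies on the caller to supply the candidate list in order of preference.

```php
<?php
// Hypothetical helper: convert $str to UTF-8 using the first candidate
// encoding under which it is well-formed. Returns null if none matches.
function to_utf8(string $str, array $candidates): ?string
{
    foreach ($candidates as $encoding) {
        if (mb_check_encoding($str, $encoding)) {
            return mb_convert_encoding($str, 'UTF-8', $encoding);
        }
    }
    return null;
}

// "中文" encoded as GBK; these bytes are not valid UTF-8, so CP936 is chosen.
$gbk = "\xD6\xD0\xCE\xC4";
echo to_utf8($gbk, ['UTF-8', 'CP936']), "\n";
```

Putting UTF-8 first works here because most GBK text is not valid UTF-8, but short strings such as the "芯片" example can still be ambiguous, so the order should reflect where the data comes from.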

Resolving the "Call to undefined function mb_detect_encoding()" error


On Windows:
1. Fatal error: Call to undefined function: iconv() in C:\Program Files\AppServ\www\...\xxx.php on line 82
PHP provides the iconv() function for character encoding conversion. For it to be available, the php.ini file must contain this line:
extension=php_iconv.dll
If there is a semicolon in front of extension=php_iconv.dll in php.ini, the line is commented out. Remove the semicolon and restart the server.
Run the program again and the problem should be solved.

2. Fatal error: Call to undefined function: mb_detect_encoding() in C:\Program Files\AppServ\www\...\xxx.php on line 1355
  1. Find the PHP extension directory (on my machine: C:\Program Files\AppServ\php\extensions) and locate the php_mbstring.dll file in it.
  2. Copy php_mbstring.dll to the directory where php.ini is located (on my machine: C:\WINDOWS).
  3. Open php.ini with Notepad and press Ctrl+F to search for extension=php_mbstring.dll.
  4. If extension=php_mbstring.dll exists in php.ini, remove the semicolon in front of it. If it does not exist, add extension=php_mbstring.dll on a new line after the other extension=... lines. Remember to save php.ini.
  5. Restart the Apache server.

I just discovered that it doesn’t work without copying the php_mbstring.dll file to the directory where php.ini is located

Linux system:

When the following problem occurs:

PHP 1. {main}() /home/xu/web/whois/ab.cn.php:0
PHP 2. base_func->is_exist() /home/xu/web/whois/ab.cn.php:21
PHP 3. strftime() /home/xu/web/whois/whois.mysql.php:46
PHP Fatal error: Call to undefined function mb_detect_encoding() in /home/xu/web/whois/whois.main.php on line 98
After searching online, I found that the PHP extension php-mbstring was not installed (in other cases it may be installed but not enabled in php.ini). These are my notes on the fix.

First, install mbstring.so with yum or apt: yum install php-mbstring or apt-get install php-mbstring. Ubuntu users should run apt-cache search mbstring before installing, because the package name may differ.

Then edit php.ini: run vim /etc/php.ini and add the line extension="/usr/lib/php/modules/mbstring.so". The path may differ on your system; adjust it to the directory where mbstring.so is stored. Usually no change is needed.
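After editing php.ini and restarting the web server, a short script can confirm whether the extensions are actually loaded and which php.ini PHP is reading. This is a sketch only: extension_status() is a hypothetical helper name, and the functions it calls (extension_loaded, function_exists, php_ini_loaded_file) are standard PHP built-ins.

```php
<?php
// Hypothetical helper: report whether an extension and one of its
// functions are available, and which php.ini file PHP loaded.
function extension_status(string $extension, string $function): array
{
    return [
        'extension_loaded' => extension_loaded($extension),
        'function_exists'  => function_exists($function),
        'ini_file'         => php_ini_loaded_file(), // false if no php.ini was loaded
    ];
}

var_dump(extension_status('mbstring', 'mb_detect_encoding'));
var_dump(extension_status('iconv', 'iconv'));
```

Run it through the same SAPI that shows the error (for example, request it through Apache rather than the command line), since the CLI and the web server may load different php.ini files.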
