A collection of solutions to Chinese garbled characters in PHP
1. The encoding of the PHP web page
1. The encoding of the PHP file itself should match the encoding declared by the web page
a. If you want to use gb2312 encoding, PHP should output the header header("Content-Type: text/html; charset=gb2312"), and the page should include <meta http-equiv="Content-Type" content="text/html; charset=gb2312">. All files should be saved as ANSI (on Simplified Chinese Windows, ANSI means the GBK code page): open each file in Notepad, choose Save As, select ANSI as the encoding, and overwrite the source file.
b. If you want to use utf-8 encoding, PHP should output the header header("Content-Type: text/html; charset=utf-8"), and the page should include <meta http-equiv="Content-Type" content="text/html; charset=utf-8">. All files should be saved as utf-8. Saving as utf-8 can be a little troublesome: utf-8 files generally have a BOM at the beginning, and if you use sessions this causes problems, because the BOM is sent as output before the session headers can be sent. You can save with EditPlus: go to Tools -> Preferences -> Files -> UTF-8 signature, select Always remove, then save the file to strip the BOM.
2. PHP's string functions are not Unicode-aware: functions such as substr work on bytes, so for multibyte text use mb_substr instead (the mbstring extension must be installed), or use iconv to transcode.
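To make the difference between byte-oriented and character-oriented functions concrete, here is a minimal sketch. It assumes the mbstring extension is loaded and that the script itself is saved as UTF-8; the sample string is only an illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is available.
$text = '中文字符串';

// substr() counts bytes. Each of these characters takes 3 bytes in UTF-8,
// so cutting after 4 bytes splits the second character in half.
$byteCut = substr($text, 0, 4);

// mb_substr() counts characters when the encoding is given explicitly.
$charCut = mb_substr($text, 0, 2, 'UTF-8');

echo strlen($text), "\n";              // length in bytes
echo mb_strlen($text, 'UTF-8'), "\n";  // length in characters
echo $charCut, "\n";
```

The byte-based cut leaves an incomplete character at the end, which is what browsers render as garbage; the character-based cut does not.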
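2. Data interaction between PHP and MySQL
The encoding used by PHP and the database should be consistent
1. Modify the MySQL configuration file my.ini or my.cnf. It is best to use utf8 encoding for MySQL. (The [mysqld] options below apply to older MySQL versions; from MySQL 5.5 on they are named character-set-server and collation-server.)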
[mysql]
default-character-set=utf8
[mysqld]
default-character-set=utf8
default-storage-engine=MyISAM
Add under [mysqld]:
default-collation=utf8_bin
init_connect='SET NAMES utf8'
2. Before any PHP code that performs database operations, add mysql_query("set names 'encoding'");, where the encoding matches the PHP encoding: if the PHP encoding is gb2312, the MySQL encoding is gb2312; if it is UTF-8, the MySQL encoding is utf8. Data can then be inserted and retrieved without garbled characters. (The mysql_* functions were removed in PHP 7; with mysqli the equivalent call is mysqli_set_charset().)
3. PHP and the operating system
Windows and Linux use different encodings for file names. On Windows, if a utf-8 encoded parameter is passed to PHP file functions such as move_uploaded_file(), filesize(), or readfile(), errors occur. These functions are often used when handling uploads and downloads. The following errors may appear when they are called:
Warning: move_uploaded_file()[function.move-uploaded-file]: failed to open stream: Invalid argument in ...
Warning: move_uploaded_file()[function.move-uploaded-file]:Unable to move '' to '' in ...
Warning: filesize() [function.filesize]: stat failed for ... in ...
Warning: readfile() [function.readfile]: failed to open stream: Invalid argument in ..
On Linux these errors do not occur when gb2312 encoding is used, but the saved file name will be garbled and the file cannot be read. In this case you can first convert the parameter into an encoding the operating system recognizes. For the conversion you can use mb_convert_encoding(string, new encoding, original encoding) or iconv(original encoding, new encoding, string). The file name saved after this processing is not garbled and the file can be read normally, so files with Chinese names can be uploaded and downloaded.
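The conversion described above can be sketched as a pair of small helpers. The function names are hypothetical, and the default of GBK is an assumption that fits a Simplified Chinese Windows system; the file-system encoding on your server may differ.

```php
<?php
// Hypothetical helper: convert a UTF-8 file name into the encoding the
// file system expects before passing it to move_uploaded_file() etc.
// 'GBK' is an assumed default, not something PHP detects for you.
function toFilesystemName(string $utf8Name, string $fsEncoding = 'GBK'): string
{
    $converted = iconv('UTF-8', $fsEncoding, $utf8Name);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert file name to $fsEncoding");
    }
    return $converted;
}

// Hypothetical helper: convert a name read from the file system back to UTF-8.
function fromFilesystemName(string $fsName, string $fsEncoding = 'GBK'): string
{
    $converted = iconv($fsEncoding, 'UTF-8', $fsName);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert file name from $fsEncoding");
    }
    return $converted;
}
```

A call such as move_uploaded_file($tmpPath, $dir . '/' . toFilesystemName($name)) would then receive a name in the encoding assumed for the file system.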
In fact there is a better solution that is independent of the operating system, so its encoding no longer matters. Generate a name consisting only of letters and digits for the stored file, and save the original name containing Chinese characters in the database. Calling move_uploaded_file() then causes no problems. When downloading, you only need to send the original Chinese name as the file name. The code to implement the download is as follows:
header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Content-Type: $file_type");
header("Content-Length: $file_size");
header("Content-Disposition: attachment; filename=\"$file_name\"");
header("Content-Transfer-Encoding: binary");
readfile($file_path);
$file_type is the MIME type of the file, $file_size is its size in bytes, $file_name is the original name, and $file_path is the path of the file saved on the server.
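The approach of storing files under a letters-and-digits name while keeping the original name for download can be sketched as follows. Both function names are hypothetical, and the use of the filename* parameter (the RFC 5987 style of encoding non-ASCII names) is an addition not found in the original code; treat it as one possible way to send a Chinese name, not the only one.

```php
<?php
// Hypothetical helper: build a storage name made only of hex digits,
// keeping a sanitized lowercase extension from the original name.
function makeStorageName(string $originalName): string
{
    $dot = strrpos($originalName, '.');
    $ext = $dot === false ? '' : strtolower(substr($originalName, $dot + 1));
    $ext = preg_replace('/[^a-z0-9]/', '', $ext);
    $base = bin2hex(random_bytes(16)); // 32 hex characters
    return $ext === '' ? $base : $base . '.' . $ext;
}

// Hypothetical helper: build a Content-Disposition value that carries an
// ASCII-only fallback name plus the UTF-8 original name in filename*.
function contentDisposition(string $originalName): string
{
    // Every byte outside the safe ASCII set becomes an underscore.
    $fallback = preg_replace('/[^A-Za-z0-9._-]/', '_', $originalName);
    return 'attachment; filename="' . $fallback . '"; filename*=UTF-8\'\''
        . rawurlencode($originalName);
}
```

The original name would be stored in the database next to the generated one, and the download script would send header('Content-Disposition: ' . contentDisposition($originalName)) before readfile().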
1. The encoding of PHP files and static web pages must be consistent
1. When using utf-8 encoding, add the following to the PHP file before any output:
header("Content-Type: text/html; charset=utf-8");
Add to static pages:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
All files should be saved as utf-8. Saving as utf-8 can be a little troublesome. Software such as the Notepad that ships with Windows inserts three invisible bytes (0xEF 0xBB 0xBF, the BOM, or Byte Order Mark) at the start of the file. This hidden sequence lets editors such as Notepad recognize that the file is encoded in UTF-8. For ordinary files it causes no trouble.
PHP, however, was not designed with the BOM in mind. It does not skip the three BOM bytes at the start of a UTF-8 encoded file and instead treats them as part of the text at the beginning of the file. Because everything before the opening <?php tag is sent to the browser as output, the BOM is output as well, which can break later header() and session calls. You can use EmEditor to save the file: choose Save As, uncheck the Unicode signature (BOM) option, and save again to remove the BOM.
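If an editor setting is not available, the BOM can also be detected and removed programmatically. This is a minimal sketch; the constant and function names are made up for illustration.

```php
<?php
// The three bytes that make up the UTF-8 byte order mark.
const UTF8_BOM = "\xEF\xBB\xBF";

// Return true when the given file contents start with a UTF-8 BOM.
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Return the contents without a leading BOM; other input is returned unchanged.
function stripUtf8Bom(string $bytes): string
{
    return hasUtf8Bom($bytes) ? (string) substr($bytes, 3) : $bytes;
}
```

A cleanup script could apply it as file_put_contents($path, stripUtf8Bom(file_get_contents($path))) for each source file, after taking a backup.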
2. When using gb2312 encoding, add the following to the PHP file before any output:
header("Content-Type: text/html; charset=gb2312");
Add to the page:
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
All files should be saved as ANSI.
2. The encoding of PHP and the database should be consistent
Taking MySQL as an example, add mysql_query("set names 'xx'"); before any PHP code that performs database operations. If the PHP encoding is gb2312, xx is gb2312. If it is utf-8, xx is utf8 (note: utf8, not utf-8). Data can then be read and written without garbled characters.
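Because the page charset name and the MySQL charset name are spelled differently (utf-8 versus utf8), a small lookup keeps the two in sync. The sketch below is illustrative: the function names and the list of supported charsets are assumptions, and since the mysql_* functions were removed in PHP 7, the resulting name would be passed to something like mysqli_set_charset() on current versions.

```php
<?php
// Hypothetical helper: translate the charset declared by the page into the
// name MySQL uses. Only the encodings discussed in this article are listed.
function mysqlCharsetName(string $pageCharset): string
{
    $map = [
        'utf-8'  => 'utf8',
        'utf8'   => 'utf8',
        'gb2312' => 'gb2312',
        'gbk'    => 'gbk',
    ];
    $key = strtolower(trim($pageCharset));
    if (!isset($map[$key])) {
        throw new InvalidArgumentException("Unsupported charset: $pageCharset");
    }
    return $map[$key];
}

// Build the SET NAMES statement that matches the page charset.
function setNamesStatement(string $pageCharset): string
{
    return "SET NAMES '" . mysqlCharsetName($pageCharset) . "'";
}
```

With a mysqli connection in $db, either $db->query(setNamesStatement('UTF-8')) or $db->set_charset(mysqlCharsetName('UTF-8')) would apply it.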
In addition, it is best to use utf8 encoding for MySQL by modifying the MySQL configuration file my.ini or my.cnf:
[mysql]
default-character-set=utf8
[mysqld]
default-character-set=utf8
default-storage-engine=MyISAM
Add under [mysqld]:
default-collation=utf8_bin
init_connect='SET NAMES utf8'
Chinese text output by echo is displayed as garbled characters. Every server-side scripting language runs into this problem, and it is basically an encoding problem.
Generally, for compatibility reasons, most pages define the page character set as utf-8. To display Chinese correctly in that case, convert the encoding, for example:
echo iconv("GB2312","UTF-8",'中文');
Assuming the source file itself is saved as GB2312, the output will not be garbled.
There are other methods, such as adding header("Content-Type:text/html;charset=gb2312"); before PHP's echo. For a Simplified Chinese page you can also simply change UTF-8 to gb2312 in the page's meta charset declaration.
A strange phenomenon shows up in practice: a page displays correctly on the local server, but once it is uploaded to the remote server the echo output is garbled. I have not investigated the cause carefully, because re-encoding between GB2312 and UTF-8 with iconv at the affected spot makes the output normal again. It is probably caused by Apache or, more precisely, by different PHP settings on the two servers. Checking php.ini (for example the default_charset directive) should resolve it.
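The iconv conversion mentioned above can be tried in isolation. In this sketch the GB2312 input is written as explicit bytes so the example behaves the same no matter how the script file itself is saved; that choice is an assumption made for the demonstration.

```php
<?php
// The two characters of "中文" as they are stored in a GB2312-encoded file.
$gb2312Bytes = "\xD6\xD0\xCE\xC4";

// Convert to UTF-8 so the bytes match a page declared as utf-8.
$utf8 = iconv('GB2312', 'UTF-8', $gb2312Bytes);

// Declare the page charset before any output, then print the text.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}
echo $utf8, "\n";
```

The same four bytes sent without conversion on a utf-8 page are what the browser would show as garbled characters.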
Let's summarize why garbled characters appear
Generally speaking, garbled characters have two causes. First, the encoding (charset) is set incorrectly, so the browser parses the page with the wrong encoding and the screen fills with unreadable gibberish. Second, the file was opened with the wrong encoding and then saved, for example a text file originally encoded in GB2312 that was opened as UTF-8 and saved. To solve these problems, you first need to know which parts of development involve encoding:
1. File encoding: the encoding in which the page file (.html, .php, etc.) itself is saved. Notepad and Dreamweaver automatically recognize the file encoding when opening a page, so they cause fewer problems. ZendStudio does not recognize the encoding automatically. It opens files in whatever encoding is configured in its preferences. If you accidentally open a file with the wrong encoding, modify it, and save it, garbled characters will appear (I have learned this the hard way).
2. Page declaration encoding: in the HEAD of the HTML code you can use <meta http-equiv="Content-Type" content="text/html; charset=XXX"> to tell the browser which encoding the page uses. In Chinese website development, XXX is currently mainly GB2312 or UTF-8.
3. Database connection encoding: the encoding used to transmit data to the database when performing database operations. Do not confuse it with the encoding of the database itself. For example, the default encoding of MySQL is latin1, which means MySQL stores data in latin1, and data sent to MySQL in other encodings is converted to latin1.
Once you know where encoding is involved in web development, you also know the cause of garbled characters: the three encoding settings above are inconsistent. Since most encodings are compatible with ASCII, English characters are not affected, while Chinese characters are out of luck.
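The point about ASCII compatibility can be checked directly: ASCII text has the same bytes in UTF-8 and GBK, while Chinese text does not. The helper name below is hypothetical, and GBK is used as the second encoding only as an example.

```php
<?php
// Hypothetical helper: report whether a UTF-8 string has different bytes
// once it is converted to another encoding (GBK by default).
function bytesDifferIn(string $utf8Text, string $otherEncoding = 'GBK'): bool
{
    return iconv('UTF-8', $otherEncoding, $utf8Text) !== $utf8Text;
}

var_dump(bytesDifferIn('hello, world')); // ASCII only
var_dump(bytesDifferIn('中文'));          // Chinese characters
```

This is why a page with mismatched encodings can look fine in its English parts and still show garbled Chinese.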
<?php
// Legacy code as posted: the mysql_* functions were removed in PHP 7.
$mysql_server_name='localhost';
$mysql_username='root';
$mysql_password='000000';
$mysql_database='lib';
// mysql_connect() takes no database name; its fourth argument is $new_link.
$conn=mysql_connect($mysql_server_name,$mysql_username,$mysql_password,$mysql_database);
$sql="select name,age from mytb";
print($conn);
$rs=mysql_db_query("lib","select * from mytb",$conn);
print("\n");
while($row = mysql_fetch_object($rs)){
    print($row->name.":".$row->age."\n");
}
mysql_close($conn);
?>
It is displayed as follows:
Resource id #1
dd:54
ddd:8
??:15
???:25
??:32
MySQL encoding: both utf8 and GBK have been tried. The data displays correctly in both MySQL-Front and the command line.
Question supplement:
Garbled code:
???:15
???:25
??:32
In these lines the values in the database are Chinese characters, but question marks are displayed instead.
Solution:
Before the line $rs=mysql_db_query("lib","select * from mytb",$conn); add:
mysql_query("set names gb2312"); or mysql_query("set names gbk");
5. Some common error situations and their solutions:
1. The database uses UTF8 encoding and the page declares GB2312. This is the most common cause of garbled characters. Data fetched with a plain SELECT in the PHP script will be garbled. Before querying, call mysql_query("SET NAMES GBK"); to set the MySQL connection encoding so that it matches the encoding declared by the page (GBK is an extension of GB2312). If the page is UTF-8 encoded, use mysql_query("SET NAMES UTF8");
Note that it is UTF8 rather than the commonly used UTF-8. If the encoding declared by the page matches the internal encoding of the database, you do not need to set the connection encoding.
Note: MySQL's data input and output are actually more complicated than described above. The MySQL configuration file my.ini defines two default encodings: the default-character-set in [client] and the default-character-set in [mysqld], which set the default encoding for client connections and for the database itself, respectively. The encoding specified above with SET NAMES actually sets the character_set_client variable (along with character_set_connection and character_set_results) for the current connection. It tells the MySQL server which encoding the data it receives from the client is in, instead of using the default.
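To show what SET NAMES stands for, the sketch below expands it into the three session-variable assignments that the MySQL documentation describes as equivalent. The function name is hypothetical, and the code only builds the statements; it does not connect to a server.

```php
<?php
// Hypothetical helper: list the three statements that SET NAMES <charset>
// is equivalent to. The charset is validated so it is safe to interpolate.
function expandSetNames(string $charset): array
{
    if (preg_match('/^[A-Za-z0-9_]+$/', $charset) !== 1) {
        throw new InvalidArgumentException("Invalid charset name: $charset");
    }
    return [
        "SET character_set_client = $charset",     // encoding of data sent by the client
        "SET character_set_results = $charset",    // encoding of data returned to the client
        "SET character_set_connection = $charset", // encoding the server converts statements to
    ];
}

print_r(expandSetNames('gbk'));
```

Seeing the three variables separately helps when diagnosing a case where data is stored correctly but comes back garbled, since only character_set_results affects what the client receives.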
2. The encoding declared by the page is inconsistent with the encoding of the file itself. This rarely happens at creation time, because the designer would see garbled characters in the browser while building the page. More often it is introduced after release, when someone fixes a minor bug, opens the page in the wrong encoding, and saves it. It can also happen when files are edited directly online with FTP software such as CuteFTP whose encoding is configured incorrectly, so the file is converted to the wrong encoding.
3. Some users who rent virtual hosts still get garbled characters even though all three encodings above are set correctly. For example, a web page encoded in GB2312 is always recognized as UTF-8 by browsers such as IE, even though the HEAD of the page declares GB2312. After manually changing the browser encoding to GB2312, the page displays normally. The reason is that Apache has a global default encoding for the server, set with AddDefaultCharset UTF-8 in httpd.conf. The server then sends the charset in the HTTP header, which takes priority over the encoding declared in the page, so the browser picks the wrong one. There are two solutions: ask the administrator to add AddDefaultCharset GB2312 to the configuration of your virtual host to override the global setting, or set it yourself in a .htaccess file in your own directory.