Detailed explanation of PHP character encoding problem

WBOY
2016-07-25 08:53:40
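[Listing for the test page: the markup did not survive extraction. Only the title text ("Page title") and the body text ("Hello!", Chinese in the original) remain. Judging from the discussion below, the page also carried a meta tag declaring charset=gb2312.]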

Open this page on the website with Internet Explorer. The page displays normally, and the "View"/"Encoding" menu (with "Auto-Select" checked) shows the character encoding as gb2312. [Displays normally under Firefox 2.0.]

1.2 Next, in UltraEdit's "File" menu, choose "Save As", select "utf-8" as the format, and save the file as test2.php. Open this page in Internet Explorer. The page still displays normally (although the English font changes slightly). The "View"/"Encoding" menu (with "Auto-Select" checked) now shows UTF-8: it changed automatically. Note that the meta tag statement was not modified, yet the browser recognized the file's real encoding. IE appears to be quite smart here, and this also shows that IE's automatic detection takes precedence over the charset=xxx declaration in the meta tag. [Garbled characters are displayed under Firefox 2.0.]

1.3 Add a statement at the beginning of the page:

  <?php
  header("content-type:text/html;charset=utf-8");
  ?>
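A note on why the statement has to come first: PHP's header() only takes effect before any output has been sent (unless output buffering is enabled), and anything in the file ahead of the opening <?php tag, even whitespace, counts as output. Below is a minimal sketch of a guard around the call. It assumes PHP 7 or later rather than the PHP 4.4.7 used in these tests, and the helper name send_charset is hypothetical.

```php
<?php
// Hypothetical helper: send the charset header only while that is still possible.
function send_charset(string $charset): bool
{
    // headers_sent() fills $file and $line with the place where output began.
    if (headers_sent($file, $line)) {
        error_log("Cannot set charset: output started at $file:$line");
        return false;
    }
    header('Content-Type: text/html; charset=' . $charset);
    return true;
}

$sentBeforeOutput = send_charset('utf-8'); // called first, before any output
echo "Hello!\n";
$sentAfterOutput = send_charset('utf-8');  // called after output has started
```

When the second call is rejected, the logged file and line help locate the stray output.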

Save the page file again, this time selecting "Default" in the "Format" drop-down box, with the file name test3.php. Open the file on the website with IE: apart from the English letters, the Chinese characters are now garbled. The "View"/"Encoding" menu (with "Auto-Select" checked) shows UTF-8, so the encoding has been forcibly changed. The Chinese characters are garbled because the file's bytes are gb2312 but the browser has been forced to decode them as utf-8. If gb2312 is then selected manually in the browser, the Chinese characters display normally again. (This cannot be relied on for a real page: viewers would have to choose the encoding themselves, many would not know how to change it or which encoding to pick, and it would also reflect poorly on us as developers.) [Garbled characters are displayed under Firefox 2.0.]

1.4 Add a statement at the beginning of the page:

  <?php
  header("content-type:text/html;charset=gb2312");
  ?>
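Tests 1.3 and 1.4 both pair bytes in one encoding with a declaration naming another. As a rough illustration of what that mismatch means at the byte level, the sketch below writes "你好" (an assumed stand-in for the page's Chinese text) as raw bytes in each encoding and checks which byte string is valid UTF-8. The helper is_valid_utf8 is hypothetical; it relies on PCRE's u modifier, which makes preg_match fail on subjects that are not valid UTF-8.

```php
<?php
// "你好" written out as raw bytes in each encoding.
$utf8Bytes   = "\xE4\xBD\xA0\xE5\xA5\xBD"; // UTF-8: three bytes per character
$gb2312Bytes = "\xC4\xE3\xBA\xC3";         // gb2312: two bytes per character

// Hypothetical helper: with the u modifier, preg_match returns false
// when the subject is not valid UTF-8, and 1 for an empty-pattern match otherwise.
function is_valid_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

foreach (['utf-8' => $utf8Bytes, 'gb2312' => $gb2312Bytes] as $label => $bytes) {
    echo $label, ' bytes ', bin2hex($bytes), ' valid as UTF-8: ',
        var_export(is_valid_utf8($bytes), true), "\n";
}
```

The gb2312 bytes are not valid UTF-8, so a browser told to decode them as UTF-8 has nothing sensible to display.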

Save the page file again, this time selecting "utf-8" in the "Format" drop-down box, with the file name test4.php. Open the file on the website with IE. Surprisingly, the Chinese characters display normally rather than garbled as expected. The "View"/"Encoding" menu (with "Auto-Select" checked) still shows UTF-8; it has not been forced to gb2312. When I tried to select gb2312 manually in the browser, IE would not allow the encoding to be changed. IE seems to give the utf-8 character set special treatment: whatever is specified in the meta tag or the PHP statement, IE does not garble the Chinese characters. [Garbled characters are displayed under Firefox 2.0.]

Summary: The tests above were run mainly under IE 7.0, with IIS 6.0 on Windows Server 2003 as the web server and PHP 4.4.7. IE 7.0 evidently does a lot of extra automatic processing in order to identify the character set correctly and to appear smart and friendly. Sometimes that diligence only adds to the confusion.

Because garbled Chinese characters depend on the browser and its version, the web server, the back-end script, and the character sets involved, the problem is particularly complicated. As a web programmer, it is enough to focus on the factors you control; there is no need to become an expert in character set encoding. To stay compatible with the currently popular IE and Firefox browsers, PHP code can follow these simple rules:

1. The actual character set of the page must match the one specified by the meta tag.
2. The header("content-type:text/html;charset=xxx"); statement can also be used to specify the character set, but it must not conflict with the actual character set of the page or with the meta tag. (Test results show that when header() conflicts with meta, the character set given by header() takes precedence: tracing with HttpWatch Basic shows that once header() specifies a character set, IE receives an explicit character set declaration. However, there is no guarantee that other, less common browsers behave the same way.)
3. The page character set must not conflict with the character set of data retrieved from the database; otherwise some or all of the Chinese characters on the page, including those retrieved from the database, will be garbled.
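
The three rules above can be sketched as follows. This is only an illustration under stated assumptions: the file is saved and served as UTF-8, the database returns gb2312 bytes, the iconv extension is available, PHP 7 or later is used, and the names PAGE_CHARSET, DB_CHARSET, and to_page_charset are hypothetical.

```php
<?php
// Assumption: this file is saved as UTF-8, so every declaration names UTF-8.
const PAGE_CHARSET = 'UTF-8';
// Assumption: strings coming back from the database are gb2312 bytes.
const DB_CHARSET = 'GB2312';

// Rule 3: convert database strings to the page charset before printing them.
function to_page_charset(string $value): string
{
    if (DB_CHARSET === PAGE_CHARSET) {
        return $value;
    }
    $converted = iconv(DB_CHARSET, PAGE_CHARSET, $value);
    if ($converted === false) {
        throw new RuntimeException('Value is not valid ' . DB_CHARSET);
    }
    return $converted;
}

// Rule 2: the HTTP header names the same charset as the file's real encoding.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=' . PAGE_CHARSET);
}

// Stand-in for a database row: "你好" as gb2312 bytes.
$row = ['greeting' => "\xC4\xE3\xBA\xC3"];
$greeting = to_page_charset($row['greeting']);

// Rule 1: the meta tag is built from the same constant as the header.
echo '<html><head><meta http-equiv="Content-Type" content="text/html; charset=',
    PAGE_CHARSET, '"><title>Page title</title></head>', "\n";
echo '<body>', htmlspecialchars($greeting, ENT_QUOTES, PAGE_CHARSET), "</body></html>\n";
```

Keeping the charset in a single constant means the header and the meta tag cannot drift apart, and the conversion step isolates the one place where a second character set enters the page.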


