Blank lines in web pages developed with UTF-8 encoding (PHP tutorial)

WBOY
2016-07-13 17:37:36

A problem that went unsolved for a long time during development:
The pages are encoded in UTF-8, and the header and footer are pulled in as template include files. The result is an unexplained blank line about 10px high at the top and at the bottom of the page, with nothing in it.
The cause is that every file is saved as UTF-8 with a BOM. When the files are included, the final byte stream contains several UTF-8 BOM marks. IE cannot handle a page containing multiple UTF-8 BOMs properly and renders them as visible line breaks, which produces the blank lines. Firefox does not have this problem.
Therefore, if a template includes multiple UTF-8 files, each of them must be saved without a BOM. In UltraEdit, use the Save As function and choose the UTF-8 without BOM format.
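For a project with many include files, re-saving each one by hand is tedious. The following is a minimal sketch, in Python rather than PHP, of a script that removes a leading UTF-8 BOM from template files. The function names and the file patterns (*.php, *.html, *.tpl) are assumptions chosen for illustration, not something the original setup prescribes.

```python
import codecs
import tempfile
from pathlib import Path
from typing import List


def strip_bom_from_file(path: Path) -> bool:
    """Rewrite `path` without its leading UTF-8 BOM. Return True if one was removed."""
    data = path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True


def strip_bom_in_tree(root: Path, patterns=("*.php", "*.html", "*.tpl")) -> List[Path]:
    """Strip the BOM from every matching file under `root`; return the files changed."""
    changed = []
    for pattern in patterns:  # hypothetical patterns, adjust to the project
        for path in sorted(root.rglob(pattern)):
            if path.is_file() and strip_bom_from_file(path):
                changed.append(path)
    return changed


if __name__ == "__main__":
    # Demo on throwaway files so nothing real is modified.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "header.php").write_bytes(codecs.BOM_UTF8 + b"<div>header</div>")
        (root / "footer.php").write_bytes(b"<div>footer</div>")
        for path in strip_bom_in_tree(root):
            print("removed BOM from", path.name)
```

Only the first three bytes of a file are examined, so text that legitimately contains U+FEFF elsewhere is left untouched. Run it on a copy or under version control first, since it rewrites files in place.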
In addition, if a Chinese page puts the title tag first inside the html head tag, the page renders blank.
So UTF-8 pages should use the standard order, with the charset declaration ahead of the title.
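The ordering rule can be checked mechanically. The sketch below, in Python, assumes that "standard order" means a meta tag declaring the charset appears before the title tag. The sample markup is written for illustration and is not recovered from the original page, and charset_precedes_title is a hypothetical helper name.

```python
import re

# A <meta ...> tag that mentions "charset", in either the http-equiv or the short form.
CHARSET_RE = re.compile(rb"<meta[^>]*charset", re.IGNORECASE)
TITLE_RE = re.compile(rb"<title\b", re.IGNORECASE)


def charset_precedes_title(html: bytes) -> bool:
    """Return True if a charset <meta> comes before <title>, or if there is no <title>."""
    title = TITLE_RE.search(html)
    if title is None:
        return True
    charset = CHARSET_RE.search(html)
    return charset is not None and charset.start() < title.start()


# Illustrative markup only.
good = (
    '<html><head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    '<title>中文页面</title>'
    '</head><body></body></html>'
).encode("utf-8")

bad = (
    '<html><head>'
    '<title>中文页面</title>'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    '</head><body></body></html>'
).encode("utf-8")

print(charset_precedes_title(good))  # True
print(charset_precedes_title(bad))   # False
```

The check uses regular expressions rather than a real HTML parser, so it is only a rough lint for template files.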
BOM header: \xEF\xBB\xBF. PHP 4 and PHP 5 do not strip the BOM, so these bytes are sent to the output as-is before the script is parsed.
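Because the BOM bytes pass straight through, every included file contributes its own copy to the page. The following Python sketch simulates that with three in-memory "files". The render helper is a stand-in for a chain of includes and an assumption made for illustration; it does not model anything else PHP does.

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def render(parts):
    """Concatenate file contents byte for byte, as a chain of includes would emit them."""
    return b"".join(parts)


def count_boms(stream: bytes) -> int:
    """Count how many UTF-8 BOM sequences appear anywhere in the stream."""
    return stream.count(BOM)


# Three files, each saved as UTF-8 with a BOM.
header = BOM + b"<div id='header'></div>"
body = BOM + b"<p>content</p>"
footer = BOM + b"<div id='footer'></div>"

page = render([header, body, footer])
print(count_boms(page))  # 3

# The same files saved without a BOM produce a clean stream.
clean = render([part[len(BOM):] for part in (header, body, footer)])
print(count_boms(clean))  # 0
```

Only the first BOM sits at the start of the stream. The other two land in the middle of the markup, which is where a browser may turn them into visible whitespace.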
There is a dedicated description of this issue in the w3.org standard FAQ:
http://www.w3.org/International/questions/qa-utf8-bom
The details are as follows:
UCS defines a character called "ZERO WIDTH NO-BREAK SPACE" whose code point is FEFF. FFFE is not a character in UCS, so it should never appear in actual data. The UCS specification recommends sending the character "ZERO WIDTH NO-BREAK SPACE" before the byte stream itself. If the receiver then sees FE FF, the byte stream is big-endian; if it sees FF FE, the byte stream is little-endian. For this reason the character "ZERO WIDTH NO-BREAK SPACE" is also called the BOM (byte order mark).
UTF-8 does not need a BOM to indicate byte order, but a BOM can be used to mark the encoding. The UTF-8 encoding of "ZERO WIDTH NO-BREAK SPACE" is EF BB BF, so a receiver that sees a byte stream starting with EF BB BF knows it is UTF-8 encoded.
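The byte sequences described above are easy to reproduce. This short Python sketch encodes U+FEFF in the three encodings and then uses the result to guess an encoding from the first bytes of a stream. sniff_bom is a hypothetical helper that covers only the cases discussed here.

```python
import codecs

ZWNBSP = "\ufeff"  # ZERO WIDTH NO-BREAK SPACE, used as the BOM

print(ZWNBSP.encode("utf-16-be").hex())  # feff
print(ZWNBSP.encode("utf-16-le").hex())  # fffe
print(ZWNBSP.encode("utf-8").hex())      # efbbbf


def sniff_bom(data: bytes):
    """Guess the encoding from a leading BOM; return None when there is none.

    UTF-32 is deliberately not handled, so a UTF-32 little-endian stream
    (which starts with FF FE 00 00) would be reported as utf-16-le.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    return None


print(sniff_bom(codecs.BOM_UTF8 + "中文".encode("utf-8")))  # utf-8
print(sniff_bom("中文".encode("utf-8")))                    # None
```

A missing BOM says nothing about the encoding, which is why the function returns None instead of assuming a default.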
Windows uses the BOM to mark the encoding of text files. The notes below apply to Windows XP Professional with Chinese as the default character set:
1) Notepad: detects UTF-8 files without a BOM automatically, but gives no control over whether a BOM is written on save. A BOM is always added when the file is saved.
2) EditPlus: cannot automatically recognize UTF-8 files without a BOM. When a file is saved in UTF-8 format, no BOM is written at the start of the file.
3) UltraEdit: the most capable of the three for character encodings. It can automatically detect UTF-8 files with or without a BOM (configurable), and a setting controls whether a BOM is added on save.
(Note that when saving a newly created file, you need to choose the UTF-8 no BOM format explicitly.)
I later found that Notepad++ also handles the UTF-8 BOM well, and I recommend it.
