Blank-line problem in PHP web pages developed with UTF-8 encoding

WBOY
2016-07-13 17:37:24

A problem that had long gone unsolved during development:
The page is encoded in UTF-8, and its header and footer are pulled in from template files with include. As a result, an extra blank line about 10px high appears at the top and bottom of the page.
The reason is that every file is saved as UTF-8 with a BOM. When the files are included, the final byte stream contains multiple UTF-8 BOMs. IE cannot parse a page that contains multiple UTF-8 BOMs correctly and renders each extra one as a visible line break, which produces the blank line. Firefox does not have this problem.
Therefore, if a template uses include to pull in multiple UTF-8 files, save each of them without a BOM. In UltraEdit, for example, select UTF-8 and save in the no-BOM format.
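
If re-saving every template by hand is impractical, the BOM can also be removed programmatically. The following is a minimal sketch, assuming PHP 7 or later; the names UTF8_BOM, strip_utf8_bom and remove_bom_from_file are hypothetical helpers introduced here for illustration, not something the original article provides.

```php
<?php
// Hypothetical helpers for illustration only.

// The three bytes a UTF-8 BOM occupies at the start of a file.
const UTF8_BOM = "\xEF\xBB\xBF";

/** Return $text without a leading UTF-8 BOM, if one is present. */
function strip_utf8_bom(string $text): string
{
    if (strncmp($text, UTF8_BOM, 3) === 0) {
        return substr($text, 3);
    }
    return $text;
}

/** Rewrite the file at $path without its BOM; return true if it changed. */
function remove_bom_from_file(string $path): bool
{
    $original = file_get_contents($path);
    if ($original === false) {
        throw new RuntimeException("Cannot read $path");
    }

    $cleaned = strip_utf8_bom($original);
    if ($cleaned === $original) {
        return false; // no BOM, leave the file untouched
    }

    if (file_put_contents($path, $cleaned) === false) {
        throw new RuntimeException("Cannot write $path");
    }
    return true;
}
```

Calling remove_bom_from_file() once on each included header and footer template should leave the combined output with no stray BOM bytes. Only a BOM at the very start of a file is handled.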
In addition, if a Chinese page puts the title tag in front of the meta tag that declares the character encoding inside the HTML head, the page can render blank.
So UTF-8 pages should use the standard order:





<meta name="description" content="" />
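
Most of the tags in the listing above were lost, so the exact original markup is not recoverable. As a hedged sketch of the ordering the text describes, the following assumes the intended order is: charset declaration first, then title, then description. The function name render_head and the exact charset meta tag are assumptions made for this example; PHP 7 or later is assumed.

```php
<?php
// Hypothetical helper: builds a <head> whose charset declaration
// comes before the title, so the browser knows the encoding
// before it reads any non-ASCII text.
function render_head(string $title, string $description): string
{
    $title = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    $description = htmlspecialchars($description, ENT_QUOTES, 'UTF-8');

    return implode("\n", [
        '<head>',
        // Assumed form of the charset declaration.
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />',
        '<title>' . $title . '</title>',
        '<meta name="description" content="' . $description . '" />',
        '</head>',
    ]) . "\n";
}
```

A template would then emit the head with echo render_head('页面标题', '');, keeping the ordering decision in one place.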




The BOM header is the byte sequence \xEF\xBB\xBF. PHP 4 and PHP 5 do not recognize the BOM, so it is sent to the browser as ordinary output ahead of anything the script itself produces.
The W3C Internationalization FAQ has a dedicated entry on this issue:

http://www.w3.org/International/questions/qa-utf8-bom

The details are as follows:

UCS contains a character called "ZERO WIDTH NO-BREAK SPACE", whose code point is U+FEFF. U+FFFE, by contrast, is not a character in UCS, so it should never appear in actual transmission. The UCS specification recommends transmitting the character "ZERO WIDTH NO-BREAK SPACE" before the byte stream itself. If the receiver then reads the bytes FE FF, the stream is big-endian; if it reads FF FE, the stream is little-endian. For this reason the character "ZERO WIDTH NO-BREAK SPACE" is also called the BOM (byte order mark).
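
The receiver-side check described above can be sketched in a few lines. This is an illustration only, assuming PHP 7.1 or later; detect_utf16_byte_order is a hypothetical name, and the sketch looks only at the two-byte forms discussed here, ignoring other encodings whose BOM starts with the same bytes.

```php
<?php
// Hypothetical helper: inspect the first two bytes of a stream and
// report the byte order signalled by the BOM, or null if there is none.
function detect_utf16_byte_order(string $bytes): ?string
{
    $prefix = substr($bytes, 0, 2);

    if ($prefix === "\xFE\xFF") {
        return 'big-endian';    // U+FEFF read in the order it was written
    }
    if ($prefix === "\xFF\xFE") {
        return 'little-endian'; // the same character with its bytes swapped
    }
    return null;                // no BOM at the start of the stream
}
```

For example, detect_utf16_byte_order("\xFE\xFF\x00\x41") returns 'big-endian', while a stream that starts with plain ASCII returns null.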

UTF-8 does not need a BOM to indicate byte order, but a BOM can be used to mark the encoding. The UTF-8 encoding of the character "ZERO WIDTH NO-BREAK SPACE" is EF BB BF, so a receiver that sees a byte stream starting with EF BB BF knows that it is UTF-8 encoded.
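
The bytes EF BB BF follow directly from the UTF-8 three-byte pattern 1110xxxx 10xxxxxx 10xxxxxx applied to U+FEFF. The sketch below works through that arithmetic, assuming PHP 7 or later; encode_three_byte_utf8 is a hypothetical name, and it deliberately handles only the three-byte range.

```php
<?php
// Hypothetical helper: encode one code point in the range
// U+0800..U+FFFF (excluding surrogates) as three UTF-8 bytes.
function encode_three_byte_utf8(int $codePoint): string
{
    $isSurrogate = $codePoint >= 0xD800 && $codePoint <= 0xDFFF;
    if ($codePoint < 0x0800 || $codePoint > 0xFFFF || $isSurrogate) {
        throw new InvalidArgumentException('This sketch handles only U+0800..U+FFFF');
    }

    return chr(0xE0 | ($codePoint >> 12))           // 1110xxxx: top 4 bits
         . chr(0x80 | (($codePoint >> 6) & 0x3F))   // 10xxxxxx: middle 6 bits
         . chr(0x80 | ($codePoint & 0x3F));         // 10xxxxxx: low 6 bits
}
```

For U+FEFF the top 4 bits are 0xF, the middle 6 bits are 0x3B and the low 6 bits are 0x3F, so bin2hex(encode_three_byte_utf8(0xFEFF)) yields efbbbf.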

Windows uses a BOM to mark the encoding of text files. The notes below apply to Windows XP Professional with Chinese as the default character set:

1) Notepad: automatically recognizes UTF-8 files without a BOM, but offers no control over whether a BOM is written on save. When a UTF-8 file is saved, a BOM is always added.

2) EditPlus: cannot automatically recognize UTF-8 files without a BOM. When a file is saved with the UTF-8 format selected, no BOM is written at the start of the file.

3) UltraEdit: has the strongest character-encoding support of the three. It can automatically recognize UTF-8 files both with and without a BOM (configurable), and whether a BOM is written on save can be set through configuration.

(Note that when saving a newly created file, you need to choose the UTF-8 no-BOM format.)

Later I found that Notepad++ also handles the UTF-8 BOM well, and I recommend it.
