How to solve the problem of garbled characters when PHP generates UTF-8 encoded CSV files and opens them with Excel
Cause: the generated CSV file has no BOM, so Excel does not recognize it as UTF-8.
What is BOM?
UCS defines a character called "ZERO WIDTH NO-BREAK SPACE", whose code point is FEFF. The byte-swapped value FFFE is not a valid character in UCS, so it should never appear in actual transmission. The UCS specification recommends sending "ZERO WIDTH NO-BREAK SPACE" at the start of a byte stream. If the receiver then reads the first two bytes as FE FF, the stream is Big-Endian; if it reads FF FE, the stream is Little-Endian. Because it reveals the byte order, this character is also called the BOM (byte order mark).
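The byte-order check described above can be sketched in a few lines of PHP. The function name detectUtf16ByteOrder is hypothetical, and the sketch assumes the input is a 16-bit encoded stream; it makes no attempt to recognize other encodings.

```php
<?php
// Hypothetical helper: infer the byte order of a 16-bit encoded stream
// from its first two bytes. Returns null when no BOM is present.
function detectUtf16ByteOrder(string $bytes): ?string
{
    $prefix = substr($bytes, 0, 2);
    if ($prefix === "\xFE\xFF") {
        return 'big-endian';    // FEFF arrived in its natural order
    }
    if ($prefix === "\xFF\xFE") {
        return 'little-endian'; // the two bytes of FEFF arrived swapped
    }
    return null;                // no BOM: byte order must be known some other way
}
```

For example, passing "\xFF\xFE" followed by "A\x00" reports a little-endian stream, because the two BOM bytes arrive in swapped order.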
UTF-8 does not need a BOM to indicate byte order, but a BOM can be used to mark the encoding. The UTF-8 encoding of "ZERO WIDTH NO-BREAK SPACE" is EF BB BF, so a receiver that sees a byte stream starting with EF BB BF can tell that it is UTF-8 encoded.
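Windows uses the BOM to mark the encoding of text files. Excel relies on it as well: without a BOM it typically reads a CSV file in the system's legacy code page, which is why non-ASCII characters appear garbled.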
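As a rough illustration of how a receiver might test for the UTF-8 signature, here is a minimal sketch. The constant UTF8_BOM and the functions hasUtf8Bom and stripUtf8Bom are names made up for this example, and the "\u{FEFF}" escape assumes PHP 7.0 or later.

```php
<?php
// The three bytes that encode U+FEFF in UTF-8 (illustrative constant name).
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true when the data starts with the UTF-8 BOM.
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Hypothetical helper: return the data without a leading UTF-8 BOM.
function stripUtf8Bom(string $bytes): string
{
    return hasUtf8Bom($bytes) ? substr($bytes, 3) : $bytes;
}

// "\u{FEFF}" is ZERO WIDTH NO-BREAK SPACE; PHP stores it as UTF-8 bytes.
$bomHex = bin2hex("\u{FEFF}"); // "efbbbf"
```

Stripping the BOM is mainly useful when the same CSV file is later read back by a program that does not expect it, since the three bytes would otherwise become part of the first field.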
How to output BOM in PHP?
Output the BOM before any other content:
print(chr(0xEF).chr(0xBB).chr(0xBF));
Sample code:
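<?php
function writeCsvToFile($file, array $data)
{
    $fp = fopen($file, 'w');
    if ($fp === false) {
        return false; // the file could not be opened for writing
    }
    // Windows uses the BOM to mark the encoding of text files
    fwrite($fp, chr(0xEF) . chr(0xBB) . chr(0xBF));
    foreach ($data as $line) {
        fputcsv($fp, $line);
    }
    fclose($fp);
    return true;
}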
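The same idea also applies when the CSV is sent straight to the browser instead of being saved to disk. The following is a minimal sketch under that assumption; writeCsvWithBom is a hypothetical name, and the headers and file name in the usage comment are illustrative only.

```php
<?php
// Hypothetical helper: write a UTF-8 BOM followed by CSV rows to any open stream.
function writeCsvWithBom($stream, array $rows): void
{
    // The BOM must be the very first bytes written to the stream.
    fwrite($stream, "\xEF\xBB\xBF");
    foreach ($rows as $row) {
        // Separator, enclosure, and escape are passed explicitly so the
        // call behaves the same across PHP versions.
        fputcsv($stream, $row, ',', '"', '\\');
    }
}

// Possible usage for a download (assumed headers and file name):
// header('Content-Type: text/csv; charset=UTF-8');
// header('Content-Disposition: attachment; filename="export.csv"');
// $out = fopen('php://output', 'w');
// writeCsvWithBom($out, $rows);
// fclose($out);
```

Taking a stream rather than a file name keeps the function usable with files, php://output, and in-memory streams alike.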