How to fix garbled characters when Excel opens a UTF-8 encoded CSV file generated by PHP

伊谢尔伦
2016-12-02 09:59:09

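Cause: the generated CSV file has no BOM. Without one, Excel does not recognize the file as UTF-8 and decodes it with the system's default (ANSI) code page, so multi-byte characters come out garbled.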

What is a BOM?

UCS defines a character called "ZERO WIDTH NO-BREAK SPACE" with the code point U+FEFF. The byte-swapped value, U+FFFE, is not a valid character in UCS, so it should never appear in real data. The UCS specification therefore recommends transmitting "ZERO WIDTH NO-BREAK SPACE" at the start of a byte stream: if the receiver sees the bytes FE FF, the stream is big-endian; if it sees FF FE, the stream is little-endian. Because it reveals the byte order in this way, the character is also called the BOM (byte order mark).

UTF-8 has no byte-order ambiguity, so it does not need a BOM for that purpose, but the BOM can still be used to signal the encoding. The UTF-8 encoding of "ZERO WIDTH NO-BREAK SPACE" is EF BB BF, so a receiver that sees a byte stream starting with EF BB BF can conclude that it is UTF-8 encoded.
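
The byte sequences described above can be checked directly. The following is a minimal sketch, assuming PHP 7.0 or later (needed for the "\u{...}" string escape); the variable names are illustrative only.

```php
<?php
// U+FEFF serialized in the three forms discussed above.
$utf16be = pack('n', 0xFEFF);   // 16-bit value, big-endian byte order
$utf16le = pack('v', 0xFEFF);   // 16-bit value, little-endian byte order
$utf8    = "\u{FEFF}";          // PHP 7+ expands this to the UTF-8 bytes of the code point

echo strtoupper(bin2hex($utf16be)), "\n"; // FEFF
echo strtoupper(bin2hex($utf16le)), "\n"; // FFFE
echo strtoupper(bin2hex($utf8)), "\n";    // EFBBBF
```

The last line is the same three-byte sequence that chr(0xEF).chr(0xBB).chr(0xBF) produces.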

Windows programs, Excel among them, rely on the BOM to identify the encoding of a text file.

How to output the BOM in PHP?

Send the three BOM bytes before any other output:

print(chr(0xEF).chr(0xBB).chr(0xBF));
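
When the CSV is sent straight to the browser instead of being saved to disk, the same rule applies: the BOM has to be the first bytes of the response body. The sketch below is one possible arrangement, not a definitive implementation; the helper name streamCsvWithBom, the file name export.csv, and the $rows variable are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: writes the BOM, then the rows, to an already-open stream.
function streamCsvWithBom($out, array $rows): void
{
    fwrite($out, "\xEF\xBB\xBF"); // the BOM must precede all CSV data
    foreach ($rows as $row) {
        // Delimiter, enclosure and escape are passed explicitly;
        // these are the traditional default values.
        fputcsv($out, $row, ',', '"', '\\');
    }
}

// Possible use in a web request (HTTP headers go out before the BOM):
// header('Content-Type: text/csv; charset=UTF-8');
// header('Content-Disposition: attachment; filename="export.csv"');
// streamCsvWithBom(fopen('php://output', 'w'), $rows);
```

Taking the stream as a parameter keeps the helper usable with php://output, a regular file, or php://memory in a test.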

Sample code:

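<?php
    function writeCsvToFile($file, array $data) {
        $fp = fopen($file, 'w');
        if ($fp === false) {
            return false; // the file could not be opened for writing
        }
        // On Windows, the BOM is used to mark the encoding of a text file
        fwrite($fp, chr(0xEF).chr(0xBB).chr(0xBF));
        foreach ($data as $line) {
            fputcsv($fp, $line);
        }
        fclose($fp);
        return true;
    }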
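
One consequence of writing the BOM is worth keeping in mind: if the same file is later read back in PHP, fgetcsv() does not remove the three bytes, so they end up attached to the first field of the first row. The following is a minimal sketch of one way to skip them; the function name readCsvSkippingBom is hypothetical and not part of the original sample.

```php
<?php
// Hypothetical helper: reads a CSV file and drops a leading UTF-8 BOM if present.
function readCsvSkippingBom(string $file): array
{
    $fp = fopen($file, 'r');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $file");
    }
    // Look at the first three bytes; go back to the start if they are not the BOM.
    if (fread($fp, 3) !== "\xEF\xBB\xBF") {
        rewind($fp);
    }
    $rows = [];
    // A length of 0 means "no line length limit".
    while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($fp);
    return $rows;
}
```

Files without a BOM are read unchanged, because the stream is rewound when the first three bytes do not match.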

