To fix garbled Chinese characters when PHP reads a Word document: 1. check the PHP version; 2. enable the COM extension in php.ini and restart the server; 3. convert the text that was read with "iconv('GB2312', 'UTF-8', $test);".

How to fix garbled Chinese characters when PHP reads a Word document

Environment used in this article: Windows 7, PHP 7.4, Dell G3 computer.

When PHP reads a Word file through the COM extension, Chinese text can come back garbled. The steps below show how to fix it.

1. First, check the PHP version; it should preferably be higher than 5.6.

2. Enable the PHP COM extension

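; Put the following two settings in php.ini, then restart the server
; Enable the extension
extension=php_com_dotnet.dll
; This setting ships with the COM extension; just remove the leading ";"
com.allow_dcom = true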
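
To confirm that steps 1 and 2 took effect, the environment can be checked from PHP itself. The following is a minimal sketch, not part of the original article; the function name `checkComEnvironment` is hypothetical, and it assumes the extension registers under the name `com_dotnet`.

```php
<?php
// Hypothetical helper (not from the original article): reports whether the
// prerequisites from steps 1 and 2 appear to be met.
function checkComEnvironment()
{
    return [
        // Step 1: the running PHP version is at least 5.6.
        'php_version_ok' => version_compare(PHP_VERSION, '5.6.0', '>='),
        // Step 2: the com_dotnet extension is loaded (Windows only).
        'com_loaded'     => extension_loaded('com_dotnet'),
        // The COM class is what readWord() instantiates.
        'com_class'      => class_exists('COM'),
    ];
}

foreach (checkComEnvironment() as $check => $ok) {
    echo $check, ': ', $ok ? 'yes' : 'no', PHP_EOL;
}
```

If `com_loaded` prints `no` after editing php.ini, the server has probably not been restarted, or a different php.ini is being loaded.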

3. Read the document with the following code:

    public function readWord($url)
    {
        $word = new COM("word.application") or die("Unable to instantiate Word");

        // Keep the Word window hidden: 0 = hidden, 1 = visible
        $word->Visible = 0;

        // Open the Word file at $url; both .doc and .docx work
        $word->Documents->Open($url);

        // Read the content
        $test = $word->ActiveDocument->Content->Text;

        // Count characters
        // $num = strlen($test);

        // Fix the garbled characters produced while reading
        $content = iconv('GB2312', 'UTF-8', $test);

        // Check the Word version
        // $wordVersion = $word->Version;

        // Close the Word handle
        $word->Quit();

        // Release the object
        $word = null;

        return [
            // 'num' => $num / 2,
            // 'word_version' => $wordVersion,
            'content' => $content
        ];
    }
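
The `iconv` call is the step that removes the garbling: by default the COM extension returns strings in the system ANSI code page, which on a Simplified Chinese Windows system is a GB2312/GBK encoding, not UTF-8. The sketch below isolates that conversion so it can be tried without Word. The helper name `ansiToUtf8` is hypothetical, and using `GBK` instead of `GB2312` is a suggestion of this sketch, not of the original article: GBK is a superset of GB2312 and also covers characters that GB2312 lacks.

```php
<?php
// Hypothetical helper: converts text from a Chinese ANSI encoding to UTF-8.
// Returns false if the input contains bytes that are invalid in $from.
function ansiToUtf8($text, $from = 'GBK')
{
    return iconv($from, 'UTF-8', $text);
}

// Simulate what COM hands back under the default code page on a
// Simplified Chinese system: GBK-encoded bytes.
$gbkBytes = iconv('UTF-8', 'GBK', '中文测试');

// Convert the bytes back to UTF-8 for display or storage.
echo ansiToUtf8($gbkBytes), PHP_EOL;

// Alternative (Windows only, shown as a comment because it needs Word):
// the COM constructor takes a code page as its third argument, so asking
// for UTF-8 up front may make the iconv() step unnecessary.
// $word = new COM('word.application', null, CP_UTF8);
```

Which approach fits depends on the setup; converting explicitly keeps the original code unchanged, while the code page argument avoids a second pass over the text.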

Notes:

Question 1:

Take care with the file path passed in as $url. It must not be an absolute filesystem path such as D:\WWW\. Use an address served through your own framework's routing instead, such as localhost/.... Otherwise errors occur: when an absolute path is used, the Word content can be read only once, after which the file is locked and cannot be read again.
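
If the lock comes from a Word instance that still holds the document open (an assumption; the original article does not diagnose the cause), opening the file read-only and always closing Word may avoid it. The following is a sketch under that assumption, not a verified fix. The function name `readWordReadOnly` is hypothetical; the arguments to `Documents->Open` follow Word's `Open(FileName, ConfirmConversions, ReadOnly)` parameter order.

```php
<?php
// Hypothetical variant of readWord(); needs Windows with Word installed.
function readWordReadOnly($path)
{
    if (!class_exists('COM')) {
        // COM extension not available (non-Windows, or not enabled).
        return null;
    }

    // CP_UTF8 asks COM to return strings as UTF-8.
    $word = new COM('word.application', null, CP_UTF8);
    $word->Visible = 0;

    try {
        // Open(FileName, ConfirmConversions, ReadOnly): open read-only.
        $doc  = $word->Documents->Open($path, false, true);
        $text = $doc->Content->Text;

        // Close the document without saving changes.
        $doc->Close(false);

        return $text;
    } finally {
        // Always shut Word down, even if Open() throws, so that no
        // leftover Word process keeps the file open.
        $word->Quit();
        $word = null;
    }
}
```

The `try`/`finally` structure is the main point: in the original `readWord`, an exception thrown before `Quit()` leaves Word running in the background.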

Question 2:

Although this method solves the garbled-text problem, it only works for reading plain-text Word content without styles. If you need the content of a Word document including styles, pictures, fonts and so on, this method is not suitable.

Our approach for that case is to use Aspose. I built a low-level service in Java that converts uploaded Word documents to HTML. If the document contains pictures, they are extracted into the same directory during conversion, and an <img> tag for each picture is left in the generated HTML file. In this way, the fonts and styles in the Word document become HTML code, preserving the style of the original document as far as possible.
