How to read text and pictures in Word tables using Java

王林
2023-05-03 16:04:06

1. Program environment preparation

  • Development tool: IntelliJ IDEA

  • JDK version: 1.8.0

  • Test document: Word .docx 2013

  • Jar package: free spire.doc.jar 3.9.0


The Word document used for testing is as follows:

[Image: the test Word document]

Jar import steps and methods:

Method 1: Manual import.

Open the Project Structure dialog (Ctrl+Alt+Shift+S), select [Modules] - [Dependencies], click "+", choose [JARs or directories...], select the jar file from its local path, and add it. Tick the checkbox next to the new entry, then click "OK" or "Apply" to import the jar.

[Image: importing the jar in the Project Structure dialog]

Method 2: Import from the Maven repository.

Configure the Maven repository in the pom.xml file and declare the dependency on Free Spire.Doc 3.9.0; Maven then downloads and imports the jar. The configuration is as follows:

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <url>http://repo.e-iceblue.cn/repository/maven-public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc.free</artifactId>
        <version>3.9.0</version>
    </dependency>
</dependencies>
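
Whichever import method is used, one quick way to confirm that the jar is actually visible to the program is to try loading one of its classes. The following is only a minimal sketch: the class name com.spire.doc.Document is taken from the code in this article, while ClasspathCheck and isOnClasspath are hypothetical names introduced for illustration.

```java
// Minimal sketch: check whether a class can be loaded from the current classpath.
// ClasspathCheck and isOnClasspath are illustrative names, not part of Spire.Doc.
class ClasspathCheck {

    // Returns true if the named class can be found by this class's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: locate the class without running its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String spireClass = "com.spire.doc.Document";
        System.out.println(spireClass + " available: " + isOnClasspath(spireClass));
    }
}
```

If the check reports false, the jar was most likely not added to the module's dependencies or the Maven import did not complete.
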
2. Java code

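import com.spire.doc.*;
import com.spire.doc.documents.Paragraph;
import com.spire.doc.fields.DocPicture;
import com.spire.doc.interfaces.ITable;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class GetTable {
    public static void main(String[] args) throws IOException {
        // Load the Word test document
        Document doc = new Document();
        doc.loadFromFile("inputfile.docx");

        // Get the first section
        Section section = doc.getSections().get(0);

        // Get the first table
        ITable table = section.getTables().get(0);

        // Create the txt file that receives the text extracted from the table
        String output = "ReadTextFromTable.txt";
        File textfile = new File(output);
        if (textfile.exists()) {
            textfile.delete();
        }
        textfile.createNewFile();
        FileWriter fw = new FileWriter(textfile, true);
        BufferedWriter bw = new BufferedWriter(fw);

        // Create a list that collects the pictures found in the table
        List<BufferedImage> images = new ArrayList<>();

        // The source listing is cut off inside the loop header below; the rest is
        // reconstructed from the imports and output file names and assumes the
        // Free Spire.Doc accessors getRows(), getCells(), getParagraphs(), getChildObjects().
        // Iterate over the rows of the table
        for (int i = 0; i < table.getRows().getCount(); i++) {
            TableRow row = table.getRows().get(i);
            // Iterate over the cells of each row
            for (int j = 0; j < row.getCells().getCount(); j++) {
                TableCell cell = row.getCells().get(j);
                // Iterate over the paragraphs of each cell
                for (int k = 0; k < cell.getParagraphs().getCount(); k++) {
                    Paragraph paragraph = cell.getParagraphs().get(k);
                    bw.write(paragraph.getText() + "\t"); // write the paragraph text
                    // Collect every picture among the paragraph's child objects
                    for (int x = 0; x < paragraph.getChildObjects().getCount(); x++) {
                        Object object = paragraph.getChildObjects().get(x);
                        if (object instanceof DocPicture) {
                            images.add(((DocPicture) object).getImage());
                        }
                    }
                }
            }
            bw.write("\r\n"); // one line of text per table row
        }
        bw.flush();
        bw.close();
        fw.close();

        // Save each picture as a PNG file
        for (int z = 0; z < images.size(); z++) {
            File imagefile = new File(String.format("Extracted table image-%d.png", z));
            ImageIO.write(images.get(z), "PNG", imagefile);
        }
    }
}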
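
The last step of the program, writing each collected picture to its own numbered PNG file, can be tried in isolation without Spire.Doc. The sketch below uses only the JDK. The class name SaveImagesSketch, the saveAll helper, and the "table-image" prefix are hypothetical, and the in-memory test images merely stand in for the pictures extracted from the table.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SaveImagesSketch {

    // Writes each image as "<prefix>-<index>.png" inside dir and returns the files written.
    static List<File> saveAll(List<BufferedImage> images, File dir, String prefix) throws IOException {
        List<File> written = new ArrayList<>();
        for (int i = 0; i < images.size(); i++) {
            File out = new File(dir, String.format("%s-%d.png", prefix, i));
            // ImageIO.write returns false when no suitable writer is found
            if (!ImageIO.write(images.get(i), "PNG", out)) {
                throw new IOException("No PNG writer available for image " + i);
            }
            written.add(out);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in images; in the article's program these come from DocPicture objects.
        List<BufferedImage> images = Arrays.asList(
                new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB),
                new BufferedImage(8, 2, BufferedImage.TYPE_INT_ARGB));
        File dir = Files.createTempDirectory("table-images").toFile();
        for (File f : saveAll(images, dir, "table-image")) {
            System.out.println("Wrote " + f.getAbsolutePath());
        }
    }
}
```

Checking the return value of ImageIO.write is a deliberate choice here: the method signals an unsupported format by returning false rather than by throwing, so ignoring it can leave missing files unnoticed.
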

3. Text and picture reading results

After editing the code, run the program to read the text and pictures from the table. The file paths in the code are relative to the IDEA project folder, for example:

C:\Users\Administrator\IdeaProjects\Table_Doc\ReadTextFromTable.txt

C:\Users\Administrator\IdeaProjects\Table_Doc\Extracted table image-0.png

C:\Users\Administrator\IdeaProjects\Table_Doc\inputfile.docx

The file paths in the code can be changed to any other location.
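
Relative names such as inputfile.docx are resolved against the JVM's working directory, which is normally the project folder when the program is started from IDEA with default run settings. The sketch below, using only the JDK, shows one way to see where a relative name ends up or to point it at another base directory. PathSketch and resolve are hypothetical names, and the "output" subfolder is only an example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathSketch {

    // Resolves a file name against a base directory.
    // If fileName is already absolute, it takes precedence over baseDir.
    static Path resolve(String baseDir, String fileName) {
        return Paths.get(baseDir).resolve(fileName).normalize();
    }

    public static void main(String[] args) {
        // "user.dir" holds the working directory the JVM was started in
        String workingDir = System.getProperty("user.dir");
        System.out.println(resolve(workingDir, "inputfile.docx"));
        System.out.println(resolve(workingDir, "output/ReadTextFromTable.txt"));
    }
}
```
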

Text data reading result:

[Image: text reading result]

Image reading result:

[Image: image reading result]

Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn to have it removed.