首頁 >後端開發 >Python教學 >如何使用 Python 從 PDF 中提取圖像，同時保留其原始解析度？

如何使用 Python 從 PDF 中提取圖像，同時保留其原始解析度？

DDD原創: 2024-10-22 07:52:30796瀏覽

How Can You Extract Images from PDFs Using Python While Preserving Their Original Resolution?

使用Python 從PDF 中提取圖像而無需重新採樣

高效地從PDF 文件中提取所有圖像，同時保留其原始分辨速率和格式而無需重新取樣，您可以使用PyMuPDF 模組。此模組為影像擷取提供了有效的解決方案，將影像輸出為PNG檔案。

使用PyMuPDF：

<code class="python">import fitz

# Open the PDF document
doc = fitz.open("file.pdf")

# Iterate through the pages
for i in range(len(doc)):
    # Extract images from the current page
    for img in doc.getPageImageList(i):
        # Retrieve the image's XREF and create a Pixmap
        xref = img[0]
        pix = fitz.Pixmap(doc, xref)

        # Check if the image is grayscale or RGB
        if pix.n <p><strong>增強功能：</strong></p>
<p>支援fitz 1.19.6 的腳本更新版本：</p>
<pre class="brush:php;toolbar:false"><code class="python">import os
import fitz
from tqdm import tqdm

# Specify the work directory
workdir = "your_folder"

# Iterate through the PDFs in the directory
for each_path in os.listdir(workdir):
    if ".pdf" in each_path:
        # Open the PDF document
        doc = fitz.Document(os.path.join(workdir, each_path))

        for i in tqdm(range(len(doc)), desc="pages"):
            for img in tqdm(doc.get_page_images(i), desc="page_images"):
                # Extract the image and save as PNG
                xref = img[0]
                image = doc.extract_image(xref)
                pix = fitz.Pixmap(doc, xref)
                pix.save(os.path.join(workdir, "%s_p%s-%s.png" % (each_path[:-4], i, xref)))</code>

此增強的腳本提供進度條以增加可見性，並使用一致的檔案命名約定保存提取的影像。

以上是如何使用 Python 從 PDF 中提取圖像，同時保留其原始解析度？的詳細內容。更多資訊請關注PHP中文網其他相關文章！

Python for while format using this

陳述：

本文內容由網友自願投稿，版權歸原作者所有。本站不承擔相應的法律責任。如發現涉嫌抄襲或侵權的內容，請聯絡admin@php.cn

上一篇：如何使用 Python 從 PDF 提取高解析度圖像而無需重新採樣？下一篇：如何使用 Python 從 PDF 提取高解析度圖像而無需重新採樣？

看更多