How to Merge PDF Files with Python: A Comprehensive Guide

DDD
2024-10-23 08:30:29

Merging PDF Files with Python

Python libraries such as pypdf and PyMuPDF let you combine multiple PDF documents into a single file. This tutorial walks through concatenating files, inserting one file at a chosen position, selecting page ranges, excluding specific pages, and looping through directories.

Using pypdf Merging Class

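pypdf provides the PdfMerger class, which offers an easy way to concatenate and merge PDF files. Newer pypdf releases deprecate PdfMerger in favor of PdfWriter, which exposes the same append and merge methods.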

File Concatenation

Concatenate files by appending them using the append method:

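<code class="python">from pypdf import PdfMerger

pdfs = ['file1.pdf', 'file2.pdf', 'file3.pdf', 'file4.pdf']

merger = PdfMerger()

# Append each file in list order
for pdf in pdfs:
    merger.append(pdf)

merger.write("result.pdf")
merger.close()  # release the input files held open by the merger</code>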
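
The same concatenation can be written with PdfWriter, which is the class to use in pypdf releases that no longer ship PdfMerger. The following is a minimal sketch under that assumption; the helper name concatenate_pdfs and its parameters are illustrative and not part of pypdf.

```python
def concatenate_pdfs(input_paths, output_path):
    """Append every PDF in input_paths, in order, and write one combined file.

    Hypothetical helper for illustration; assumes a pypdf release in which
    PdfWriter provides append().
    """
    # Imported here so this module still loads when pypdf is not installed.
    from pypdf import PdfWriter

    writer = PdfWriter()
    try:
        for path in input_paths:
            writer.append(path)    # adds all pages of this file
        writer.write(output_path)  # writes the single combined document
    finally:
        writer.close()             # release resources held by the writer
    return output_path
```

A call such as concatenate_pdfs(['file1.pdf', 'file2.pdf'], 'result.pdf') would then produce the same kind of output as the PdfMerger loop.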

File Merging

For finer control, use the merge method, whose first argument is the position at which the new pages are inserted:

<code class="python">merger.merge(2, pdf)  # Insert the PDF after the first 2 pages of the output</code>
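
To see where the inserted pages end up, it can help to model the output document as a plain list of page labels. The sketch below is a pure-Python illustration only, assuming that merge places the new pages in front of the page currently at the given zero-based index; insert_at is a hypothetical helper, not a pypdf function.

```python
def insert_at(existing_pages, new_pages, position):
    """Return a new page list with new_pages placed before index `position`."""
    return existing_pages[:position] + new_pages + existing_pages[position:]


# Document A has four pages; document B (two pages) is inserted at position 2.
merged = insert_at(["A1", "A2", "A3", "A4"], ["B1", "B2"], 2)
print(merged)  # ['A1', 'A2', 'B1', 'B2', 'A3', 'A4']
```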

Page Ranges

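Control which pages are appended using the pages keyword argument, a (start, stop) or (start, stop, step) tuple of zero-based page indices in which the stop index is not included: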

<code class="python">merger.append(pdf, pages=(0, 3))  # Append first 3 pages
merger.append(pdf, pages=(0, 6, 2))  # Append pages 1, 3, 5</code>
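
These tuples follow the same convention as Python's built-in range. A small pure-Python sketch, using a hypothetical helper named selected_page_numbers, shows which one-based page numbers a given tuple selects:

```python
def selected_page_numbers(pages):
    """Translate a (start, stop[, step]) tuple into 1-based page numbers."""
    return [index + 1 for index in range(*pages)]


print(selected_page_numbers((0, 3)))     # [1, 2, 3]
print(selected_page_numbers((0, 6, 2)))  # [1, 3, 5]
```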

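Excluding Specific Pages

To exclude a page from every merged PDF, adjust the pages argument so that the range skips it. The argument takes a (start, stop) tuple rather than an arbitrary iterable, so the stop index has to come from the file's page count. For example, to exclude page 1 from each PDF:

<code class="python">from pypdf import PdfReader

for pdf in pdfs:
    # Page count of this file, needed as the exclusive stop index
    num_pages = len(PdfReader(pdf).pages)
    # Start at index 1 so that page 1 (index 0) is left out
    merger.append(pdf, pages=(1, num_pages))</code>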
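
Because a (start, stop) tuple can only describe one contiguous run of pages, excluding pages from the middle of a document means appending the file once per run of pages to keep. The sketch below computes those runs in pure Python; keep_ranges is a hypothetical helper, and the commented usage assumes a merger object, a pdf path, and a page count obtained from pypdf's PdfReader.

```python
def keep_ranges(num_pages, pages_to_exclude):
    """Return (start, stop) tuples covering every page index not excluded."""
    excluded = set(pages_to_exclude)
    ranges = []
    start = None  # first index of the run currently being built
    for index in range(num_pages):
        if index in excluded:
            if start is not None:
                ranges.append((start, index))  # close the run before the gap
                start = None
        elif start is None:
            start = index                      # open a new run
    if start is not None:
        ranges.append((start, num_pages))      # close the final run
    return ranges


print(keep_ranges(6, [0]))     # [(1, 6)]
print(keep_ranges(6, [2, 3]))  # [(0, 2), (4, 6)]

# Possible use with a merger (not run here):
# for start, stop in keep_ranges(num_pages, pages_to_exclude):
#     merger.append(pdf, pages=(start, stop))
```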

PyMuPDF Library

Another option is the PyMuPDF library, which is imported under the name fitz. Here's how to merge PDFs with it:

From Command Line

python -m fitz join -o result.pdf file1.pdf file2.pdf file3.pdf

From Code

<code class="python">import fitz

result = fitz.open()  # new, empty PDF

for pdf in ['file1.pdf', 'file2.pdf', 'file3.pdf']:
    with fitz.open(pdf) as mfile:
        result.insert_pdf(mfile)  # append all pages of this file

# Save once, after every file has been inserted
result.save("result.pdf")</code>

Looping Through Folders

To loop through folders and merge PDFs, use the os module:

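<code class="python">import os
from pypdf import PdfMerger

root = "path/to/directory"
for folder in sorted(os.listdir(root)):
    folder_path = os.path.join(root, folder)
    if not os.path.isdir(folder_path):
        continue  # skip files that sit directly in the root directory
    # Sort the names so the merge order is predictable
    pdfs = sorted(f for f in os.listdir(folder_path) if f.endswith(".pdf"))
    merger = PdfMerger()
    for pdf in pdfs:
        merger.append(os.path.join(folder_path, pdf))
    merger.write(f"merged_{folder}.pdf")
    merger.close()</code>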
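
The grouping step can also be separated from the merging step, which makes it easier to check which files will be combined before anything is written. The following sketch uses only the standard library; pdfs_by_folder is a hypothetical helper, and it assumes a single level of subfolders under the root directory, as in the loop above.

```python
from pathlib import Path


def pdfs_by_folder(root):
    """Map each immediate subfolder name to its PDF paths, sorted by name."""
    grouped = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # ignore files that sit directly in the root
        pdfs = sorted(p for p in folder.iterdir()
                      if p.is_file() and p.suffix.lower() == ".pdf")
        if pdfs:      # skip folders that contain no PDF at all
            grouped[folder.name] = pdfs
    return grouped


# Each list can then be handed to a merger, one output file per folder:
# for name, paths in pdfs_by_folder("path/to/directory").items():
#     ...append each path, then write f"merged_{name}.pdf"
```

Returning the paths instead of merging immediately also lets you print or filter the plan first, for example to drop folders that contain only a single PDF.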
