
Python example of how to merge all PDF files in the same folder

不言 (Original), 2018-04-28 10:22:50

This article shows how to merge all the PDF files in one folder with Python. It covers reading PDF files, checking whether they are encrypted, decrypting them, and writing the merged result. Readers who need this can use it as a reference.

The example below merges every PDF file in the same folder into a single file with Python. It is shared here for reference; the details are as follows:

1. Requirements Description

I downloaded the free PDF documents for Andrew Ng's deep learning course from NetEase Cloud Classroom, but every section is a separate PDF. I put these PDF documents in one folder and wanted to merge them into a single PDF file, so I wrote a Python program, which solved the problem well.

2. Data format

3. Merging effect

4. Python code implementation

# -*- coding: utf-8 -*-
# Written for Python 2.7 with the pyPdf package.
import sys
reload(sys)
sys.setdefaultencoding('utf-8')  # Python 2-only workaround for mixed str/unicode text
import os
import os.path
from pyPdf import PdfFileReader, PdfFileWriter
import time
time1 = time.time()

# Use os.walk to find all PDF files under a directory
################ Get the names of all PDF files in the same folder ################
def getFileName(filepath):
  file_list = []
  for root, dirs, files in os.walk(filepath):
    for filespath in files:
      # Keep only PDF files; other file types cannot be read by pyPdf
      if filespath.lower().endswith('.pdf'):
        file_list.append(os.path.join(root, filespath))
  # Sort so the files are merged in file-name order
  return sorted(file_list)

################ Merge all PDF files in the same folder ################
def MergePDF(filepath, outfile):
  output = PdfFileWriter()
  outputPages = 0
  pdf_fileName = getFileName(filepath)
  for each in pdf_fileName:
    print each
    # Read the source PDF file
    reader = PdfFileReader(open(each, "rb"))
    # An encrypted PDF must be decrypted before pyPdf can read its pages
    if reader.isEncrypted:
      reader.decrypt("map")  # "map" is the password hard-coded in this example
    # Get the total number of pages in the source PDF
    pageCount = reader.getNumPages()
    outputPages += pageCount
    print pageCount
    # Add each page to the output
    for iPage in range(pageCount):
      output.addPage(reader.getPage(iPage))
  print "All Pages Number:" + str(outputPages)
  # Finally, write the merged PDF file
  outputStream = open(os.path.join(filepath, outfile), "wb")
  output.write(outputStream)
  outputStream.close()
  print "finished"

if __name__ == '__main__':
  file_dir = r'D:/course/'
  out = u"第一周.pdf"  # "Week 1.pdf"
  MergePDF(file_dir, out)
  time2 = time.time()
  print u'Total time taken: ' + str(time2 - time1) + 's'
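
The script above targets Python 2.7 and the pyPdf package, neither of which is maintained any longer. As a rough sketch, the same merge could look like this in Python 3, assuming the third-party pypdf package is installed. The names list_pdfs, merge_pdfs, skip_name and password are illustrative and do not appear in the original. Unlike the os.walk version, this sketch looks only at the folder itself (not its subfolders), sorts by file name, and skips the output file so that a second run does not merge the previous result into itself.

```python
# Sketch for Python 3. Assumes the third-party "pypdf" package (pip install pypdf).
# list_pdfs and merge_pdfs are hypothetical names, not part of the original script.
from pathlib import Path


def list_pdfs(folder, skip_name=None):
    """Return the PDF files directly inside `folder`, sorted by file name."""
    pdfs = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf" and p.name != skip_name
    ]
    return sorted(pdfs, key=lambda p: p.name)


def merge_pdfs(folder, out_name, password=None):
    """Merge every PDF in `folder` into `folder/out_name`; return the page total."""
    # Imported here so list_pdfs stays usable without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    total_pages = 0
    for path in list_pdfs(folder, skip_name=out_name):
        reader = PdfReader(str(path))
        # Decrypt only when the file is encrypted and a password was supplied
        if reader.is_encrypted and password is not None:
            reader.decrypt(password)
        for page in reader.pages:
            writer.add_page(page)
        total_pages += len(reader.pages)
    with open(Path(folder) / out_name, "wb") as stream:
        writer.write(stream)
    return total_pages
```

With the folder and password from the script above, a call might look like merge_pdfs("D:/course/", "第一周.pdf", password="map").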

"D:\Program Files\Python27\python.exe" D:/PycharmProjects/learn2017/Merge multiple PDF files.py
D:/course/C1W1L01 Welcome.pdf
3
D:/course/C1W1L02 WhatIsNN.pdf
4
D:/course/C1W1L03 SupLearnWithNN.pdf
4
D:/course/C1W1L04 WhyIsDLTakingOff.pdf
3
D:/course/C1W1L05 AboutThisCourse.pdf
3
D:/course/C1W1L06 CourseResources.pdf
3
All Pages Number:20
finished
Total time taken: 0.128000020981s
Process finished with exit code 0
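
The elapsed time in the log is measured with time.time() from a start point set right after the imports. A minimal sketch, assuming Python 3, of timing only a single call with time.perf_counter follows. The helper name timed is hypothetical, and the summed list is simply the per-file page counts from the log above, used as a stand-in workload for the merge call.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload: add up the page counts printed for the six source files.
total, seconds = timed(sum, [3, 4, 4, 3, 3, 3])
print("All Pages Number:" + str(total))
print("Total time taken: %.6fs" % seconds)
```

time.perf_counter is a monotonic clock intended for measuring short intervals, which is why it is used here instead of time.time().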
