How to Convert PDF Files into Images Using PDFBox
PDFBox, an Apache project, can render each page of a PDF document to a separate image. This is useful when pages need to go through an image processing workflow.
The starting point is the PDDocument class, which represents a loaded PDF document. How the pages are then rendered depends on the PDFBox version. In 1.8, the pages are obtained from the document catalog through the getAllPages() method and each PDPage is converted with convertToImage(). From 2.0 onward, a PDFRenderer renders pages by their zero-based index.
Example Code
The examples below show how to convert every page of a PDF into a PNG image at 300 DPI, one example per major PDFBox version:
Solution for PDFBox 1.8.*:
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.util.ImageIOUtil;

import java.awt.image.BufferedImage;
import java.io.File;
import java.util.List;

public class PdfToImageConverter {
    public static void main(String[] args) throws Exception {
        String pdfFilename = "your_pdf_file.pdf";
        // loadNonSeq uses the non-sequential parser available in 1.8
        PDDocument document = PDDocument.loadNonSeq(new File(pdfFilename), null);
        // getAllPages() returns a raw List in 1.8, so this assignment is unchecked
        List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
        int page = 0;
        for (PDPage pdPage : pdPages) {
            ++page;
            // Render the page as an RGB image at 300 DPI
            BufferedImage bim = pdPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
            // Writes e.g. your_pdf_file.pdf-1.png
            ImageIOUtil.writeImage(bim, pdfFilename + "-" + page + ".png", 300);
        }
        document.close();
    }
}
Solution for PDFBox 2.0:
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.tools.imageio.ImageIOUtil;

import java.awt.image.BufferedImage;
import java.io.File;

public class PdfToImageConverter {
    public static void main(String[] args) throws Exception {
        String pdfFilename = "your_pdf_file.pdf";
        PDDocument document = PDDocument.load(new File(pdfFilename));
        PDFRenderer pdfRenderer = new PDFRenderer(document);
        // Page indexes are zero-based; file names use page + 1
        for (int page = 0; page < document.getNumberOfPages(); ++page) {
            // Render the page as an RGB image at 300 DPI
            BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
            // ImageIOUtil is part of the pdfbox-tools artifact
            ImageIOUtil.writeImage(bim, pdfFilename + "-" + (page + 1) + ".png", 300);
        }
        document.close();
    }
}
Solution for PDFBox 3.0:
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.tools.imageio.ImageIOUtil;

import java.awt.image.BufferedImage;
import java.io.File;

public class PdfToImageConverter {
    public static void main(String[] args) throws Exception {
        String pdfFilename = "your_pdf_file.pdf";
        // In 3.0, loading moved from PDDocument.load to Loader.loadPDF
        PDDocument document = Loader.loadPDF(new File(pdfFilename));
        PDFRenderer pdfRenderer = new PDFRenderer(document);
        // Page indexes are zero-based; file names use page + 1
        for (int page = 0; page < document.getNumberOfPages(); ++page) {
            // Render the page as an RGB image at 300 DPI
            BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
            // ImageIOUtil is part of the pdfbox-tools artifact
            ImageIOUtil.writeImage(bim, pdfFilename + "-" + (page + 1) + ".png", 300);
        }
        document.close();
    }
}
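All three examples render at 300 DPI. Because PDF page dimensions are expressed in points (72 per inch), the DPI value determines how large each rendered image becomes. The following is a rough sketch, not part of PDFBox, for estimating the pixel size and uncompressed memory footprint of a page before rendering it. The class name RenderSizeEstimator and its methods are hypothetical, the 4 bytes per pixel figure assumes an int-backed RGB image, and PDFBox's own rounding may differ by a pixel for page sizes that do not divide evenly.

```java
// Hypothetical helper: estimates rendered image size from page size and DPI.
// Assumption: page dimensions are in PDF points, 72 points per inch.
final class RenderSizeEstimator {

    static final float POINTS_PER_INCH = 72f;

    // Converts a page dimension in points to pixels at the given DPI.
    static int toPixels(float points, float dpi) {
        return Math.max(1, (int) Math.floor(points * dpi / POINTS_PER_INCH));
    }

    // Approximate bytes for an uncompressed raster at 4 bytes per pixel
    // (assumed layout of an int-backed RGB BufferedImage).
    static long rawBytes(float widthPt, float heightPt, float dpi) {
        return 4L * toPixels(widthPt, dpi) * toPixels(heightPt, dpi);
    }

    public static void main(String[] args) {
        // US Letter is 612 x 792 points
        float widthPt = 612f;
        float heightPt = 792f;
        float dpi = 300f;
        System.out.println(toPixels(widthPt, dpi) + " x " + toPixels(heightPt, dpi) + " px");
        System.out.println(rawBytes(widthPt, heightPt, dpi) + " bytes uncompressed");
    }
}
```

For a US Letter page at 300 DPI this gives 2550 x 3300 pixels, roughly 34 MB per page in memory, so lowering the DPI is worth considering when only previews are needed.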
Each example writes one 300 DPI PNG file per page, named after the source file and the one-based page number. The resulting images can then be passed to any image processing workflow.