How to convert HTML to PDF in Java

As more work moves to electronic documents, a common task is converting HTML files to PDF files from Java code. This article introduces three Java approaches to converting HTML to PDF:

1. Use iText to convert HTML to PDF

iText is a popular Java PDF library that can convert HTML to PDF. It parses the HTML and rebuilds the content as PDF page elements. The following is the key code for converting HTML to PDF with iText 5.x (HTMLWorker is deprecated in that version in favor of the separate XML Worker add-on, but it still illustrates the flow):

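import com.itextpdf.text.Document;
import com.itextpdf.text.html.simpleparser.HTMLWorker;
import com.itextpdf.text.pdf.PdfWriter;
import java.io.FileOutputStream;
import java.io.StringReader;

// Assumes iText 5.x on the classpath
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream("output.pdf"));
document.open(); // content can only be added after open()
HTMLWorker htmlWorker = new HTMLWorker(document);
String html = "<p>Hello World</p>";
htmlWorker.parse(new StringReader(html)); // adds the parsed elements to the document
document.close(); // finishes the PDF and closes the underlying stream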

The above code creates a Document object, then uses PdfWriter.getInstance() to bind it to an output stream so that everything added to the document is written to output.pdf. After the document is opened, HTMLWorker parses the HTML and adds the resulting elements to the document. Finally, closing the Document object completes the PDF file.
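
The snippet above hard-codes the HTML string, while the usual goal is to convert an HTML file. The following is a minimal sketch, using only the JDK (Java 11 or later), of loading the markup before handing it to the parser. The class name HtmlSource and its methods are illustrative assumptions and are not part of iText; the file is assumed to be UTF-8 encoded.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not an iText class.
class HtmlSource {
    // Reads the whole HTML file as text (assumes the file is UTF-8 encoded).
    static String load(Path htmlFile) throws IOException {
        return Files.readString(htmlFile, StandardCharsets.UTF_8);
    }

    // Wraps the markup in a Reader, the input type that parse(Reader) expects.
    static Reader asReader(String html) {
        return new StringReader(html);
    }
}
```

The string returned by load() could then replace the literal passed to htmlWorker.parse(new StringReader(html)).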

2. Use Flying Saucer to convert HTML to PDF

Another Java tool that can be used to convert HTML to PDF is Flying Saucer. It is a free and open source renderer that lays out XHTML with CSS and, through its ITextRenderer class, writes the result as a PDF document. The following is sample code for using Flying Saucer to convert HTML to PDF:

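import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xhtmlrenderer.pdf.ITextRenderer;
import org.xml.sax.InputSource;

DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = documentBuilderFactory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(htmlContent)));
ITextRenderer iTextRenderer = new ITextRenderer();
iTextRenderer.setDocument(document, null); // null: no base URL for relative links
iTextRenderer.layout(); // must run before createPDF()
try (OutputStream outputStream = new FileOutputStream("output.pdf")) { // closed automatically
    iTextRenderer.createPDF(outputStream);
}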

The above code first parses the HTML into a W3C DOM Document. Because an XML parser does the reading, htmlContent must be well-formed XHTML. The document is then handed to ITextRenderer with setDocument(), and layout() lays it out into pages. Finally, createPDF() writes the PDF to outputStream.
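
Since builder.parse() uses an XML parser, markup that browsers tolerate (for example an unclosed <br>) makes it throw. The following is a minimal sketch, using only the JDK, of checking whether a string is well-formed before passing it to the renderer. The class name XhtmlCheck and the method isWellFormed are illustrative assumptions and are not part of Flying Saucer.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not a Flying Saucer class.
class XhtmlCheck {
    // Returns true if the markup parses as XML, false if the parser rejects it.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of letting the default handler print them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }
}
```

HTML that fails this check would need to be cleaned up into XHTML, by hand or with an HTML tidying library, before Flying Saucer can render it.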

3. Use PDFBox to convert HTML to PDF

PDFBox is a popular open source Java PDF library that provides many tools for creating and processing PDF files. It does not include an HTML parser or layout engine, so it cannot render HTML by itself; the HTML has to be reduced to text (or laid out by another library) before PDFBox draws it on a page.

The following is sample code that uses PDFBox 2.x to write the text of an HTML fragment to a PDF:

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDDocumentOutline;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;

// Assumes PDFBox 2.x on the classpath
PDDocument document = new PDDocument();
PDPage page = new PDPage();
document.addPage(page);

// Start one inch (72 points) in from the top-left corner of the page
PDRectangle mediaBox = page.getMediaBox();
float margin = 72;
float startX = mediaBox.getLowerLeftX() + margin;
float startY = mediaBox.getUpperRightY() - margin;
float width = mediaBox.getWidth() - 2 * margin; // usable line width; this example does not wrap text

// Naive tag stripping as a stand-in for real HTML parsing; all formatting is lost
String html = "<p>Hello World!</p>";
String text = html.replaceAll("<[^>]*>", "").trim();

// Optional bookmarks (document outline)
PDDocumentOutline outline = new PDDocumentOutline();
document.getDocumentCatalog().setDocumentOutline(outline);
PDOutlineItem item = new PDOutlineItem();
item.setTitle("PDFBox");
PDOutlineItem childItem = new PDOutlineItem();
childItem.setTitle("Hello World 2");
item.addLast(childItem);
outline.addLast(item);

// Draw the text; the content stream must be closed before the document is saved
PDPageContentStream contentStream = new PDPageContentStream(document, page);
contentStream.beginText();
contentStream.setFont(PDType1Font.COURIER, 14);
contentStream.newLineAtOffset(startX, startY);
contentStream.showText(text);
contentStream.endText();
contentStream.close();

document.save("output.pdf");
document.close();

The above code first creates a PDDocument object and adds a new page to it. Because PDFBox cannot parse HTML, the markup is reduced to plain text with a simple regular expression, which discards all formatting. A PDPageContentStream object then draws the text at the top-left margin of the page. Finally, the content stream is closed and the document is saved to output.pdf.
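
The code above computes a usable width from the media box and the margin but draws the text as a single line, so longer text would run past the right edge. The following is a minimal sketch of wrapping text for the Courier font used there. It relies on the fact that every Courier glyph is 600 units wide in a 1000-unit em, so one character takes 0.6 times the font size in points. The class name CourierWrap and its methods are illustrative assumptions and are not part of PDFBox.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a PDFBox class.
class CourierWrap {
    // Courier is monospaced: each glyph advances 600/1000 of the font size.
    static final float COURIER_EM_FRACTION = 0.6f;

    // Number of Courier characters that fit in the given width at the given size.
    static int charsPerLine(float widthPt, float fontSizePt) {
        return (int) Math.floor(widthPt / (COURIER_EM_FRACTION * fontSizePt));
    }

    // Greedy word wrap: start a new line when the next word would not fit.
    // A single word longer than maxChars stays on its own line, unbroken.
    static List<String> wrap(String text, int maxChars) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (line.length() > 0 && line.length() + 1 + word.length() > maxChars) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}
```

On a US Letter page (612 points wide) with 72-point margins, the usable width is 468 points, which holds 55 Courier characters at 14 points. In PDFBox 2.x each returned line could then be drawn with showText(), moving down between lines with newLineAtOffset(0, -leading) for a chosen leading.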

Summary

Converting HTML to PDF is a common requirement in production systems, and it can be handled from Java. This article introduced three tools for the task: iText, Flying Saucer and PDFBox. Choosing the one that best suits your project needs makes development faster and more convenient.
