
Creating PDFs from Markdown with Pandoc and LaTeX

Joseph Gordon-Levitt
2025-02-19 09:48:09

Core points

In this article, author Chris Ward explains how he converts Markdown files to PDFs using Pandoc and LaTeX for his open source board game, Chip Shop. The game components are written in Markdown, and the game website is generated from the same files.

Pandoc (an open source markup conversion tool) and LaTeX (a document preparation and layout system) are used to generate PDFs from Markdown files. Powerful as they are, Ward found no way to combine multiple PDFs onto a single page with them, so he uses the command line tool PDFJam for that step.

The author provides a detailed guide to installing the necessary dependencies (Markdown, Jekyll, Pandoc, LaTeX, PDFJam) and walks through the build process step by step: generating PDFs from Markdown, creating the LaTeX template, and using PDFJam to combine the cards on one page.

The author's ideal workflow is to generate a PDF file while generating a website, rather than when the visitor requests it. This approach also allows the PDF card version to look different from the HTML page without using complex CSS rules.

If you have read some of my posts on SitePoint or elsewhere, you probably know I'm working on a board game. The game, called Chip Shop, lets you run a computer company in 1980s America.

As part of the project, I tried to open source the entire game as much as possible. After a few attempts, I decided to use Markdown as the basic framework for most game components (especially cards and instructions).

Since the game website uses Jekyll, the website is generated from the Markdown files. I also plan to offer a premium pre-boxed version and a self-print version of the game, for which I need to generate PDFs from the same Markdown files.

The goal

My ideal workflow is to generate the PDF files while generating the website, not when a visitor requests them. This rules out wkhtmltopdf, the tool I usually use for PDF generation, because it generates PDFs from the generated HTML. Another reason is that I want the PDF version of the cards to look different from the HTML pages, and Jekyll lacks any kind of "view mode" functionality to achieve this without complex CSS rules.

The Markdown template file for Chip Shop's cards contains many front matter fields for game mechanics, and not every field is used on every card. For easy printing, I need to fit as many cards as possible on an A4 page, in this case a 3×3 grid. Ultimately, the pages need to be printed on both sides, but I haven't implemented that yet.
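As a rough check on the layout arithmetic: a 3×3 grid holds nine cards per A4 sheet, so the number of single-sided sheets is the card count divided by nine, rounded up. A minimal sketch in shell, where the helper name `sheets_needed` is illustrative and not part of the build script:

```shell
# Number of sheets needed for a given card count at 9 cards (3x3) per sheet.
# sheets_needed is a hypothetical helper, not part of the original script.
sheets_needed() {
  cards=$1
  per_sheet=9
  # Integer division rounded up.
  echo $(( (cards + per_sheet - 1) / per_sheet ))
}

sheets_needed 20   # prints 3
```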

Pandoc and LaTeX

Any web search for ways to generate PDFs from Markdown will lead you to Pandoc. Pandoc is an open source Swiss Army knife of markup conversion that supports a wide variety of input and output formats.

To generate PDFs with Pandoc, LaTeX is required. LaTeX has its roots in the scientific research community and is a document preparation and layout system. By combining Pandoc and LaTeX, we can generate PDFs from a series of Markdown files, using variables and Markdown front matter.

Powerful as Pandoc and LaTeX are, I couldn't find any way to combine multiple PDFs (cards) onto a single page, especially when using variables from a Markdown file. After a lot of research, I settled on PDFJam, a simple command line tool that does exactly this.

Installing dependencies

Markdown

You don't need any extra software for Markdown apart from an editor. There are many to choose from, and SitePoint has several articles to help you pick one.

Jekyll

I will continue to use Jekyll to illustrate the build process in the examples taken from my game, but if you don't need a website, it's not a necessary part of PDF generation.

Pandoc

On my Mac, I installed Pandoc using Homebrew, but all operating systems have corresponding options.

LaTeX

There are many opinions on the best way to install LaTeX, depending on your needs and how you intend to use it. A full installation with all of its common tools and libraries can be close to 2GB, but for most purposes a minimal installation is sufficient. Read the project's download page to find the option that works best for you.

In this tutorial, we will use the xelatex engine because I use custom fonts. However, you can choose any engine that provides the specific features you need.

PDFJam

Depending on how you installed LaTeX, you may already have PDFJam. (Type which pdfjam in the terminal to check.) If you don't have it, see the PDFJam project's installation instructions.
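The same kind of check can cover all three command line tools at once. This is a small sketch, assuming the tools are invoked as `pandoc`, `xelatex`, and `pdfjam`; the function name `missing_tools` is made up for illustration:

```shell
# Print the name of every given command that is not on the PATH.
# missing_tools is a hypothetical helper, not part of the original script.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools pandoc xelatex pdfjam)
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All tools found."
fi
```

Running it before the build gives a clearer message than a failure halfway through the script.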

Building process

After some consideration, a bash script run locally seemed to be the best choice for now. There are better ways, but it works, and I can improve the process later by moving it to a continuous integration system or Git hooks.

View bash scripts on GitHub.

Let's introduce this script step by step now.

Setup

<code class="language-bash">bundle install
bundle update

rm -dfr _site
rm -dfr pod</code>

These commands ensure that the Ruby dependencies Jekyll needs to build the website are up to date, and they delete any existing website and print folders.

Build a website

<code class="language-bash">jekyll build
mkdir -p pod/pdf/cards</code>

Next, we build the website and create a folder for the printed version of the card.

Generate PDF from Markdown

Let's fill that folder with a PDF version of each Markdown file:

<code class="language-bash">for filename in _cards/*.md; do
  echo "$filename"
  # --pdf-engine replaced --latex-engine in Pandoc 2.0
  pandoc --from=markdown+yaml_metadata_block --template _layouts/cards.latex -o pod/pdf/cards/"$(basename "$filename" .md)".pdf --pdf-engine=xelatex "$filename"
done</code>

This loop processes each Markdown file in the _cards directory, making sure the Markdown front matter fields are read. Using the cards.latex template (which we will cover later) and the xelatex engine, it outputs a PDF named after the source file.
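The output name comes from `basename`, which strips both the directory and the `.md` suffix. A minimal sketch of that mapping on its own, where `pdf_path_for` and the card file name are illustrative only:

```shell
# Map a card's Markdown path to the PDF path used by the build.
# pdf_path_for is a hypothetical helper name for illustration.
pdf_path_for() {
  echo "pod/pdf/cards/$(basename "$1" .md).pdf"
}

pdf_path_for "_cards/example-card.md"   # prints pod/pdf/cards/example-card.pdf
```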

LaTeX File

Most of the magic of generating card files from Pandoc happens in LaTeX templates.

View LaTeX templates on GitHub.

LaTeX is new to me, but it is not too complicated. I'll explain what I changed from the default LaTeX template (located in Pandoc_install_dir/data/templates/default.latex) to get the cards to work properly. I recommend sharelatex.com for previewing LaTeX files while editing them.

We need a specific page size, and later we will use columns to show the cost and score of the card. We also use graphics and custom fonts, so the template loads the packages needed for those features.

We are trying to create a clear, concise, and simple layout.

I think most of the template is quite easy to understand for anyone used to code or markup. We create the elements of the card, align them, and set the font size. We also check whether each value is present before outputting it, so that the card does not end up with empty fields.

We resize the image to a specific size and center it. The cost and score values are arranged in two columns, set up with the \begin{tabular} command, where the number of columns is determined by the number of l characters.
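To make the value check and the two-column layout concrete, here is a small sketch that is not the author's template. The card content, the file names, and the field names `title`, `cost`, and `score` are assumptions for illustration. It writes a sample card and a tiny template, then asks Pandoc for LaTeX output rather than a PDF, so no LaTeX engine is needed to see the result:

```shell
# Sketch only: file names, field names, and template body are illustrative.
workdir=$(mktemp -d)

# A hypothetical card: front matter carries the values, "score" is left out.
cat > "$workdir/card.md" <<'EOF'
---
title: Example Card
cost: 3
---
Card text goes here.
EOF

# A tiny template: $if(...)$ emits a cell's text only when the field has a value.
cat > "$workdir/card.latex" <<'EOF'
\documentclass{article}
\begin{document}
\begin{center}\Large $title$ \end{center}
\begin{tabular}{l l}
$if(cost)$Cost: $cost$ $endif$ & $if(score)$Score: $score$ $endif$
\end{tabular}

$body$
\end{document}
EOF

# Render to LaTeX source if Pandoc is installed, then show the result.
if command -v pandoc >/dev/null 2>&1; then
  pandoc --from=markdown+yaml_metadata_block --template "$workdir/card.latex" \
    --to=latex -o "$workdir/card.tex" "$workdir/card.md"
  cat "$workdir/card.tex"
fi
```

Because the sample card has no score field, the second cell of the table comes out empty instead of showing a bare label. The two l characters in `{l l}` give two left-aligned columns.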

Combining cards on one page

We use PDFJam to create one large PDF file containing every individual card PDF. The pdfjam invocation specifies the following:

  • The page orientation should always be portrait
  • Each individual PDF should be framed
  • Grid size
  • File name suffix
  • File name

PDFJam may give an error if you are not outputting to its working directory, so I move the file to where I actually want it afterwards (hopefully this will be fixed in the future). At this point we can also delete the individual PDF files if we don't need them.

That's it: we have a website and a printable PDF of the game cards.
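The options listed above correspond to standard pdfjam flags. The following is a sketch rather than the exact command from the build script: the suffix value `complete`, the output name `cards.pdf`, and the destination `pod/cards.pdf` are assumptions, and `pdfjam_options` is a hypothetical helper.

```shell
# Sketch only: the suffix and paths below are assumptions.
# Build the pdfjam options for a given grid (3x3 by default).
pdfjam_options() {
  grid=${1:-3x3}
  echo "--no-landscape --frame true --nup $grid --suffix complete --outfile ./cards.pdf"
}

# Run only when pdfjam and at least one card PDF are available.
if command -v pdfjam >/dev/null 2>&1 && ls pod/pdf/cards/*.pdf >/dev/null 2>&1; then
  # Word splitting of the options string is intentional here.
  pdfjam pod/pdf/cards/*.pdf $(pdfjam_options 3x3)
  # Output goes to the working directory first, then is moved into place.
  mv cards.pdf pod/cards.pdf
fi
```

Here `--no-landscape` keeps the page in portrait, `--frame true` draws a border around each card, and `--nup 3x3` sets the grid.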

Run script

I use ./build.sh to run the build script. Because there are a lot of images and a lot of PDF processing, it takes about five to ten minutes. I then have a separate script that deploys these folders to the web server.

Next steps

This process took me a while to get right, but it's good enough for now. I will continue to improve the process and the layout after play testing the game.

I hope you find my research and experiments useful to your project. If you have any comments or suggestions, please let me know.

Frequently asked questions (FAQs) about creating PDFs from Markdown using Pandoc and LaTeX

How do I install Pandoc and LaTeX on my system?

To install Pandoc, you can download it from the official website (https://www.php.cn/link/8f1dd6e7a88b9cf615c146330c591ba9).

Can I customize the appearance of PDFs created using Pandoc and LaTeX?

Yes, you can use LaTeX templates to customize the appearance of the PDF. Pandoc uses the default template to generate PDFs, but you can specify your own templates using the --template option. You can create your own templates or use one of the many templates available online, such as those found in the Wandmalfarbe Pandoc LaTeX template GitHub repository.

How do I convert Markdown files to PDF using Pandoc and LaTeX?

To convert a Markdown file to a PDF, you can use the following command in a terminal or command prompt: pandoc yourfile.md -o yourfile.pdf. Replace yourfile.md with the name of your Markdown file and yourfile.pdf with the desired name of your PDF file. This command tells Pandoc to convert the Markdown file to PDF using the default LaTeX template.
