Core points
In this article, author Chris Ward explains how he converts Markdown files to PDFs using Pandoc and LaTeX for his open source board game, Chip Shop. The game components are written in Markdown, and the game website is generated from the same files.
Pandoc (an open source markup conversion tool) and LaTeX (a document declaration and layout system) are used to generate PDFs from Markdown files. Despite their powerful capabilities, they cannot combine multiple PDFs onto a single page, so Ward uses the command line tool PDFJam to meet this requirement.
The author provides a detailed guide to installing the necessary dependencies (Markdown, Jekyll, Pandoc, LaTeX, PDFJam) and walks through the build process step by step, including generating PDFs from Markdown, creating the LaTeX template, and using PDFJam to combine the cards on one page.
The author's ideal workflow is to generate a PDF file while generating a website, rather than when the visitor requests it. This approach also allows the PDF card version to look different from the HTML page without using complex CSS rules.
If you have read some of my posts on SitePoint or elsewhere, you probably know I'm working on a board game. The game, called Chip Shop, lets you run a computer company in 1980s America.
As part of the project, I have tried to open source as much of the game as possible. After a few attempts, I decided to use Markdown as the basis for most game components (especially the cards and instructions).
Since the game website uses Jekyll, it is generated from the same Markdown files. I'm going to make a premium pre-boxed version and a self-printed version of the game, for which I need to generate PDFs from the Markdown files.
Target
My ideal workflow is to generate the PDF files when the website is generated, not when a visitor requests them. This rules out wkhtmltopdf, which I usually use for PDF generation, because it creates PDFs from already generated HTML. Another reason is that I want the PDF version of the cards to look different from the HTML pages, and Jekyll lacks any kind of "view mode" functionality to achieve this without complex CSS rules.
The Markdown template file for Chip Shop cards contains many front matter fields for game mechanics, and not every field is used on every card. For easy printing, I need to fit as many cards as possible on an A4 page, in this case a 3×3 grid. Ultimately, the pages need to be printed on both sides, but I haven't implemented that yet.
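As a rough illustration of what such a card file might look like, the sketch below writes a sample card with front matter (the block of fields at the top of the file) and then lists the field names. The file name and the fields title, cost, score and image are assumptions made for this example, not necessarily the fields the game uses.

```shell
# Hypothetical card file: the file name and field names are illustrative only.
cards_dir="$(mktemp -d)/_cards"
mkdir -p "$cards_dir"

cat > "$cards_dir/example-card.md" <<'EOF'
---
title: Example Card
cost: 3
score: 2
image: example.png
---
Card text written in ordinary Markdown goes here.
EOF

# List the front matter field names (the lines between the two --- markers).
fields="$(awk '/^---$/ { n++; next } n == 1 { sub(/:.*/, ""); print }' "$cards_dir/example-card.md")"
echo "$fields"
```

A card that does not need a field simply leaves it out, which is why the template has to check whether a value exists before printing it.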
Pandoc and LaTeX
Any web search for ways to generate PDFs from Markdown will lead you to Pandoc. Pandoc is an open source, Swiss Army knife markup conversion tool that supports a wide variety of input and output markup formats.
To generate PDFs with Pandoc, LaTeX is required. LaTeX originated in the scientific research community and is a document declaration and layout system. With Pandoc and LaTeX combined, we can generate PDFs from a series of Markdown files, using variables and with support for Markdown front matter.
Powerful as Pandoc and LaTeX are, I couldn't find any way to combine multiple PDFs (cards) onto a single page, especially when using variables in the Markdown files. After a lot of research, I chose PDFJam, a simple command line tool that does exactly this.
Installing dependencies
You don't need any extra software for Markdown, except perhaps an editor. There are a lot of editors available, and I suggest you read some SitePoint articles to make your choice.
I will continue to use Jekyll to illustrate the build process in the examples taken from my game, but if you don't need a website, it's not a necessary part of PDF generation.
On my Mac, I installed Pandoc using Homebrew, but all operating systems have corresponding options.
There are many opinions about the best way to install LaTeX, and the right choice depends on your needs and how you intend to use it. A full installation with all the common tools and libraries can be close to 2GB, but for most purposes a minimal installation is sufficient. Read the project's download page to find the option that works best for you.
In this tutorial, we will use the xelatex engine because I use custom fonts. However, you can choose any engine that provides the specific features you need.
Depending on how you installed LaTeX, you may already have PDFJam. (Type which pdfjam in the terminal to check.) If you don't have it installed, see the PDFJam installation details.
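Before running a build, it can help to confirm that all of the tools are on the PATH. The following is a minimal sketch of such a check; the function name check_tools is made up for this example, and the list of tools assumes the xelatex engine and the Jekyll setup described in this article.

```shell
# Report which of the command line tools used by the build are on the PATH.
# Returns 0 if every tool was found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools jekyll pandoc xelatex pdfjam || echo "Install the missing tools before building."
```

On a Mac with Homebrew, a missing Pandoc can be installed with brew install pandoc.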
Building process
After some consideration, running a bash script locally seemed to be the best choice for now. There are better ways, but it works, and I can improve the process later by moving it to a continuous integration system or Git hooks.
View the bash script on GitHub.
Let's walk through the script step by step.
<code class="language-bash">bundle install
bundle update
rm -rf _site
rm -rf pod</code>
These commands ensure that the Ruby dependencies Jekyll needs to build the website are up to date, and delete any existing website and print folders.
<code class="language-bash">jekyll build
mkdir -p pod/pdf/cards</code>
Next, we build the website and create a folder for the print version of the cards.
Then we fill that folder with a PDF version of each Markdown file:
<code class="language-bash">for filename in _cards/*.md; do
    echo "$filename"
    # Pandoc 2.0 and later use --pdf-engine; older releases called it --latex-engine
    pandoc --from=markdown+yaml_metadata_block \
        --template _layouts/cards.latex \
        -o pod/pdf/cards/"$(basename "$filename" .md)".pdf \
        --pdf-engine=xelatex "$filename"
done</code>
This loop processes each Markdown file in the _cards directory, making sure the Markdown front matter fields are read. Using the cards.latex template (which we will cover next) and the correct LaTeX engine, it outputs a PDF with a matching name.
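The output name comes from basename, which strips both the directory and the .md suffix from the source path. A small sketch of that mapping on its own, where pdf_path_for is a helper name invented for this example:

```shell
# Map a Markdown source path to the PDF path used for the print version.
pdf_path_for() {
    printf 'pod/pdf/cards/%s.pdf\n' "$(basename "$1" .md)"
}

pdf_path_for "_cards/example-card.md"   # prints pod/pdf/cards/example-card.pdf
```

Quoting "$1" keeps the mapping correct for file names that contain spaces.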
Most of the magic of generating card files from Pandoc happens in LaTeX templates.
View the LaTeX template on GitHub.
LaTeX is new to me, but it is not too complicated. I'll explain what I changed from the default LaTeX template (located in Pandoc_install_dir/data/templates/default.latex) to get the cards to work properly. I recommend sharelatex.com for previewing LaTeX files while editing them.
The template sets a specific page size and loads a package for columns, which we use later to show the cost and score of the card. Because the cards use graphics and custom fonts, it loads the packages for those as well.
The aim is a clear, concise and simple layout, which the body of the template linked above implements.
I think a lot of the template is quite easy to understand for anyone who is used to code or markup. We create the elements of the card, align them, set the font sizes, and check whether each value exists before outputting it, so that the card does not end up with empty fields.
We resize the image to a specific size and center it. The cost and score values are arranged in two columns, set up with the \begin{tabular} command, where the number of columns is given by the number of l characters in its column specification.
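To make these ideas concrete, here is a minimal sketch of a card template, written to a temporary file from the shell. It is not the game's actual template: the page dimensions, the image width, and the variables title, image, cost and score are assumptions for illustration. The $if(...)$ and $endif$ markers are Pandoc's template syntax for printing something only when a value exists, and $body$ is where Pandoc inserts the converted Markdown.

```shell
# Write a minimal, hypothetical card template to a temporary file.
template="$(mktemp -d)/cards.latex"

# The quoted 'EOF' stops the shell from expanding the $...$ template variables.
cat > "$template" <<'EOF'
\documentclass{article}
% Card-sized page (the dimensions are placeholders)
\usepackage[paperwidth=63mm,paperheight=88mm,margin=4mm]{geometry}
\usepackage{graphicx}  % images
\usepackage{fontspec}  % custom fonts (needs xelatex or lualatex)
\pagestyle{empty}
\begin{document}
\begin{center}
$if(title)$
{\Large\bfseries $title$}\par
$endif$
$if(image)$
\includegraphics[width=40mm]{$image$}
$endif$
\end{center}
$if(cost)$
% Two left-aligned columns: one "l" per column
\begin{tabular}{l l}
Cost: $cost$ & Score: $score$ \\
\end{tabular}
$endif$

$body$
\end{document}
EOF

echo "Wrote $template"
```

A template this small only copes with plain paragraphs in the card body. The default template defines helper macros that richer Markdown, such as lists, relies on, which is one reason to start from default.latex and trim it down rather than write a template from scratch.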
Finally, we use PDFJam to combine the individual card PDFs into one large PDF file. The PDFJam command and the options it specifies are in the build script linked above.
PDFJam may give an error if you are not outputting to its working directory, so I move the file to where I actually want it afterwards (hopefully this will be fixed in the future). At this point we can also delete the individual PDF files if we don't need them.
That's it: we have a website and a printable PDF of the game cards.
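A minimal sketch of this step might look like the following. The function names, the temporary file name cards-sheet.pdf and the destination pod/cards.pdf are assumptions for this example, and the options shown (a 3×3 grid, A4 paper, portrait orientation, a frame around each card) are one plausible combination rather than necessarily the ones the build script uses. The run helper prints commands instead of executing them when DRY_RUN is set, which makes it possible to check the command without PDFJam installed.

```shell
# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Combine single-card PDFs into 3x3 sheets, then move the result into place.
make_sheet() {
    src_dir="$1"               # folder of single-card PDFs
    dest="$2"                  # final location of the combined PDF
    tmp_out="cards-sheet.pdf"  # written to the working directory first

    run pdfjam "$src_dir"/*.pdf --nup 3x3 --paper a4paper --no-landscape \
        --frame true --outfile "$tmp_out" &&
        run mv "$tmp_out" "$dest"
}

# Print the commands without running them.
DRY_RUN=1 make_sheet pod/pdf/cards pod/cards.pdf
```

Dropping the DRY_RUN=1 prefix runs the commands for real. Writing to the working directory first and moving the file afterwards follows the workaround described above.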
I use ./build.sh to run the build script. Since there is a lot of image and PDF processing, it takes about five to ten minutes. Then I have a separate script to deploy these folders to the web server.
Next steps
This process took me a while to get right, but it is good enough for now, and I can continue to improve the process and layout after play testing the game.
I hope you find my research and experiments useful to your project. If you have any comments or suggestions, please let me know.
Frequently asked questions (FAQs) about creating PDFs from Markdown using Pandoc and LaTeX
To install Pandoc, you can download it from the official website (https://www.php.cn/link/8f1dd6e7a88b9cf615c146330c591ba9).
Yes, you can use LaTeX templates to customize the appearance of the PDF. Pandoc uses its default template to generate PDFs, but you can specify your own template using the --template option. You can create your own templates or use one of the many templates available online, such as those found in the Wandmalfarbe Pandoc LaTeX template GitHub repository.
To convert a Markdown file to a PDF, you can use the following command in a terminal or command prompt: pandoc yourfile.md -o yourfile.pdf. Replace yourfile.md with the name of your Markdown file and yourfile.pdf with the desired name of your PDF file. This command tells Pandoc to convert the Markdown file to PDF using the default LaTeX template.