Document writing with Markdown and LaTeX

Technical documents, such as academic reports or papers, are frequently written with the powerful and versatile TeX typesetting system, most commonly in the form of the LaTeX document preparation system. With a proper template and layout, this system produces great-looking results.

But most of the time I don’t need the full power of TeX (equations etc.), and I really cannot be bothered to write verbose constructs such as \textbf{} or \texttt{}, because LaTeX source is not exactly beautiful. I would much rather write plain markdown: **bold**, *italic*, or code blocks with backticks.

For this scenario, pandoc is the perfect tool. It is a Swiss Army knife for document conversion, and converting markdown documents into LaTeX and then to PDF is a particularly good use case (an example of how to do this). If you ever need more advanced LaTeX features, you can embed them directly in the markdown (e.g. for citations, figures or tables).
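As a minimal sketch of such embedding, the following writes a markdown file that mixes ordinary markdown with a raw LaTeX table. The file name `embedded-example.md` and the table contents are made up for illustration. With pandoc’s default markdown reader, raw LaTeX like this is passed through unchanged when the output format is LaTeX.

```shell
# Write a sample markdown file that contains a raw LaTeX table.
# The quoted 'EOF' keeps the shell from touching backslashes or $ signs.
cat > embedded-example.md <<'EOF'
# Results

Plain **markdown** text, followed by a table written directly in LaTeX:

\begin{table}[h]
  \centering
  \begin{tabular}{ll}
    Tool    & Role \\
    pandoc  & markdown to LaTeX \\
    latexmk & LaTeX to PDF \\
  \end{tabular}
  \caption{A table that markdown syntax alone could not style this way.}
\end{table}
EOF
```

Converting the file with `pandoc -o embedded-example.tex embedded-example.md` should leave the table environment intact and translate only the markdown parts.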

Furthermore, pandoc templates can be used to add some styling around these (initially) plain PDF documents. The Eisvogel template is one of my favorites and I have used it several times before for papers, reports etc. which don’t require a specific format.

For my master’s thesis, however, I already have a fixed LaTeX template provided by the university (which is great of course, because I don’t have to create it myself!). These LaTeX templates can be quite finicky and brittle, so I was not really looking forward to porting the LaTeX template into a pandoc template.

The following three steps describe what I did to write the document body in markdown, preprocess the markdown with pandoc and generate the PDF with LaTeX.

Important remark: before doing all of this, you should make sure the LaTeX template you are modifying actually compiles in its original form! Otherwise you might spend a lot of time debugging what you did wrong, when in fact the original document was faulty.

# Step 1: Remove the TeX document body

I skipped past the preamble in the provided LaTeX template, removed all the major sections from the main document and instead put them into individual files, so that only the following content remains in the main document (apart from preamble and document declarations):

%% Note: each include automatically produces a \clearpage
\include{include/01-introduction}
\include{include/02-background}
\include{include/03-research}
\include{include/04-evaluation}
\include{include/05-conclusion}

Thus, we end up with the following directory structure:

├── include
│   ├── 01-introduction.tex
│   ├── 02-background.tex
│   ├── 03-research.tex
│   ├── 04-evaluation.tex
│   └── 05-conclusion.tex
└── thesis.tex

# Step 2: Preprocess markdown to LaTeX

To transform markdown files into TeX files, I’m using the awesome pandoc:

# short form:
pandoc -o out.tex in.md
# long form (equivalent):
pandoc -f markdown -t latex -o out.tex in.md

After running this command for each file, the directory should look like this:

├── include
│   ├── 01-introduction.md
│   ├── 01-introduction.tex
│   ├── 02-background.md
│   ├── 02-background.tex
│   ├── 03-research.md
│   ├── 03-research.tex
│   ├── 04-evaluation.md
│   ├── 04-evaluation.tex
│   ├── 05-conclusion.md
│   └── 05-conclusion.tex
└── thesis.tex
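Running the command once per file can be sketched as a small shell loop. The function name `convert_all` is made up for this example, and the loop assumes the markdown files sit directly inside the given directory, as in the tree above.

```shell
# Convert every markdown file in a directory to a LaTeX file next to it.
# convert_all is a hypothetical helper name, not part of pandoc.
convert_all() {
    dir="${1:-include}"                # default to the include/ directory
    for md in "$dir"/*.md; do
        [ -e "$md" ] || continue       # skip if the glob matched nothing
        # ${md%.md}.tex swaps the .md suffix for .tex
        pandoc -f markdown -t latex -o "${md%.md}.tex" "$md"
    done
}
```

Calling `convert_all include` from the directory that contains `thesis.tex` would then produce one `.tex` file per `.md` file. The loop always reconverts everything, which is one reason to hand the job to make later on.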

For example, the file 01-introduction.md:

# Introduction

**Hello, world!**

## History

Lorem ipsum.

will be converted into 01-introduction.tex:

\hypertarget{introduction}{%
\section{Introduction}\label{introduction}}

\textbf{Hello, world!}

\hypertarget{history}{%
\subsection{History}\label{history}}

Lorem ipsum.

Note that pandoc conveniently also generates labels for each section - handy!

# Step 3: Automate it with a Makefile

Obviously, I don’t want to run this manually for each file every time I change something, so in the next step I will roll it up into a Makefile.

The Makefile takes care of all preprocessing steps (converting markdown to LaTeX and SVG to PDF), invokes latexmk to build the final PDF from the LaTeX sources, manages all build dependencies and cleans up, if necessary.

This Makefile is partially based on the Makefiles I have linked to in the references below.

# Makefile for Master's Thesis

## The output filename
TARGET    := thesis.pdf

## Input files
MD_SRC    := $(wildcard include/*.md)
TEX_SRC   := $(patsubst %.md,%.tex,$(MD_SRC))
RASTERIMG := $(wildcard images/*.png)
VECTORIMG := $(wildcard images/*.svg)
BIB       := include/references.bib


.PHONY: all clean distclean

## "all" should be the first (default) target in the makefile
all: $(TARGET)

## Note:
## '$@' is a variable holding the name of the target,
## and '$<' is a variable holding the (first) dependency of a rule.

## Produce final target from all input files
$(TARGET): $(TEX_SRC) $(RASTERIMG) $(patsubst %.svg,%.pdf,$(VECTORIMG)) $(BIB)


## Convert markdown source to LaTeX
%.tex: %.md
	pandoc -f markdown -t latex -o $@ $<

## Convert SVG vector graphics to PDF
## (Inkscape 0.9x syntax; Inkscape 1.x replaced -A with --export-filename=$@)
%.pdf: %.svg
	inkscape -A $@ $<

## Generate PDF from LaTeX
%.pdf: %.tex
	latexmk -use-make -pdf -pdflatex="pdflatex -interaction=nonstopmode" $<

## Clean most things
clean:
	-latexmk -c
	rm -f *.aux *.idx *.ind *.out *.toc *.log *.bbl *.blg *.brf *.lof *.lot *.xmpdata
	rm -f include/*.aux include/*.tex

## Clean everything
distclean: clean
	latexmk -C
	rm -f *.pdf

Now I can just run make (or make all or make thesis.pdf) to generate my beautiful PDF and make will automatically detect which files changed and which parts need to be rebuilt.

Happy writing!

By the way: if you find yourself debugging your Makefile (like I had to do for quite a while), try make -n and make --debug=implicit. If that is not enough, have a look at the references below.
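To get a feel for what a dry run reports, here is a sketch that builds a throwaway project with only the markdown pattern rule and asks make what it would do. The temporary directory and the single sample file are made up for illustration, and GNU make is assumed.

```shell
# Create a throwaway project that mimics the "%.tex: %.md" pattern rule.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/include"
printf '# Introduction\n' > "$demo_dir/include/01-introduction.md"

# printf emits the literal tab that make requires before a recipe line;
# %% becomes a single % and the single quotes keep $@ and $< untouched.
printf '%%.tex: %%.md\n\tpandoc -f markdown -t latex -o $@ $<\n' > "$demo_dir/Makefile"

# -n prints the commands make would run without executing them,
# so pandoc does not need to be installed for this check.
dry_run=$(cd "$demo_dir" && make -n include/01-introduction.tex)
echo "$dry_run"
```

The dry run prints the pandoc command with `$@` and `$<` already expanded to the target and source file names, which makes it easy to spot a rule that matches the wrong files. Swapping `-n` for `--debug=implicit` additionally shows how make searches its implicit rules.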

# References