Convert Markdown to PDF with Pandoc

September 17, 2024, 12:05 pm

I host my CV on my personal website, and I have always wanted to generate a PDF version of that page. I could never get the image to render properly, though, so until now I simply deleted the image from the Markdown before converting it to PDF.

After some Google-Fu I finally solved this issue.

Pandoc

Pandoc is powerful but complicated, partly because it relies on other tools such as pdfroff and LaTeX to produce PDF output.

This is the command I finally settled on in my pipeline.

sed '1 { /^+++/ { :a N; /\n+++/! ba; d} }' content/cv.sv.md | \
    sed -E 's/\{\{< figure src="([^"#]+)#.+" .+ >\}\}/![]\(\1\)/' | \
    pandoc --resource-path='.:static' -f markdown-implicit_figures -t latex --toc-depth=1 - -o public/cv.sv.pdf

The first sed command only removes the front matter from the Markdown. Pandoc won't understand what it is, so it would otherwise be visible in the PDF.
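To see what the front matter removal does in isolation, here is a small sketch. It assumes GNU sed and uses a made-up sample file (the title and body lines are placeholders, not my real CV).

```shell
# Hypothetical sample input; the real file is content/cv.sv.md.
sample=$(mktemp)
printf '%s\n' '+++' 'title = "CV"' '+++' '# Curriculum Vitae' 'Some text' > "$sample"

# Same sed program as in the pipeline: if line 1 starts with +++,
# keep appending lines (N) until a second +++ line has been read,
# then delete the whole accumulated block (d).
stripped=$(sed '1 { /^+++/ { :a N; /\n+++/! ba; d} }' "$sample")

printf '%s\n' "$stripped"
rm -f "$sample"
```

Only the two lines after the closing +++ should remain. A file that does not start with +++ passes through unchanged, because the whole program is restricted to line 1.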

The second sed command used to delete the line with my Hugo figure shortcode, but now it converts the shortcode into a normal Markdown image, which pandoc can understand. This sed is very specific to my setup, as it matches my particular way of linking this image.
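A quick way to check the shortcode conversion is to feed it a single line. This is a sketch assuming GNU sed; the image path, the #fragment and the alt attribute below are invented for illustration and will differ from site to site.

```shell
# Hypothetical shortcode line; path, fragment and attributes are made up.
line='{{< figure src="images/me.jpg#floatright" alt="Portrait" >}}'

# Same substitution as in the pipeline: capture the path up to the '#',
# drop the fragment and the remaining attributes, emit a Markdown image.
converted=$(printf '%s\n' "$line" |
    sed -E 's/\{\{< figure src="([^"#]+)#.+" .+ >\}\}/![]\(\1\)/')

printf '%s\n' "$converted"
```

With this input the result should be ![](images/me.jpg). Note that the pattern only matches when the src contains a # and at least one more attribute follows it; a shortcode written differently is left untouched.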

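Lastly, the pandoc command uses the input format markdown-implicit_figures, that is, pandoc's Markdown with the implicit_figures extension turned off, which I found was the only format that could handle the image correctly. Pandoc also needs a path to your image resources via --resource-path; in this case that is the static dir in addition to the current directory. Finally, I set the output format to latex with the -t latex argument. Because the output file ends in .pdf, pandoc runs the LaTeX result through a PDF engine instead of writing a .tex file.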
