Friday, June 03, 2022

49 Days in One - In Praise of Programmatic PDF Generation

My shul's Omer Learning project is wrapping up. This year we used the 49-day counting of the Omer to learn about Shemitah, the surprisingly progressive seven-year cycle called for by the Torah.

The Omer Learning project works by collecting short submissions on a topic and then sharing them as blog, e-mail and social media posts on a nightly basis. The idea is both to mark the daily count of the omer and to learn a little something along the way. (Note: if all you want to do is count the omer, no project beats

As the project closes out, I wanted to send out a summary of all the days so readers could see any they missed or review any that were especially inspiring. Posting them all to a single blog entry would be excessive. I considered crafting a simple single-page website to host the content, but that seemed overly complex and would still require people to click around to read all the entries.

Let's Build a PDF

Ultimately, I decided on creating a PDF that would serve as a standalone record of all submissions. I suppose the conventional way to generate this kind of document would be to copy and paste each submission into a Word document or Google Doc and format each of them by hand. But as a programmer, there was no way I was going that route. Instead, I decided I'd rely on one of my favorite PHP packages: fpdf.

Fpdf allows for the creation of PDF files programmatically via PHP. I started with two .csv files, one containing the daily submissions and the other containing a list of the names of those who contributed. I then ran them through my make_book.php script to generate this output:

# make book.pdf
$ php -f make_book.php

# Bonus: make the cover image shown below. Thanks, Stack Overflow.
$ convert -density 300 -depth 8 -quality 90 -background white \
     -alpha remove -alpha off 'book.pdf[0]' cover.jpg

PDF Generation Challenges and Solutions

You can find the PHP code that generates this document here. While the code didn't take long to write, I did solve a number of interesting and common challenges along the way.

  • How can I include custom fonts in the document? Download them from the web and generate the appropriate font files by using fpdf's makefont.
  • How can I switch fonts in a document without getting confused? Implement the withStyle(...) function that sets a preferred style, executes arbitrary code and then sets the font and color back to what they were.
  • How can I pull data in from CSV files? fgetcsv.
  • How can I add a gray border to emphasize submitted content? Make use of withIndent(...) and Line(...).
  • How can I deal with gibberish characters being inserted into the document? I cheated for this one, opting for an on-the-fly str_replace from fancy quotes and apostrophes to simple versions. A better solution would have been to figure out how to get those nice-looking characters to render properly.
  • How can I ensure that a day's entry isn't split across pages? Make use of withSmartBreak(...), which takes a function that lays out content on a PDF page. First, I execute this function on an in-memory scratchpad document and measure how much space was consumed. Then I compare that to the current page location and see if the content will fit. If not, I add a new page.
  • What do I do with hyperlinks in the content? Use the API to convert all URLs to shortened versions. I then include those compact links in the output. Fpdf also makes it trivial to link text to a URL. The result is that whether you're looking at the PDF on a device or in printed form, you can follow URLs with relative ease.
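
To make the CSV and gibberish-character items above concrete, here is a minimal sketch of the idea. It is not the code from make_book.php: the helper names readCsv and plainQuotes, and the day/text column names, are made up for illustration, and it assumes every row has the same number of fields as the header. The gibberish most likely appears because the input is UTF-8 while fpdf's standard fonts expect a single-byte encoding, so swapping curly quotes for plain ones sidesteps the problem for those characters only.

```php
<?php
// Hypothetical helper: replace curly quotes and apostrophes with ASCII ones.
function plainQuotes(string $text): string {
    return str_replace(
        ["\u{2018}", "\u{2019}", "\u{201C}", "\u{201D}"],
        ["'", "'", '"', '"'],
        $text
    );
}

// Hypothetical helper: read a CSV file with a header row into an array of
// associative arrays, cleaning each field along the way.
function readCsv(string $path): array {
    $fh = fopen($path, 'r');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $header = fgetcsv($fh, 0, ',', '"', '\\');
    if ($header === false) {      // empty file
        fclose($fh);
        return [];
    }
    $rows = [];
    while (($fields = fgetcsv($fh, 0, ',', '"', '\\')) !== false) {
        if ($fields === [null]) { // fgetcsv reports a blank line this way
            continue;
        }
        $rows[] = array_combine($header, array_map('plainQuotes', $fields));
    }
    fclose($fh);
    return $rows;
}

// Demo with a throwaway file; the column names are illustrative only.
$demo = tempnam(sys_get_temp_dir(), 'omer');
file_put_contents($demo, "day,text\n1,\"\u{201C}Rest\u{201D} isn\u{2019}t idle\"\n");
print_r(readCsv($demo));
unlink($demo);
```

Each row comes back keyed by column name, with the fancy punctuation already flattened, so the layout code never sees the problem characters.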
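
The save-and-restore idea behind withStyle(...) can be sketched roughly as follows. This is an illustration of the pattern, not the function from the source code: StyleDoc is a made-up stand-in that only tracks style state, not the fpdf class, and the signature shown here is an assumption. A real version would most likely live in a subclass of fpdf, since fpdf does not offer public getters for the current font.

```php
<?php
// Stand-in for a PDF document. NOT the fpdf class: it only records style state.
class StyleDoc {
    public $font = ['Arial', '', 12];  // family, style, size
    public $color = [0, 0, 0];         // RGB text color
    public $log = [];                  // what was "written", and in which font

    public function setFont($family, $style, $size) {
        $this->font = [$family, $style, $size];
    }
    public function setTextColor($r, $g, $b) {
        $this->color = [$r, $g, $b];
    }
    public function write($text) {
        $this->log[] = $this->font[0] . $this->font[1] . ': ' . $text;
    }
}

// Apply a style, run the callback, then put the previous style back,
// even if the callback throws.
function withStyle(StyleDoc $doc, array $font, array $color, callable $body) {
    $savedFont = $doc->font;
    $savedColor = $doc->color;
    $doc->setFont(...$font);
    $doc->setTextColor(...$color);
    try {
        return $body($doc);
    } finally {
        $doc->setFont(...$savedFont);
        $doc->setTextColor(...$savedColor);
    }
}

$doc = new StyleDoc();
$doc->write('body text');
withStyle($doc, ['Times', 'I', 10], [90, 90, 90], function ($d) {
    $d->write('a quoted submission');
});
$doc->write('body text again');
print_r($doc->log);
```

Because the restore happens in one place, nested calls unwind cleanly and the calling code never has to remember which font was active before.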
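
The measure-then-place logic described for withSmartBreak(...) might look something like the sketch below. Again, this is a guess at the shape of the technique rather than the author's implementation: PageDoc is a made-up stand-in that only tracks a vertical cursor, not the fpdf class, and all the dimensions are arbitrary. It also assumes a single block never needs more than one page.

```php
<?php
// Stand-in for a PDF document. NOT the fpdf class: it only tracks the cursor.
class PageDoc {
    public $pageHeight;
    public $margin;
    public $lineHeight;
    public $y;
    public $page = 1;

    public function __construct($pageHeight = 280.0, $margin = 10.0, $lineHeight = 5.0) {
        $this->pageHeight = $pageHeight;
        $this->margin = $margin;
        $this->lineHeight = $lineHeight;
        $this->y = $margin;
    }
    public function addPage() {
        $this->page++;
        $this->y = $this->margin;
    }
    public function writeLines($count) {
        $this->y += $count * $this->lineHeight;
    }
    // Vertical space left before the bottom margin.
    public function remaining() {
        return $this->pageHeight - $this->margin - $this->y;
    }
}

function withSmartBreak(PageDoc $doc, callable $layout) {
    // 1. Dry run on a scratch document with the same geometry.
    $scratch = new PageDoc($doc->pageHeight, $doc->margin, $doc->lineHeight);
    $layout($scratch);
    $needed = $scratch->y - $scratch->margin;

    // 2. Start a new page if the block will not fit in what is left
    //    (unless we are already at the top of a fresh page).
    if ($needed > $doc->remaining() && $doc->y > $doc->margin) {
        $doc->addPage();
    }

    // 3. Lay the content out for real.
    $layout($doc);
}

$doc = new PageDoc(100.0, 10.0, 5.0);
$doc->writeLines(12);  // most of page 1 is now used
withSmartBreak($doc, function ($d) { $d->writeLines(6); });
echo "page {$doc->page}, y = {$doc->y}\n";
```

Running the layout callback twice costs a little time, but it means the callback needs no special knowledge of page boundaries.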

If you want to see these solutions in detail, check out the source code. I couldn't be more pleased with how this all turned out. The document looks sharp and the process of creating it was painless. And no copying and pasting of content was ever needed.

Download It

Here's the generated PDF: Omer Learning 2022: 49 Days to a Greener and More Equitable Community. Come for the lessons on fpdf, stay for the wisdom of Shemitah.
