Wednesday, August 01, 2012

An Even Easier Approach to PDF Generation in PHP

My usual approach to PDF generation in PHP is to create the document using FPDF. It's free, powerful and flexible enough to do whatever I want.

Yesterday, however, one of my clients came to me with a request where FPDF looked like it wasn't going to work: fill in a government PDF form on the fly. Sure, I could have tried to replicate every detail of the form using FPDF, but that was going to be time-consuming, to say the least. Nope, I needed an alternative approach.

After a bit of research, I found the tool that would save the day: pdftk, the PDF Toolkit. This Unix-style command line tool handles a number of utility tasks with PDFs: it merges and splits them, adds backgrounds and foregrounds, and, for my purposes, fills in forms to generate a new PDF.

Installing pdftk would have been tricky, but I managed to find these CentOS instructions, so I was good to go.

The strategy I used to fill in forms on the fly with pdftk was this:

1. Use pdftk form.pdf generate_fdf output form.fdf to generate an initial FDF. The form filling process requires two things: (a) a PDF form and (b) the input data, which is provided in FDF or XFDF format. FDF is a fancy way of packing name/value pairs. For this to work, you of course need to know the exact field names used in the PDF. Generating and examining an FDF document with the above command tells you exactly that.

2. Generate an XFDF file on the fly. I suppose you could generate an old-school FDF document, but I think the XML-based XFDF document is easier to understand. It should have the shape:

<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
  <fields>
    <field name='Name1'>
      <value><![CDATA[Value1]]></value>
    </field>
    <field name='Name2'>
      <value><![CDATA[Value2]]></value>
    </field>
    ...
  </fields>
</xfdf>

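Note: to turn on a checkbox, set the field value to 'Yes'. To turn it off, leave it blank. The "on" value can differ from form to form, so check the FDF generated in step 1 if 'Yes' doesn't work.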

3. Call pdftk using shell_exec with the right arguments. Something like the following chunk of PHP should do the trick:

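  // PDFTK_CMD holds the path to the pdftk binary; $fdf_path is the XFDF file from step 2.
  // escapeshellarg() keeps spaces or odd characters in the paths from breaking the command.
  $command = array(PDFTK_CMD, escapeshellarg($pdf_path), 'fill_form', escapeshellarg($fdf_path),
                   'output', escapeshellarg($out_path));
  if($flatten) {
    $command[] = 'flatten';
  }

  // shell_exec() needs a single string, so join the pieces with spaces.
  $cmd = join(' ', $command);
  $output = shell_exec($cmd);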

The optional 'flatten' argument controls whether the resulting form remains editable. Leave it off to keep the form tweakable by humans, or add it to produce what looks like a filled-in PDF whose fields can't be changed.
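
For step 2, here's a minimal sketch of how the XFDF might be generated in PHP. The build_xfdf() helper and the field names 'Name1' and 'Agree' are placeholders of my own, not anything that ships with pdftk, so swap in the real field names from the FDF you generated in step 1. It assumes PHP 5.4 or later (for ENT_XML1) and UTF-8 input.

```php
<?php
// Hypothetical helper: builds an XFDF document from an associative
// array of field name => value pairs.
function build_xfdf(array $fields)
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">' . "\n";
    $xml .= "  <fields>\n";
    foreach ($fields as $name => $value) {
        // Escape the field name so it is safe inside a quoted XML attribute.
        $safe_name = htmlspecialchars((string) $name, ENT_QUOTES | ENT_XML1, 'UTF-8');
        // A literal "]]>" would close the CDATA section early, so split it
        // across two adjacent CDATA sections.
        $safe_value = str_replace(']]>', ']]]]><![CDATA[>', (string) $value);
        $xml .= "    <field name=\"{$safe_name}\">\n";
        $xml .= "      <value><![CDATA[{$safe_value}]]></value>\n";
        $xml .= "    </field>\n";
    }
    $xml .= "  </fields>\n";
    $xml .= "</xfdf>\n";
    return $xml;
}

// Example usage with placeholder field names; writes the XFDF to a temp file.
$xfdf = build_xfdf(array('Name1' => 'Jane Doe', 'Agree' => 'Yes'));
$xfdf_path = tempnam(sys_get_temp_dir(), 'xfdf');
file_put_contents($xfdf_path, $xfdf);
```

The resulting temp file path is what gets handed to pdftk's fill_form operation in step 3. Wrapping values in CDATA mirrors the XFDF shape shown above; escaping the values with htmlspecialchars() instead would work just as well.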

While I love FPDF, I've got to admit that the above approach is the easier route to take if the document allows for it. Rather than starting with a blank canvas and trying to build a sophisticated PDF, you can hand-create one and just plug in the relevant bits on the fly. Definitely worth considering next time you've got a PDF to create dynamically.
