Wednesday, November 11, 2015

puburl: Instant Command Line Publishing of Any File

As I'm continuing to flesh out blurl, my command line blogging utility, I decided I might as well tackle the issue of embedding images. The solution I arrived at was to develop a small program that would take an arbitrary file (say, an image), publish it to a place on the web, and spit back out a URL. I could then use this URL in the markup that I hand to blurl. Of course, like any Unix'y command line tool, this little script would be useful for far more than just powering command line publishing.

One constraint that I wanted to put on this program was that it should be idempotent*. That is, uploading the same file multiple times would always result in the same URL.

After a bit of hacking, I ended up with: puburl. It works like so:

$ puburl P1790020.JPG
https://www.googledrive.com/host/0B53sMu55xO1GTGxreFg2SEoycjA
$ puburl P1790020.JPG
https://www.googledrive.com/host/0B53sMu55xO1GTGxreFg2SEoycjA
$ cp P1790020.JPG snow.jpg
$ puburl snow.jpg
https://www.googledrive.com/host/0B53sMu55xO1GTGxreFg2SEoycjA

Note that regardless of what the file is named, the URL to it always remains the same.

puburl is built on gdrive, a simple command line utility for working with Google Drive. gdrive runs as well on Windows as it does on Linux, and therefore so does puburl. To set up puburl you'll need to have gdrive authentication configured. This sounds more confusing than it is. Basically, take a moment or two to set up gdrive and puburl will be just about ready to go.

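puburl works by storing all its content in a Google Drive folder of your choice. It also renames every file you upload to the MD5 hash of the file's contents. That's how the idempotent nature is established: the same bytes always produce the same name, so puburl can look the file up by that name and reuse the existing entry instead of uploading again.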
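To make the hashing trick concrete, here's a minimal sketch of the naming step on its own. The `content_name` helper and the sample file contents are made up for illustration; the only real ingredient is `md5sum`, which the script below also relies on.

```shell
#!/bin/bash

# Hypothetical helper (not part of puburl): derive a name from a file's
# contents rather than from its filename.
content_name() {
  md5sum "$1" | cut -d ' ' -f1
}

# Two differently named files holding identical bytes.
tmpdir=$(mktemp -d)
printf 'snow day\n' > "$tmpdir/P1790020.JPG"
cp "$tmpdir/P1790020.JPG" "$tmpdir/snow.jpg"

# Both lines print the same 32-character hex string.
content_name "$tmpdir/P1790020.JPG"
content_name "$tmpdir/snow.jpg"
```

Because the name depends only on the bytes, a lookup by name is enough to tell whether a file has already been published.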

Finally, it's worth noting that the shape of the URL gdrive and other web sources suggest is:

 https://drive.google.com/uc?id=[FILE_ID]
or
 https://drive.google.com/uc?id=[FILE_ID]&export=download

However, I found this URL to be a bit too smart for its own good. In many cases Google adds a Content-Disposition header to the file being downloaded. This causes a Save As box to appear, which in a typical Drive context is a good thing. For my purposes, though, it was annoying; I wanted the content served inline.

Looking around, I found discussions about how you can host a website via Google Drive (who knew?). Taking a cue from these resources, I realized that the following URL structure dispensed with the Content-Disposition header.

 https://www.googledrive.com/host/[FILE_ID]
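If you want to check which behavior a given URL produces, one approach is to inspect the response headers. The sketch below is an assumption-laden illustration, not part of puburl: `forces_download` is a hypothetical helper that reads headers on stdin, and the headers fed to it here are made-up samples rather than a captured Google response.

```shell
#!/bin/bash

# Hypothetical helper: reads HTTP response headers on stdin and succeeds
# (exit status 0) if they include "Content-Disposition: attachment",
# the header value that makes browsers show a Save As box.
forces_download() {
  grep -qi '^content-disposition:[[:space:]]*attachment'
}

# Made-up sample headers standing in for a real response.
if printf 'Content-Type: image/jpeg\r\nContent-Disposition: attachment; filename="snow.jpg"\r\n' | forces_download; then
  echo "would trigger Save As"
else
  echo "served inline"
fi
```

Against a live URL, the same check could be driven by something like `curl -sI "$url" | forces_download`.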

This seems to be exactly what I was looking for. I've now got a simple Unix command that turns files into URLs. Here's the code for puburl. Enjoy!

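#!/bin/bash

# ID of the Google Drive folder that holds every published file.
PARENT_FOLDER_ID=0B53sMu55xO1GaEp1UGlPWWVVaTA

if [ -z "$PUBURL_GDRIVE_CONFIG" ]; then
  echo "Missing setting of PUBURL_GDRIVE_CONFIG - set this to the path to your ~/.gdrive auth dir." >&2
  exit 1
fi

# Left unquoted when used below so it splits into the command and its arguments.
GDRIVE="gdrive -c $PUBURL_GDRIVE_CONFIG"

if [ -z "$1" ] ; then
  echo "Usage: $(basename "$0") file1 [file2 ...]" >&2
  exit 1
fi

for f in "$@" ; do
  if [ ! -f "$f" ] ; then
    echo "Cowardly refusing to upload [$f], a file that doesn't exist" >&2
    continue
  fi

  # Name the remote file after its MD5 so identical content maps to one Drive entry.
  md5=$(md5sum "$f" | cut -d ' ' -f1)
  found=$($GDRIVE list -n -q "'$PARENT_FOLDER_ID' in parents and title = '$md5'")
  if [ -z "$found" ] ; then
    # Not published yet: upload from stdin, titled with the hash, and share it.
    id=$($GDRIVE upload -s -t "$md5" -p "$PARENT_FOLDER_ID" --share < "$f" | grep '^Id' | sed 's/Id: //')
  else
    # Already published: the first column of the listing is the file ID.
    # $found is deliberately unquoted so the listing collapses onto one line.
    id=$(echo $found | cut -d ' ' -f 1)
  fi
  echo "https://www.googledrive.com/host/$id"
done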

*Yes, I've been waiting years to use idempotent in a blog post. Whooo! Mission accomplished.
