Tuesday, March 24, 2015

gdget: Inching Towards a more Linux Friendly Google Drive Solution

My default document strategy is to stash everything in Google Drive. If a client sends me a Word Doc, it's getting uploaded and converted to a Google Doc so we can both discuss it as well as have a reliable permanent record. This especially makes sense on Windows, where I gravitate towards browser-friendly solutions. On Linux, however, I'd prefer an emacs / command line friendly solution; and while Google Docs and Google Drive are accessible from Linux, the situation isn't exactly ideal.

What I truly want is ange-ftp for Google Docs. If you've never used emacs's remote file editing capability, you're missing out. It's like magic to pull in and edit a file on some random Linux box halfway across the world.

But I digress. While full editing support of Google Docs from within emacs would be ideal, a much simpler solution would suffice: if I could export a snapshot of various documents from Google Drive and store them locally, I could use emacs and other command line tools (I'm looking at you, grep) to quickly access them. In other words, I'd gladly trade the browser and rich formatting for a simple, yet occasionally out of date, text file. And for the times when I need rich formatting, I'm glad to access Google Drive using a browser.
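To make the grep payoff concrete, here's a minimal sketch of searching a directory of exported snapshots. The function name search_snapshots is made up for illustration, and the default directory assumes the snapshots live in $HOME/gdocs.

```shell
# List the snapshot files that mention a pattern (case-insensitive).
# search_snapshots is a hypothetical helper; the default directory
# is an assumption about where the exported text files are kept.
search_snapshots() {
  pattern=$1
  dir=${2:-$HOME/gdocs}
  grep -ril -- "$pattern" "$dir"
}

# Example: search_snapshots invoice
```

Since the snapshots are plain text, anything else that reads files (emacs, awk, diff) should work on them just as well.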

The good news is that the Google Drive API offers a download option which does this conversion to basic text. I just needed a tool that would execute this API. Google CL was a very promising initial solution. I was easily able to build a wrapper script around this command to pull down a subset of docs and store them locally. The big catch I found was that Google CL's title matching algorithm meant that attempts to download a specific file pulled down more than I wanted (trying to download "Foo" also pulled down "Foo Status" and "Foo Report"). More than that, it only worked with Google Docs; spreadsheets and other resources weren't accessible.

So I decided to try my hand at writing my own little client. The Google Drive API is actually quite HTTP GET friendly. The only catch (as usual) is authentication. Gone are the days when I could even consider being sloppy and putting my username and password in a shell script (I suppose a Thank You is in order for this). I was going to have to go the OAuth route.

I followed the instructions for an installed app (including setting up an app in my Google Developers Console) and while they initially looked daunting, everything came together without issue. I ended up hacking together three scripts:

  • gdauth - This script sets up the initial Google OAuth authentication. It also prints the current access_token on request, refreshing it first when the stored one is more than an hour old.
  • gdget - This is a very thin wrapper around curl. It essentially adds the correct Authorization header to an arbitrary request and lets curl do the rest of the work.
  • gdocs - A simple shell script for pulling down various docs and sheets and storing them locally. It works by associating various names (Foo) with document IDs (which are visible in the URL while editing a document within Google Docs).
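The name-to-ID association that gdocs relies on can be sketched with plain parameter expansion. This is an alternative to piping through cut, not a description of the script itself; the helper names and the sample IDs below are made up.

```shell
# Split a "LocalFileName:DocId" entry without spawning cut.
# entry_name and entry_id are hypothetical helpers for illustration.
entry_name() { printf '%s\n' "${1%%:*}"; }   # text before the first colon
entry_id()   { printf '%s\n' "${1#*:}"; }    # text after the first colon

# Sample entries with invented document IDs.
for entry in "StatusReport:abc123" "Budget:def456"; do
  echo "$(entry_name "$entry") -> $(entry_id "$entry")"
done
```

This assumes the local file name contains no colon.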

gdauth and gdget both take a -c flag which sets a context. The context allows you to access multiple Google accounts. For example, you may authenticate with your work account using -c work or your personal account as -c personal. This way you can access various Google Drive accounts with a minimum of hassle.
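A context boils down to a file name prefix: each one gets its own token files under $HOME/.gdauth. Here's a small sketch of that layout. The token_file helper and the GDAUTH_DIR override are my own inventions for illustration; only the <context>.access_token naming comes from the scripts.

```shell
# Print the access token file a given context would use.
# token_file and GDAUTH_DIR are hypothetical; the directory and the
# <context>.access_token naming mirror the layout used by gdauth.
token_file() {
  ctx=${1:-default}
  printf '%s/%s.access_token\n' "${GDAUTH_DIR:-$HOME/.gdauth}" "$ctx"
}

token_file work
token_file personal
```

Because contexts never share a file, initializing one account leaves the others untouched.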

You can use the low level tools like so:

$ gdauth -c blog init
https://accounts.google.com/ServiceLogin?....
Code? [enter code shown after visiting the above URL]
Done

$ gdget -c blog 'https://docs.google.com/a/ideas2executables.com/spreadsheets/d/1zjkVhrv-f0nvSOFNGuyEPJQSlSp-fXV66DYUFNJxGv8/export?exportFormat=csv' | head -5
,,,,
,#,Use,Reference,HTML
,1,for writing,,<li>for writing</li>
,2,as a straw,,<li>as a straw</li>
,3,"as a toy ""telescope"" for kids",,"<li>as a toy ""telescope"" for kids</li>"

For regular usage, I invoke gdocs pull from cron every few hours and I'm good to go.
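For the cron side, a sketch of what the entry might look like follows. It assumes the three scripts are installed in $HOME/bin, which cron's minimal PATH won't include by default; the cron_entry helper is hypothetical and only prints the line so it can be pasted in via crontab -e.

```shell
# Build a crontab line that runs "gdocs pull" every N hours.
# cron_entry is a hypothetical helper; $HOME/bin is an assumed
# install location. $HOME and $PATH are left unexpanded on purpose,
# so the shell that cron starts resolves them at run time.
cron_entry() {
  hours=${1:-3}
  printf '0 */%s * * * %s\n' "$hours" 'export PATH=$HOME/bin:$PATH; gdocs pull'
}

cron_entry 3
```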

Note that you're not limited to using the above tools to only download snapshots of files. You can access any part of the Google Drive REST API. For example:

$ gdget -c blog 'https://www.googleapis.com/drive/v2/about'
$ gdget -c blog 'https://www.googleapis.com/drive/v2/changes'
$ gdget -c blog 'https://www.googleapis.com/drive/v2/files/0B53sMu55xO1GN3ZObG1HdzhXdXM/children'
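These endpoints return JSON, so some way of picking out a field is handy. Below is a rough sketch in the same grep-and-sed spirit as the token parsing in gdauth. The json_field helper is hypothetical, the sample response is made up rather than captured from the API, and the approach assumes one "key": "value" pair per line with a plain alphanumeric key.

```shell
# Print the value of a string-valued field from pretty-printed JSON
# read on stdin. json_field is a hypothetical helper, not a real JSON
# parser: it expects one "key": "value" pair per line.
json_field() {
  sed -n 's/.*"'"$1"'": *"\([^"]*\)".*/\1/p'
}

# Invented sample response, for illustration only.
sample='{
 "kind": "drive#about",
 "name": "Example User",
 "quotaBytesUsed": "12345"
}'

printf '%s\n' "$sample" | json_field name
```

Piping real output through it would look like gdget -c blog 'https://www.googleapis.com/drive/v2/about' | json_field name. For nested or array-heavy responses a proper JSON tool is the safer choice.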

Below are all three scripts. Hopefully you'll find them inspirational and educational. Better yet, hopefully someone will point me to an emacs or command line solution that makes this script look like the toy that it is. For now though, it's one heck of a useful toy.

# ------------------------------------------------------------------------
# gdauth
# ------------------------------------------------------------------------
#!/bin/bash

##
## Authenticate with Google Drive
##
USAGE="`basename $0` [-h] [-c context] {init|token}"
CTX_DIR=$HOME/.gdauth
CLIENT_ID=__GET_FROM_API_CONSOLE__
CLIENT_SECRET=__GET_FROM_API_CONSOLE__

ctx=default

function usage {
  echo "Usage: `basename $0` [-h] [-c context] {init|token}"
  exit
}

function age {
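  # %Y is the modification time (when the token was written);
  # %X is the access time, which simply reading the file can bump.
  modified=`stat -c %Y $1`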
  now=`date +%s`
  expr $now - $modified
}

function refresh {
  refresh_token=`cat $CTX_DIR/$ctx.refresh_token`
  curl -si \
       -d client_id=$CLIENT_ID \
       -d client_secret=$CLIENT_SECRET \
       -d refresh_token=$refresh_token \
       -d grant_type=refresh_token \
       https://www.googleapis.com/oauth2/v3/token > $CTX_DIR/$ctx.refresh
  grep access_token $CTX_DIR/$ctx.refresh | sed -e 's/.*: "//' -e 's/",//' > $CTX_DIR/$ctx.access_token
}

while getopts :hc: opt ; do
  case $opt in
    c) ctx=$OPTARG ;;
    h) usage ;;
  esac
done
shift $(($OPTIND - 1))

cmd=$1 ; shift

mkdir -p $CTX_DIR
case $cmd in
  init)
    url=`curl -gsi \
         -d scope=https://www.googleapis.com/auth/drive \
         -d redirect_uri=urn:ietf:wg:oauth:2.0:oob \
         -d response_type=code \
         -d client_id=$CLIENT_ID \
         https://accounts.google.com/o/oauth2/auth | \
      grep Location: | \
      sed 's/Location: //'`
    echo $url | xclip -in -selection clipboard
    echo $url
    echo -n "Code? "
    read code
    curl -s \
         -d client_id=$CLIENT_ID \
         -d client_secret=$CLIENT_SECRET \
         -d code=$code \
         -d grant_type=authorization_code \
         -d redirect_uri=urn:ietf:wg:oauth:2.0:oob \
         https://www.googleapis.com/oauth2/v3/token > $CTX_DIR/$ctx.init
    grep access_token $CTX_DIR/$ctx.init | sed -e 's/.*: "//' -e 's/",//' > $CTX_DIR/$ctx.access_token
    grep refresh_token $CTX_DIR/$ctx.init | sed -e 's/.*: "//' -e 's/"//' > $CTX_DIR/$ctx.refresh_token
    echo "Done"
    ;;
  token)
    if [ ! -f $CTX_DIR/$ctx.access_token ] ; then
      echo "Unknown context: $ctx. Try init'ing first." >&2
      exit 1
    fi
    age=`age $CTX_DIR/$ctx.access_token`
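    # Refresh a little before the usual 3600 second expiry so a token
    # isn't handed out moments before it stops working.
    if [ $age -gt 3300 ] ; then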
      refresh
    fi
    cat $CTX_DIR/$ctx.access_token
    ;;
  *)
    usage
esac

# ------------------------------------------------------------------------
# gdget
# ------------------------------------------------------------------------

#!/bin/bash

##
## Run a GET request against an authorized Google
## URL.
##
CTX_DIR=$HOME/.gdauth
ctx=default

function usage {
  echo "Usage: `basename $0` [-c ctx] [-h] url"
  exit
}

while getopts :hc: opt ; do
  case $opt in
    c) ctx=$OPTARG ;;
    h) usage ;;
  esac
done
shift $(($OPTIND - 1))

if [ ! -f $CTX_DIR/$ctx.access_token ] ; then
  echo "Unknown context: $ctx. Try init'ing first." >&2
  exit 1
fi

token=`gdauth -c $ctx token`

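# "$@" keeps each argument intact; an unquoted $* would let the shell
# word-split and glob-expand URLs containing characters such as ?.
curl -s -H "Authorization: Bearer $token" "$@"
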
# ------------------------------------------------------------------------
# gdocs
# ------------------------------------------------------------------------
#!/bin/bash

##
## A tool for experimenting with Google Docs
##

GDOCS=$HOME/gdocs

function grab_all {
  url=$1    ; shift
  fmt=$1    ; shift
  for t in $* ; do
    name=`echo $t | cut -d: -f1`
    docid=`echo $t | cut -d: -f2`
    echo "$name.$fmt"
    # Quote the template and the result so ? and & in the URL stay literal
    u=`printf "$url" "$docid" "$fmt"`
    gdget -c i2x "$u" > "$GDOCS/$name.$fmt"
  done
}

##
## docs and sheets have the format:
##  LocalFileName:Docid
## Ex:
##  StatusReport:14SIesx827XPU4gF09zxRs9CJF3yz4bJRzWXu208266WPiUQyw
##  ...
##

docs=" ... "
sheets=" ... "

commands='pull'
cmd=$1 ; shift

case $cmd in
  pull)
    grab_all "https://docs.google.com/feeds/download/documents/export/Export?id=%s&exportFormat=%s" txt $docs
    grab_all "https://docs.google.com/a/ideas2executables.com/spreadsheets/d/%s/export?exportFormat=%s" csv $sheets
    ;;
  *)
    echo "Usage: `basename $0` {$commands}"
    exit
    ;;
esac

3 comments:

  1. have you looked at InSync? it works well. yes, it's not free. but it is not expensive either.

  2. Mark -

    Thanks for the lead, I'll check it out.

  3. For an open source Linux client for Google Drive, there's grive

    http://www.lbreda.com/grive/start
