Tuesday, January 20, 2026

Like X-Ray Vision For YouTube Content

While browsing the YouTube API, I came across a section on captions and it hit me: if I could slurp down a video's subtitles, I could feed them to an LLM like Gemini and get insights about the video without ever pressing play. I wondered if I could get answers to questions like: what's novel about this video? (besides a clever title and thumbnail) or does this video show signs of bias?

Digging into this, I quickly learned that the captions endpoint doesn't help solve this problem. It's designed for content creators to download the captions that they've created for their own videos. I wanted a way to grab subtitles for public videos.

But all is not lost: with additional research, I discovered that you can use yt-dlp to grab a video's subtitles. yt-dlp is a YouTube command line downloader, and one of its bonus features is downloading caption files. With this tool installed, my idea was trivial to put together.

First off, I added a -a subtitles option to youtubeassist, my command line tool for working with YouTube. I can now quickly grab the captions for any video I want. Here's this functionality in action:

  $ youtubeassist -a subtitles -i https://www.youtube.com/watch?v=dQw4w9WgXcQ | grep ^Never | head -4
  Never<00:00:43.440><c> going</c><00:00:43.520><c> to</c><00:00:43.760><c> give</c><00:00:44.079><c> you</c><00:00:44.480><c> up.</c><00:00:45.360><c> I'm</c><00:00:45.520><c> going</c><00:00:45.600><c> to</c>
  Never going to give you up. I'm going to
  Never going to give you up. I'm going to
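
The raw caption data above mixes inline timing tags with repeated plain-text lines, so it can help to tidy it before handing it to an LLM. Here's a minimal sketch of one way to do that. The clean_captions function is hypothetical and not part of youtubeassist; it assumes the captions arrive on stdin in the tagged format shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of youtubeassist): tidy caption text on stdin.
# 1. sed strips inline tags such as <00:00:43.440>, <c> and </c>.
# 2. awk drops blank lines and consecutive duplicate lines.
clean_captions() {
  sed -E 's/<[^>]*>//g' | awk 'NF && $0 != prev { print; prev = $0 }'
}

# Usage (not run here):
#   youtubeassist -a subtitles -i "$url" | clean_captions
```

Only consecutive duplicates are removed, so a chorus that legitimately repeats later in the video is kept.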

Next, I tweaked geminiassist, my command line tool for working with Gemini. I added a -y video-id-or-url option. When -y is provided, the corresponding caption data is downloaded and included as an attachment with the prompt that's sent to Gemini. I was amazed how quickly all this came together.
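
Accepting either a video ID or a URL means the input has to be normalized somewhere. The sketch below shows one way that could be done. The video_id function is an assumption for illustration, not geminiassist's actual code, and it only handles the common watch?v= and youtu.be URL shapes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reduce a YouTube URL or bare ID to just the video ID.
video_id() {
  local input="$1"
  case "$input" in
    *youtu.be/*) input="${input##*youtu.be/}" ;;  # short links
    *\?v=*)      input="${input#*\?v=}" ;;        # watch?v=ID
    *\&v=*)      input="${input#*\&v=}" ;;        # watch?foo=bar&v=ID
  esac
  # Drop any trailing query parameters or fragment.
  input="${input%%[\&?#]*}"
  printf '%s\n' "$input"
}
```

A bare ID passes through unchanged, including IDs that start with a dash such as -7sDn_B2TbE.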

Consider this video: This Journalling System Solved my Shower Thought Problem. Is it worth my time, or just more of the same? Now I can ask Gemini.

  $ geminiassist -a ask -y ABSV0GzSnS0 -p "Can you please look over the subtitles of this video. Can you summarize the video in a few bullet points. Give me a few bullet points why I should watch the video and a few bullet points why I should skip it. Generate results in HTML so I can embed them in a blog post."

Video Summary

  • The video outlines the creator's journaling system for turning ideas into reality, emphasizing the importance of tools and systems that reduce friction and encourage consistent creative action.
  • The creator recounts how losing an initial idea (a toothbrush with a flossing compartment) spurred the development of his current system.
  • Key components include a pocket journal (Moleskine) for capturing on-the-go ideas and a handmade pencil case filled with useful tools (markers, scissors, tape, ruler, etc.).
  • Digital tools are also key. He has a dedicated notebook, and mini printer to integrate photos and images into his journals, and a digital app (Notion) for organizing and refining ideas and creating a production pipeline.
  • The system prioritizes intentionality and visual representation over convenience, aiming for a harmonious process between idea and outcome.

Why You Should Watch This Video

  • Inspiration for creative systems: The video offers a detailed look into a creative professional's workflow.
  • Practical tool recommendations: Discover specific tools and applications (e.g., Moleskine, Jetream, Notion, mini printer) that enhance the journaling and creative process.
  • Emphasis on consistency: Learn the importance of building a routine around creative work, even if it's not always convenient.
  • Visual learning: The creator's emphasis on visual journaling is beneficial for visual learners and anyone seeking alternative methods of idea capture and development.

Why You Might Skip This Video

  • Not focused on digital tools only: The video has a large focus on a more hand-written and craft styled system. If you prefer pure digital solutions, you might find the physical tools section less relevant.
  • Individualized System: This is _one_ system, it does not mean that it will work for you and may require extra effort and experimentation to get your own system up.
  • Detailed Equipment List: The video goes into specific detail about all the tools which can be tedious to hear about if you're not interested in purchasing some of the tools the creator uses.
  • Personal Preference: The creator is very passionate about his current system, while passion can be great, the heavy emphasis on his personal system may not translate to value for all viewers.

Or, here's a provocative title that I'd normally skip as clickbait: America Just Entered Early-Stage Revolution. But now I can ask Gemini for its opinion before I waste my time and muddy up my feed.

  $ geminiassist -a ask -y '-7sDn_B2TbE' -p "Please review the subtitles for this video. Can you summarize it? Does the video show signs of bias? Is this clickbait or reasoned analysis? Provide your response in HTML bullet form so I can embed this easily in a blog post."
  • Summary: The video analyzes an incident in Minneapolis where Jake Lang, a known January 6th rioter and right-wing agitator, encountered resistance at a protest and then falsely claimed to be stabbed. The video argues that Lang's experience signifies a shift in power dynamics, where right-wing intimidation tactics are becoming less effective as people are physically pushing back against bigotry. It contrasts this with the Charlottesville events, when that intimidation was much more prevalent. The video concludes with cautious optimism, suggesting this could be the beginning of a real resistance movement, but warns that sustained, national, and habitual action is needed to bring on the revolution.
  • Signs of Bias: The video exhibits a clear bias against Jake Lang and the right-wing movement. The language used to describe Lang ("clown," "professional agitator," "racist white grievance movement") is strongly negative. The video frames the confrontation as a positive instance of physical resistance against "fascism". There are other instances of loaded language, such as calling Trump supporters "fascist" and "toxic."
  • Clickbait vs. Reasoned Analysis: This leans more towards reasoned analysis than clickbait, though elements of both are present. The video presents a specific argument, supports it with examples and comparisons (Minneapolis vs. Charlottesville), and draws a conclusion about broader social trends. However, the framing of the title and the language used likely aim to attract a specific audience and provoke an emotional response.

As with all things LLM, an improved prompt would likely yield improved responses. Still, even these hastily prepared queries show promise.

I'm not sure how I'll integrate this into my YouTube workflow, but I'm definitely psyched to add this tool to my toolbox.

Friday, January 16, 2026

In Praise Of msmtp - Painless Email Setup For Your AWS Linux Box

I recently kicked off a nano-sized AWS server to run a cron job. Out of the box, the server doesn't support sending email.

Back in the day, a local instance of sendmail would gladly deliver email on your behalf. This made sending email easy. A bit too easy, in fact, as it opened the door for spammers to do their thing. The feature that made this possible, and turned into a nightmare, was that email systems would simply trust the sender.

These days, you can still have sendmail deliver mail on your behalf, but most servers will reject it as they have no reason to trust the identity of your server.

In the AWS world, the next step is usually to turn to SES - Amazon's Simple Email Service. That is, you'd instruct the Linux box to deliver the messages to SES, and SES would manage matters of trust.

Using SES isn't hard, but I was curious if there was a lighter weight solution. A little searching turned up: msmtp.

msmtp is a lightweight mail delivery agent that will securely hand off messages to another server for delivery. In my case, it will deliver messages to my Google Workspace account, which will forward the message on to the recipient. Configuration is simple and the software has few dependencies.

For a full walkthrough of how to set this up, check out this informative post: Send emails from your terminal with msmtp. For a more terse setup description, check out this Gemini Conversation. To skip all the fluff and see the install recipe, check out the code below.

If you're on a Linux box and just want email to work, msmtp is your friend.

Setup Recipe

# 1. Install msmtp and the mail interface
sudo apt-get update && sudo apt-get install -y msmtp msmtp-mta bsd-mailx

# 2. Generate a password using Google's App Password facility

# 3. Create the global configuration file
cat <<EOF | sudo tee /etc/msmtprc
defaults
auth           on
tls            on
tls_trust_file /etc/ssl/certs/ca-certificates.crt
logfile        /var/log/msmtp.log

account        gmail
host           smtp.gmail.com
port           587
from           ben@super-cool-company.com
user           ben@super-cool-company.com
password       super-secret-unguessable-password

account default : gmail
EOF

# 4. Setup log file and secure the configuration (600 permissions are required)
sudo touch /var/log/msmtp.log && sudo chmod 666 /var/log/msmtp.log
sudo chmod 600 /etc/msmtprc

# 5. Link the 'mail' command to msmtp
echo "set sendmail=/usr/bin/msmtp" | sudo tee -a /etc/mail.rc

# 6. Send test email
echo "How's it going?" | mail -s "Checking in" bob@aol.com
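
# Optional: a minimal sketch of calling msmtp directly from a script, without
# the 'mail' command. compose_message is a hypothetical helper, not part of
# msmtp; it just prints headers, a blank line, and the body.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a minimal email message to stdout.
compose_message() {
  local to="$1" subject="$2" body="$3"
  printf 'To: %s\nSubject: %s\n\n%s\n' "$to" "$subject" "$body"
}

# Usage (not run here). With -t, msmtp reads the recipients from the headers:
#   compose_message bob@aol.com "Checking in" "How's it going?" | msmtp -t
```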

Wednesday, January 14, 2026

A Texas Sized Mystery: Solved

On a flight from New York City's LaGuardia to Washington's Reagan Airport I captured some photos of this curious sight:

Given its size and proximity to a main road, I figured I'd quickly identify it. But alas, trawling through Google Maps, chatting with ChatGPT and searching along FlightAware flight paths turned up nothing.

I finally gave up, and simply pronounced: I'd have to solve the mystery on our next flight from LGA to DCA.

My Dad wasn't having it. He took one look at the photo and explained, "you're looking for a quarry. It should be easy to find."

And a few days later he texted me with the answer!

I'd taken aerial photos of the Martin Marietta - Texas Quarry:

At first, I was like, "that's close Dad, but that can't be it. My photo includes a sort of racetrack type structure in the foreground, and I don't see that in the image."

To which he replied, and I'm paraphrasing here: zoom out.

Whoa! The "racetrack" was in fact the Maryland State Fair Grounds and was an exact match for the feature I was looking at.

He'd done it! He'd solved the mystery! Thanks, Dad!

When I asked him how he did it, he explained that he asked Gemini for a "list of quarries that can be seen from a plane on a flight from DCA to LGA" and then ran them down in Google Maps. Inspired by his effort, I tried again with Gemini.

I tried the prompt: "I took this photo on a flight from LGA to DCA. It appears to be a quarry. Any idea which quarry it is?" The first time I asked, I got a false positive. Trying again with the exact same image and prompt found it!

I can't believe the answer was there all along, I just had to know how to ask. I suppose this is the quintessential challenge when working with AI: knowing how to ask.

With the quarry identified, I was curious: (a) where did it get its name from, and (b) is it significant?

The name, Martin Marietta - Texas Quarry, is easy to unpack. The quarry is currently owned by the company Martin Marietta. The "Texas Quarry" part comes from the fact that it's located in a Maryland village that used to be known as Texas. This location name has been in use since at least 1847.

As for (b), the quarry does appear to be significant. From what I can tell, marble from this site was used in the Washington Monument as well as other local sites. Here's how the USGS describes the marble that was used in the construction of the Washington Monument (emphasis is mine):

Three different kinds of marble were used in the construction of the Washington Monument, which was delayed by several problems. ... The first 152 feet of the monument, built between 1848 and 1854, is faced with marble from Texas, Md. Work stopped when funds ran out.

That sounds like my quarry, or at least one in the vicinity.

Another hint at the significance of this area is this article, published in 1874. It provides extensive details about the "Beaverdam Quarry", explaining that it was also known as the Cockeysville Quarries. The article proudly notes that "the National Capitol was built of marble from this quarry." It also points out that the "post-office building of Washington was also constructed of the same material."

Not only is the Texas Quarry located in Cockeysville, MD, but Beaverdam Road runs adjacent to the property. So while I can't say for sure the article is talking about the same quarry I photographed, it certainly seems to be in the same neighborhood.

What a gratifying mystery to solve! Thanks, Dad!

Tuesday, January 13, 2026

Building A Better Digital Bat Signal - Part 3 - The Phone

This is part 3 of a 3 part series on building an enhanced notification system for when I have an unread urgent email. Part 1 describes the problem and Part 2 describes the server side code needed to power my solution. Part 3 implements the visual alert by having my phone react to messages sent by the server. Let's jump in.

Tasker, combined with AutoRemote, provides an elegant solution for responding to the inbox status messages that the server is issuing. Two short Tasks and two short Profiles are all it takes to turn these messages into a hard-to-miss visual alert.

The main task is: SetInboxStatus; here's how it works:

%new_status is provided via a parameter to the Task. If the new status is unknown then the Task quits. In this case, we can't make any assumptions about what the status of my inbox is.

Otherwise, the task checks to see if %new_status matches the existing globally set status %INBOX_STATUS. If it does not, then we know we're dealing with a status-change. In this case, I invoke the SetWallpaper action with an image path that includes %new_status.

In other words, if the old status was ok and the new status is wife, then I set the wallpaper of my phone to: /storage/emulated/0/Tasker/images/inbox_status/wife.jpg.

Regardless of whether the status has changed, I note the time the status was set in %INBOX_STATUS_LAST_SET. That will come in handy in checking to see if it's been too long since my last status update.

The if condition in SetInboxStatus ensures that the phone only sets the background, a relatively expensive operation, if there's a change. This makes it safe to call SetInboxStatus with a non-changing status as often as the system wants.
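
For readers who don't use Tasker, the SetInboxStatus logic can be modeled in a few lines of shell. This is only a sketch: the real implementation is a Tasker Task, the echo stands in for the SetWallpaper action, and updating INBOX_STATUS on a change is an assumption about how the Task tracks state.

```shell
#!/usr/bin/env bash
# Shell model of the Tasker Task; these globals mirror the Tasker variables.
INBOX_STATUS="ok"
INBOX_STATUS_LAST_SET=0

set_inbox_status() {
  local new_status="$1"

  # An unknown status tells us nothing, so leave everything untouched.
  if [ "$new_status" = "unknown" ]; then
    return 0
  fi

  # Only do the expensive wallpaper change when the status actually changes.
  if [ "$new_status" != "$INBOX_STATUS" ]; then
    echo "wallpaper -> /storage/emulated/0/Tasker/images/inbox_status/$new_status.jpg"
    INBOX_STATUS="$new_status"
  fi

  # Always record when a status was last received (seconds since the epoch).
  INBOX_STATUS_LAST_SET=$(date +%s)
}
```

Calling set_inbox_status wife twice in a row prints the wallpaper line only once, which is what makes repeated calls cheap.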

The next Task is InboxStatus Watchdog. This task keeps an eye on %INBOX_STATUS_LAST_SET.

%TIMES minus this value is the number of seconds that have passed since a status was last set. If that difference grows larger than 1230 seconds (20.5 minutes), the watchdog kicks in and sets %INBOX_STATUS and the background to unknown.
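
The watchdog's staleness test comes down to a single subtraction: the current time minus the time the status was last set. Here's a minimal shell sketch of that arithmetic. The is_stale function and STALE_AFTER name are hypothetical, and the real check lives in a Tasker Task.

```shell
#!/usr/bin/env bash
# Seconds without an update before the status is considered stale (20.5 min).
STALE_AFTER=1230

# Succeeds (exit 0) when more than STALE_AFTER seconds separate the two
# timestamps. Both arguments are seconds since the epoch.
is_stale() {
  local now="$1" last_set="$2"
  [ $(( now - last_set )) -gt "$STALE_AFTER" ]
}

# Usage (not run here):
#   if is_stale "$(date +%s)" "$INBOX_STATUS_LAST_SET"; then echo unknown; fi
```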

Next up, these Tasks need to be called by Tasker. This happens through two different Profiles.

The first profile depends on AutoRemote and looks for incoming messages with the regular expression ^SetInboxStatus.

There's a single action associated with this profile: run the task SetInboxStatus with the first parameter set to %arcomm. %arcomm is a magic variable that will be set to the right-hand side of the delimited message sent via AutoRemote. The server is sending messages in the format: SetInboxStatus=:=wife. In this case, %arcomm would be set to wife.
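
To make the message format concrete, here's a small shell sketch that splits a message on the =:= delimiter the same way the description above does. AutoRemote does this splitting itself on the phone; the parse_message function is purely illustrative.

```shell
#!/usr/bin/env bash
# Illustrative only: split "left=:=right" into its two halves.
parse_message() {
  local message="$1"
  local left="${message%%=:=*}"   # everything before the first =:=
  local right="${message#*=:=}"   # everything after the first =:=
  printf '%s %s\n' "$left" "$right"
}
```

For the message SetInboxStatus=:=wife, the right-hand side is wife, which corresponds to the value the profile receives in %arcomm.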

The watchdog profile is even simpler: it's set up as a Time profile that runs from 12am to 11:50pm and repeats every 5 minutes. Every time this profile runs, it invokes the task InboxStatus Watchdog.

All that remains is to select the appropriate images for the different statuses. For the ok status I chose a standard background image. For the other statuses, I asked Gemini for an assist and created alert-specific images.

When all is OK, my phone's home screen looks like so:

With an urgent message in my inbox, the background changes to:

The lock screen is also updated:

With nearly any interaction with my phone, an urgent alert will be painfully obvious.

I've had this system running for a little over a week now, including while we were on vacation in Hawai'i. While I have yet to receive an actual urgent message (hurray!), I did have the unknown status kick in when we were hiking in remote areas and my phone had no signal. When we returned to civilization, the ok status background would kick in. This gave me a delightful sense of assurance that even if I wasn't closely watching my email, a script I developed was.

You can grab the Tasker code for all of the above here. I'd still love to cook up a hardware based solution, but in the meantime, this Google API, AutoRemote, Tasker solution is working exceptionally well. You should give it a try!

Monday, January 12, 2026

Building A Better Digital Bat Signal - Part 2 - The Server

I want my phone's background to automatically update to a dramatic looking image when there's an urgent email in my inbox. To implement this, I need to scan my Gmail inbox for such a message and report this to my phone. I've implemented this behavior using a bit of shell scripting on a nano-sized AWS Linux server.

I can access my inbox via the Gmail API. Years ago, I wrote gmail_tool to manage an overrun Gmail inbox from the command line. I recently implemented a cleaned up version of this script, as gmailassist.

gmailassist lets me run a search for matching Gmail threads. Consider this search of my SPAM folder:

  $ gmailassist -p i2x -a threads -q 'label:spam' | head -3
  19baeee1f181be4b|沉默。 娘亲看起来好难过,姐姐也闷闷不乐的,可是为什么祖母跟别人都笑得这么开心涅。 “笑儿,娘舍不得你。”回屋的路上,连翘终是忍不住落下泪来,如果是京中那还好些,偏偏要去无双辣那么远的地方,不""知谷那无双云爷是什么样的人,会不会对笑儿好,嫁过去之后笑儿会不会受欺负。 越想,连翘心里越是难过,仿风已久可以看见慕容笑笑以后悲惨的生活。 “煮母,小姐受媓上赐婚,是件无上荣耀的事情,
  19ba891f8fae60ae|Standard Capital Can Help Apply for a Term Loan Now As a small business owner, you understand that having access to flexible and affordable funding is essential for your success. Standard Capital is
  19ba3a0df841439c|Confirmation for your recent purchase is attached,. 2026-01-10-Call Our Helpline: +1(983) 220-2512

With this command line tool in place, it's straightforward to generate a single 'inbox-status'. Here's the current logic for doing so:

  checks="work_urgent|work|label:inbox_label:unread_subject:urgent"
  checks="$checks personal_urgent|personal|label:inbox_label:unread_subject:urgent_category:primary"
  checks="$checks wife|personal|label:inbox_label:unread_from:wife@gmail.com"
  checks="$checks ok|work|ben" # [A]
  now=$(date +%Y-%m-%d)

  for check in $checks ; do
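    # Each check is status|profile|query, with '_' standing in for spaces.
    status=$(echo "$check" | cut -d'|' -f1)
    profile=$(echo "$check" | cut -d'|' -f2)
    query=$(echo "$check" | cut -d'|' -f3 | tr '_' ' ')

    # Run the search and count the matching threads (one per line).
    gmailassist -a threads -q "$query" -p "$profile" > "$TMP/inbox-status.hits"
    hits=$(wc -l < "$TMP/inbox-status.hits")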
    if [ "$hits" -gt 0 ] ; then
      echo "$now $status" >> $TMP/inbox-status.log
      if [ -n "$V" ] ; then
        cat $TMP/inbox-status.hits
      else
        echo $status
      fi
      exit
    fi
  done

  echo "$now unknown" >> $TMP/inbox-status.log
  echo "unknown" # [B]
  ;;

This bit of code loops through each $checks. Each 'check' has the format:

  • Status - the value that will be printed to the screen if the Gmail search returns any rows.
  • Profile - the Gmail profile to search. I've set up work and personal so I can get alerts from both my personal and work email.
  • Query - The search to use against Gmail. This looks odd because spaces are replaced with _. Other than this, however, this is a normal Gmail search.
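
As an aside, the three fields can also be pulled apart without cut. The sketch below is an alternative, not the script's actual code: it uses bash's read with IFS set to the pipe character, and the parse_check name is hypothetical.

```shell
#!/usr/bin/env bash
# Alternative parsing sketch: split "status|profile|query" using IFS and read,
# then turn the underscores in the query back into spaces.
parse_check() {
  local status profile query
  IFS='|' read -r status profile query <<< "$1"
  printf 'status=%s profile=%s query=%s\n' "$status" "$profile" "${query//_/ }"
}
```

As with the tr version, every underscore becomes a space, so a search term that needs a literal underscore can't be expressed in this format.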

In the first version of inbox-status I neglected to include the last check, [A]. I assumed that if none of the queries returned any rows, then all must be ok. The problem is, occasionally the API glitches. In this case, the searches return no rows not because nothing matches, but because the API has failed. By setting up [A], the inbox status will only be ok if a search actively returns rows.

[B] is also essential: it says that if none of the checks return any rows, then the status is unknown. The phone skips this status, so when the API is down, the results are ignored.
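
The effect of [A] and [B] is easier to see with the Gmail search swapped out for a stub. In this sketch, search is a hypothetical stand-in for gmailassist, and the API_UP variable simulates an outage by making every query return no rows.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for gmailassist: prints one line per matching thread.
# Set API_UP=0 to simulate an API outage, where every query returns no rows.
search() {
  [ "${API_UP:-1}" = "1" ] || return 0
  case "$1" in
    urgent) ;;                            # no urgent mail right now
    any)    echo "19baeee1|some thread" ;;
  esac
}

inbox_status() {
  local check status query
  # The final "ok" check plays the role of [A]: it must actively return rows.
  for check in "urgent|urgent" "ok|any"; do
    status="${check%%|*}"
    query="${check#*|}"
    if [ -n "$(search "$query")" ]; then
      echo "$status"
      return 0
    fi
  done
  echo "unknown"                          # the role of [B]
}
```

With the stub API up, inbox_status prints ok; with API_UP=0 it prints unknown rather than a falsely reassuring ok.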

Finally, I added support for a -v option when querying the inbox status. This reports the matching threads that correspond to the status. This is useful for quickly seeing which message has triggered the inbox status logic.

With a script to derive my inbox status, all that was left was to deliver this information to my phone. I make use of AutoRemote to accomplish this.

AutoRemote is a magic service that allows information to be delivered to an Android phone via a web request.

In my Linux server's crontab I have:

  */10 * * * * i2xassist -a inbox-status | andsend SetInboxStatus

andsend is little more than a simple wrapper around curl. It invokes the URL:

  $ status=$(i2xassist -a inbox-status)
  $ curl -s https://autoremotejoaomgcd.appspot.com/sendmessage \
    -d key=YOUR_AUTO_REMOTE_KEY \
    -d "message=SetInboxStatus=:=$status"
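
Since the crontab entry pipes the status into andsend, the wrapper presumably reads it from stdin. Here's a minimal sketch of how such a wrapper might build its message. The autoremote_message function and the AUTOREMOTE_KEY variable are hypothetical names, not the actual andsend source.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read one line from stdin and prefix it with the
# command name and the =:= delimiter that AutoRemote splits on.
autoremote_message() {
  local name="$1" payload
  IFS= read -r payload || true   # tolerate input without a trailing newline
  printf '%s=:=%s\n' "$name" "$payload"
}

# Usage (not run here); AUTOREMOTE_KEY is a placeholder for your own key:
#   message=$(i2xassist -a inbox-status | autoremote_message SetInboxStatus)
#   curl -s https://autoremotejoaomgcd.appspot.com/sendmessage \
#     -d "key=$AUTOREMOTE_KEY" --data-urlencode "message=$message"
```

Using --data-urlencode for the message keeps things safe if a status ever contains spaces or other characters that need escaping.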

Every 10 minutes, my phone receives a SetInboxStatus message with the current inbox status. Up next is to have the phone react to this. We're almost there!