Could a Podcast Make Itself?
I run a daily podcast called The Weather in Brooklyn. Every morning, you can expect a new episode to appear in its feed. There’s an audio logo at the beginning, then some music starts playing, and the host begins to speak. The host presents the weather forecast for that day in Brooklyn, reads the credits, and signs off. The music fades out, and the episode ends.
But the host isn’t me. It’s not a person at all. And I don’t exactly run this podcast. It’s my podcast, to be sure, but I don’t make it every day. In fact, I’m usually still half asleep when new episodes are created.
What I did do was write a computer program that generates the podcast. The program puts together a script. It converts that script into spoken word audio. It selects music, and mixes the music with the voice. It uploads the episode audio, and it updates the podcast feed with the new episode.
In 2018 I was on a break between jobs and wanted to build something for myself. I’d written a bunch of Twitter bots over the years, so I considered building another. But the heyday of Twitter bots was waning. In the wake of the 2016 elections, Twitter had changed its API rules and moderation practices, making them more hostile to bots, including silly, benign bots like mine. (Very recently, they’ve begun to change course.)
I searched for another platform that replicated what I loved about Twitter bots: the ability to make something fun for the world to interact with that wouldn’t require me to do visual design or build interfaces. And what technology was I spending even more time using than Twitter? Podcasts. So that’s how I became obsessed with the idea of creating a podcast bot, a fully automated podcast that would sustain itself after I set it up.
I did some searching for podcast bots that already existed. I didn’t find any. Exciting! Maybe mine could be one of the first. But at the same time, if there weren’t many people doing this then it probably wasn’t easy to do. Now I wanted to build this thing just to see if I could.
(There is also another plausible reason why no one had built a podcast bot, which is that it was a bad idea that no one thought was worth investing in. I guess I’ll find out if that’s the case when I publish this!)
I broke ground in March 2018. It was definitely more complicated than I anticipated, but I kept at it and got most of the way there. And then my new job started, I got busy, and in the summer of 2018 I put the project on hold…until November 2021.
At long last, it’s finally complete.
How it works
Here is a brief summary of how The Weather in Brooklyn application works.
First, the obligatory stack breakdown: The application (which I will call “TWiB” going forward) is written in Ruby. It relies on National Weather Service data for weather forecasts, uses Amazon Polly for speech synthesis, SoX for audio manipulation, and S3 to host both the private assets the app uses and the public assets that comprise the podcast (podcast art and the RSS feed). TWiB runs in a Docker container and is deployed on Render, a relatively new cloud service. Render has great support for cron jobs, which is why I chose it over Heroku. (Side note: This was my first experience with Render and it was excellent soup to nuts.)
The entire TWiB codebase exists to support a single “run” command. There is no web service you can make requests to. The whole thing is just a glorified script that gets run by the cron job every day at 6 am. This is the sequence of operations the “run” command sets in motion:
- Pull forecast data for Brooklyn from the National Weather Service API.
- Build a script for the podcast in SSML format by combining the forecast information with canned copy and AI-generated commentary.
- Convert the script to speech using Polly speech synthesis. The voice is chosen at random from Polly’s English-speaking voices that support the “neural” engine.
- Download audio assets from a private S3 bucket. TWiB selects a music track randomly from a folder of music, and downloads the audio logo.
- Mix the audio components—logo, music, and speech—into the final episode audio. This is accomplished via a bash script running SoX.
- Upload the episode audio to a public S3 bucket.
- Update the podcast RSS feed in the public bucket with the new episode data. This process is a little odd because there is no database of old episode data. The only record is the public RSS feed. TWiB starts to build a new feed by generating the outer <channel> nodes fresh. (This picks up any code changes that might have happened since yesterday.) It then downloads the existing public feed, splices out the feed’s <item> nodes (each of which represents an episode), and inserts them into the new feed, followed by the newly generated <item> node for that day’s episode.
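To make the script-building step a bit more concrete, here is a minimal sketch of how forecast data could be turned into SSML. This is not TWiB's actual code. The method names (forecast_periods, build_ssml, escape_xml) and the intro and outro copy are hypothetical, and the AI-generated commentary is left out. It assumes the National Weather Service forecast response carries a list under properties.periods whose entries have name and detailedForecast fields.

```ruby
require "json"

# Characters that must be escaped before text is placed inside SSML (which is XML).
XML_ESCAPES = {
  "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;", "'" => "&apos;"
}.freeze

def escape_xml(text)
  text.gsub(/[&<>"']/, XML_ESCAPES)
end

# Parse an NWS forecast response body and keep the first `count` periods
# (for example "Today" and "Tonight").
def forecast_periods(json_body, count: 2)
  JSON.parse(json_body).fetch("properties").fetch("periods").first(count)
end

# Combine canned copy (intro/outro) with the forecast text into one SSML document.
def build_ssml(periods, intro:, outro:)
  paragraphs = periods.map do |period|
    name = escape_xml(period.fetch("name"))
    forecast = escape_xml(period.fetch("detailedForecast"))
    "<p>#{name}: #{forecast}</p>"
  end

  [
    "<speak>",
    "<p>#{escape_xml(intro)}</p>",
    '<break time="500ms"/>',
    *paragraphs,
    '<break time="500ms"/>',
    "<p>#{escape_xml(outro)}</p>",
    "</speak>"
  ].join("\n")
end

if __FILE__ == $PROGRAM_NAME
  # Hand-written sample data standing in for a real API response.
  sample = JSON.generate(
    "properties" => {
      "periods" => [
        { "name" => "Today", "detailedForecast" => "Sunny, with a high near 70." },
        { "name" => "Tonight", "detailedForecast" => "Mostly clear, with a low around 55." }
      ]
    }
  )
  puts build_ssml(forecast_periods(sample),
                  intro: "Good morning. This is The Weather in Brooklyn.",
                  outro: "Thanks for listening.")
end
```

Escaping matters here because forecast text is free-form, and a stray ampersand or angle bracket would make the SSML invalid before it ever reaches the speech synthesizer.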
That’s it. The RSS feed is served statically from S3 to all the podcast services.
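The feed-splicing step is the oddest part, so here is a rough sketch of the idea using REXML, which ships with Ruby as a bundled gem. Again, this is an illustration under assumptions rather than the real implementation: rebuild_feed and the keys in new_item are made-up names, and the channel metadata is reduced to a single title.

```ruby
require "rexml/document"

# Build a fresh feed, carry over every <item> from the existing feed,
# then append one new <item> for today's episode.
def rebuild_feed(existing_feed_xml, channel_title:, new_item:)
  old_doc = REXML::Document.new(existing_feed_xml)
  old_items = REXML::XPath.match(old_doc, "/rss/channel/item")

  # The outer nodes are generated from scratch on every run,
  # so changes to channel metadata take effect immediately.
  new_doc = REXML::Document.new
  new_doc << REXML::XMLDecl.new("1.0", "UTF-8")
  rss = new_doc.add_element("rss", "version" => "2.0")
  channel = rss.add_element("channel")
  channel.add_element("title").text = channel_title

  # deep_clone copies each old episode so the parsed document is left untouched.
  old_items.each { |item| channel.add_element(item.deep_clone) }

  item = channel.add_element("item")
  item.add_element("title").text = new_item.fetch(:title)
  item.add_element("guid").text = new_item.fetch(:guid)
  item.add_element("enclosure",
                   "url" => new_item.fetch(:audio_url),
                   "length" => new_item.fetch(:audio_bytes).to_s,
                   "type" => "audio/mpeg")

  new_doc.to_s
end

if __FILE__ == $PROGRAM_NAME
  existing = <<~XML
    <rss version="2.0"><channel><title>Old title</title>
    <item><title>Episode 1</title><guid>ep-1</guid></item>
    </channel></rss>
  XML
  puts rebuild_feed(existing,
                    channel_title: "The Weather in Brooklyn",
                    new_item: { title: "Episode 2", guid: "ep-2",
                                audio_url: "https://example.com/ep-2.mp3",
                                audio_bytes: 1_234_567 })
end
```

A real podcast feed also uses namespaced tags (the iTunes ones, for example), so a fuller version would need to declare those namespaces on the new rss element before copying items across.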
Here’s a working list of features and extensions I’ve considered and might like to build. Please feel free to fork the project or open a pull request if you’re inspired!
- Starting with this application, it would only take a few hours to make, say, The Weather in Pittsburgh. In theory, one could make this process turnkey. A bot that makes podcast bots. One complication here is that you’d need new art, a new audio logo, and a new website for each location.
- The audio software I used (SoX) doesn’t have features that allow me to do “ducking”, which is the practice of lowering the volume of background music behind a voice. (If you don’t know what I mean, listen for this effect in any episode of This American Life.) I’d like to figure out a way to add ducking to make the spoken word easier to understand and to open up potential for more cacophonous music. I haven’t gone deep on this yet but it sounds like FFmpeg has some promising features.
- I’d love for the podcast art to change per episode and be generated programmatically based on the contents of the forecast. Joe Blubaugh did some awesome work in this direction back in 2018, but I wasn’t able to incorporate it.
- Heck, what if the music were also generated on the fly based on the forecast?
- It would be cool if someone designed proper podcast art instead of just me messing around in macOS Preview.
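On the ducking idea above, one direction that looks plausible is FFmpeg's sidechaincompress filter, which compresses one signal (the music) based on the level of another (the voice). The sketch below only builds the command as an argument list and does not run it. The method name ducking_command and the threshold, ratio, attack, and release values are untested guesses that would need tuning by ear, and none of this is part of TWiB today.

```ruby
# Build an ffmpeg command that lowers the music whenever the voice is present.
# Input 0 is the music; input 1 is the voice.
def ducking_command(music_path, voice_path, output_path)
  filter = [
    # Split the voice: one copy drives the compressor, one goes into the final mix.
    "[1:a]asplit=2[sidechain][voice]",
    # Compress the music using the voice as the sidechain signal.
    "[0:a][sidechain]sidechaincompress=threshold=0.05:ratio=8:attack=20:release=400[ducked]",
    # Mix the ducked music with the voice; the output lasts as long as the music.
    "[ducked][voice]amix=inputs=2:duration=first[out]"
  ].join(";")

  ["ffmpeg", "-i", music_path, "-i", voice_path,
   "-filter_complex", filter, "-map", "[out]", output_path]
end

if __FILE__ == $PROGRAM_NAME
  # Print the command rather than running it; pass the array to system(*cmd) to execute.
  puts ducking_command("music.mp3", "voice.mp3", "episode.mp3").join(" ")
end
```

Returning an array instead of a single string means file paths with spaces can be handed to system without any shell quoting.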
Lots of people contributed to this project.
The audio logo (that vocoder voice saying the podcast title at the top) is courtesy of my buddy, the talented composer Nate Heller. For a good time, follow Nate on Twitter.
Thanks to Jascha and Nate, and everyone else who provided feedback on the project.
Obligatory CTA: Subscribe, rate, and review!
Here are some of the places you can find The Weather in Brooklyn: