How I Turned My Substack Newsletter Archive Into a Searchable AI Knowledge Base
Writers: Don’t Let Your Newsletter Die in the Archives
I’ve written more than 150 issues of my newsletter over the past three years.
That’s hundreds of hours of work and thousands of words.
But here’s the problem:
Most of it was locked away in Substack, nearly impossible to access unless I wanted to scroll, search, and dig around manually.
I wanted a better way.
Specifically, I wanted to get all my newsletters into Google’s NotebookLM, an AI tool that lets you upload documents and then search, summarize, and generate new content from them.
The idea was simple:
Turn my writing into a searchable knowledge base I could use every day.
It turned out to be more complicated than I thought.
Substack doesn’t make it easy.
NotebookLM has limits.
And I’m not a coder.
But after some trial and error (and more than a little help from ChatGPT), I figured out a repeatable process.
Here’s how it worked.
Exporting from Substack
The first step was to get my data out of Substack. There’s a way to do this, but it’s not obvious.
You have to go to your Settings, scroll all the way down to Advanced, and click Import/Export. From there you request a new export, and after a few minutes Substack emails you a zip file.
Inside that file is everything you’ve ever done on Substack: posts, newsletters, and stats.
It’s messy.
The newsletters themselves are buried in a folder called “posts” and each one is saved as a Chrome HTML document.
Here’s the problem: NotebookLM allows you to upload PDFs, but not HTML documents.
I knew I would have to convert the HTML docs to PDFs.
So I sorted the files by Type, copied just the HTML documents, and moved them into their own folder in another spot on my computer.
Now I had all 150+ newsletters as individual HTML files, ready for the next step.
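If you’d rather not sort and copy by hand, the same step can be sketched in a few lines of Python. This is not what I did, just a minimal sketch under some assumptions: the export zip contains a “posts” folder with .html files in it (as described above), and the function name extract_post_html and the file names in the example are placeholders I made up.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_post_html(export_zip, dest_dir):
    """Copy every .html file found under a 'posts' folder in the zip into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    with zipfile.ZipFile(export_zip) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            member = PurePosixPath(info.filename)
            # Keep only HTML files that sit somewhere inside a "posts" folder.
            if member.suffix.lower() != ".html" or "posts" not in member.parts[:-1]:
                continue
            target = dest / member.name  # flatten: drop the folder structure
            target.write_bytes(zf.read(info))
            copied.append(target)
    return sorted(copied)


# Example with placeholder names (point these at your own files):
# files = extract_post_html("export.zip", "newsletters_html")
# print(f"Copied {len(files)} HTML files")
```

Everything else in the export (stats, spreadsheets, and so on) is simply skipped, so the new folder holds only the newsletters.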
Converting HTML to PDFs
I said this earlier, but it’s worth repeating: NotebookLM accepts PDFs, not HTML docs.
That meant every single file had to be converted.
I could have opened each one manually and saved it as a PDF, but with more than 150 newsletters, that wasn’t an option.
I needed automation.
So I asked ChatGPT for help. Here’s what I asked:
“how hard would it be to build an automation that opens a Chrome HTML file, saves it as a PDF file with the same name as the HTML filename and saves it in a new folder?”
Here’s the response:
“Short answer: not hard. It’s a small automation (10–30 lines) if you already have Chrome installed. Below are three solid ways—pick your platform.”
Then it gave me three options for different setups: Windows, macOS, and Python.
I’m on Windows, so I used that script:
If you’re going to do this, I’d suggest you interact with ChatGPT yourself and let it create the script for you.
You can get the script that will work for your system and figure out any requirements based on your setup.
I’m showing the images here so you can see it’s a pretty straightforward script that even non-programmers can handle:
This script requires you to tell the program which folder your HTML documents are in. When I interacted with ChatGPT, it asked me for the folder path and then updated the script.
Once you get your script updated, you save it as a file named “html-to-pdf.ps1” in the same folder as all your HTML documents. Don’t save it as a .txt file or any other type of file. Pick “All Documents” as the file type and hit Save.
Then you open PowerShell, which is already on your computer if you have Windows, and run the command ChatGPT gives you to run the script.
If you don’t know where to find PowerShell, type “PowerShell” into the search bar and it will pop up. It looks like this:
Copy the command ChatGPT gives you to run the script. Notice how the .ps1 file I saved is in the same folder where the HTML newsletters are stored (green text). Click Copy.
Go to PowerShell and Paste the code you Copied from ChatGPT (Ctrl-V).
Hit Enter.
The script will run and give you all sorts of warnings. I didn’t know what these were or whether the script was working, but I waited to see what would happen.
It took a few minutes to run. But when it was done, I had all my newsletters in a new folder called “PDF” in the same folder where all the HTML documents had been located. The file Type still says Chrome HTML Document, but when you open these, they open as PDFs. More importantly, I was able to upload them to NotebookLM as PDFs.
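For readers who would pick the Python option instead, here is a minimal sketch of the same idea. It is not the PowerShell script I ran. It assumes Chrome is installed at the default Windows location shown in CHROME (change that path for your machine), and it relies on Chrome’s headless mode and its --print-to-pdf flag. The function names are my own placeholders.

```python
import subprocess
from pathlib import Path

# Assumption: default Chrome location on Windows; change it for your machine.
CHROME = r"C:\Program Files\Google\Chrome\Application\chrome.exe"


def build_command(chrome, html_file, pdf_file):
    """Build the headless Chrome command that prints one HTML file to PDF."""
    return [
        str(chrome),
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={Path(pdf_file).absolute()}",
        Path(html_file).absolute().as_uri(),  # file:// address of the HTML file
    ]


def convert_folder(html_dir, chrome=CHROME):
    """Convert every .html file in html_dir into a PDF inside html_dir/PDF."""
    html_dir = Path(html_dir).resolve()
    pdf_dir = html_dir / "PDF"
    pdf_dir.mkdir(exist_ok=True)
    for html_file in sorted(html_dir.glob("*.html")):
        # Same name as the HTML file, with .pdf on the end instead of .html.
        pdf_file = pdf_dir / (html_file.stem + ".pdf")
        subprocess.run(build_command(chrome, html_file, pdf_file), check=True)
    return pdf_dir


# Example with a placeholder folder name:
# convert_folder("newsletters_html")
```

Like the script I used, this puts the PDFs in a “PDF” folder next to the HTML files and keeps each file’s original name.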
Merging PDFs into Fewer Files
At this point, I had 150+ PDF files. The next obstacle: NotebookLM’s free version only allows 50 files per notebook. There are options to upgrade at a cost, but I stayed with the free version.
That meant I couldn’t just upload everything as-is. I had to consolidate PDFs.
I used a tool called pdfFiller (because I happened to have a subscription I forgot to cancel during the free trial).
You could use Adobe Acrobat, SmallPDF, PDF2Go or any other merging tool. Every tool works differently, but the process was simple: take 10 PDFs at a time, merge them into one larger file, and export as a new combined file.
This should be pretty straightforward in whatever PDF tool you use so I’m not going into the details here.
When I was done, I had reduced 150+ PDFs down to about 17 combined files. Well under the 50-file cap.
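If you’d rather script the merge than click through a PDF tool, here is a minimal sketch in Python. I did this step by hand, so treat it as an illustration only. It assumes the third-party pypdf library is installed (pip install pypdf), and the function names and the “combined-01.pdf” naming pattern are placeholders I chose.

```python
from pathlib import Path


def batches(items, size=10):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def merge_in_batches(pdf_dir, out_dir, size=10):
    """Merge the PDFs in pdf_dir, `size` at a time, into combined files in out_dir."""
    # Third-party dependency (assumption): pip install pypdf
    from pypdf import PdfWriter

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pdfs = sorted(Path(pdf_dir).glob("*.pdf"))
    outputs = []
    for number, group in enumerate(batches(pdfs, size), start=1):
        writer = PdfWriter()
        for pdf in group:
            writer.append(str(pdf))  # add every page of this PDF
        target = out_dir / f"combined-{number:02d}.pdf"
        with open(target, "wb") as fh:
            writer.write(fh)
        writer.close()
        outputs.append(target)
    return outputs


# Example with placeholder folder names:
# merge_in_batches("newsletters_html/PDF", "combined")
```

The arithmetic is the useful part: 150 files in groups of 10 become 15 combined files, and 165 files become 17, both comfortably under the 50-source limit.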
Uploading to NotebookLM
Finally, I uploaded my merged files into a single notebook in NotebookLM.
You need a Google account. If you are already signed in, click “Create New” and it will prompt you to upload your sources. Remember, you’re limited to 50 sources.
In addition to PDFs, you can upload links to YouTube videos, websites, and Google Docs.
Once you finish the upload, you can really start to see the magic:
In the Studio section, you can choose to have NotebookLM automatically create:
An audio overview: a podcast-style conversation summarizing the content in the sources you uploaded.
A video overview: For me, this was a six-and-a-half–minute narrated slideshow covering the core ideas of my newsletter. This is an amazing piece of evergreen content I can share right away and keep sharing forever. Check it out:
A mind map: a visual map of my main themes, topics, and ideas.
I love the Mind Map because it organizes all my topics and ideas into groups. Then when you click on one of the boxes, NotebookLM gives you a summary of the information with links back to the sources.
If I click on the box that says Dormant Ties in the Studio section, it brings up the summary of information in the Chat. And if I click on one of the numbers in the Chat, it takes me to the Source document, in this case the newsletter, where the information comes from.
You can also create a Briefing document, a study guide, or an FAQ in the Studio section. I like seeing what this creates because it can be used as a quick draft for future content on my topic.
With that, my entire newsletter archive became searchable and reusable. I could ask it to surface tips I’d forgotten, generate a list of tactics I’d written about, or even spin up a week’s worth of LinkedIn posts around a single theme.
Instead of digging through my own history, I suddenly had a tool that handed it back to me, organized and ready.
For future content creation, I will probably copy and paste the info from the Chat or the Briefing/FAQ documents into ChatGPT and then have it create the content instead of asking NotebookLM to create it.
NotebookLM is good for summarizing existing content, but it doesn’t really take on new voices or personas the way ChatGPT or other LLMs like Gemini or Claude do.
Why This Matters
This process turned years of writing into something alive again. What used to be a static archive is now a working library — a system I can use to create, repurpose, and share.
It wasn’t effortless.
Exporting from Substack was messy.
Converting HTML files took a bit of coding help.
Merging files was tedious.
But the result was worth it: my newsletter is now a searchable, AI-powered knowledge base I can draw from anytime.
If you’ve been publishing for a while, whether it’s a blog, a newsletter, or even internal company updates, you can do the same.
It’s a way to make sure all those hours you’ve invested don’t just sit in an archive.
They keep paying dividends.
I’m Greg, The Introverted Networker. I send a weekly newsletter to teach people to be more confident networkers. If that’s you, consider subscribing.
If that’s not you, but you liked this article, feel free to follow me on Substack.
You never know when I might post something like this to help others make the most of their Substack publications.