Google’s NotebookLM AI model can create lifelike voice conversations around user-uploaded content.
Credit: Roger Dooley
Could podcasters be the next profession to be replaced by AI? What about radio and TV commentators?
When you look at the terrible AI-generated articles and useless chatbots we see today, you’re probably saying, “No way!” But before you dismiss the notion, let me share the results of an experiment I ran.
Spoiler: we’re not quite there yet, but we’re a lot closer than you might think.
Google’s NotebookLM AI model
In the last few days, I’ve been trying out NotebookLM, an experimental AI model from Google. It is a “RAG-locked” model, meaning it only uses information you upload, which makes it less prone to errors and hallucinations than ChatGPT, Claude, and other AI models.
For example, I uploaded my books “Friction” and “Brainfluence” to a notebook and asked NotebookLM to compare them. Within seconds, it produced a compelling comparison of the content and style of the two titles. Its analysis proved to be accurate, and it was able to extract citations without fabricating any.
Instant “podcast” creation
Here’s the surprising part: Google recently added a button to NotebookLM that creates a podcast-like audio conversation between two people. Not “humans” exactly, but AI voices. There are no controls or settings, just a “Generate” button. It uses whatever you have uploaded to that particular notebook: articles, research papers, books, etc.
The results were surprising.
After a short wait, I was greeted with a 15-minute conversation about the idea of Friction. The script and the audio were both excellent. The language was casual and conversational. The two speakers chatted back and forth, occasionally interrupting each other and adding thoughtful “um”s. Most importantly, they sounded completely human. Listening to short samples, it’s hard to believe the speakers are AI.
I continued experimenting and created a short video by combining two avatars from another AI tool, HeyGen, with the Google audio. The results were again surprising: the conversation video wasn’t perfect, but it was better than many of the videos of humans you can find on YouTube.
Because the Google audio is a single clip containing both voices, I had to create a male and a female version of the video and manually cut them together to match the voices. If I could export the two audio tracks separately, it would be much easier to create the video.
From a content and audio perspective, I’d say Google has managed to get past the “uncanny valley.” The video version with avatars is very good, but a little more likely to be detected as AI.
There are a variety of possible use cases if Google turns what is currently a demo into a working tool. To get there, it would need to let users choose the number of speakers, their ages, genders, accents, and languages, and specify length, topic details, etc. Generating video conversations with avatars within NotebookLM (or wherever this tool ends up) seems well within Google’s technical capabilities.
Even now, I find NotebookLM’s audio summarization feature useful. One colleague used it to turn a long article into a compelling audio summary. As my testing shows, it can also produce an easy-to-understand summary of a book. I would certainly consider offering an audio conversation version of an online article instead of, or in addition to, the audio narration version.
What would you use it for? Do you find this novel AI feature exciting, creepy, or just not that useful?