Thanks to major advances in text-to-video generation software in the past few years, AI-generated videos are now commonplace. OpenAI has Sora out in the wild, Google has Veo, and Adobe has its Firefly video model. In the past few weeks, we’ve also seen a few interesting video-generation AI models from China that tease impressive performance.
That means anyone with enough time and money for the AI credits needed to process video requests can create clips that look a lot like real videos captured with regular cameras.
We’ve seen all sorts of video concepts “shot” with genAI software, including deepfakes that place celebrities in AI clips. On that note, we shouldn’t be surprised to see anti-AI-deepfake laws pop up around the world in response to the growing number of people abusing the technology.
But some of the AI videos out there are impressive and deserve recognition. For example, someone made an incredible crime TV series using AI. What really blew my mind, though, was the “behind-the-scenes” clip they created for that fake TV show using the same AI software.
Reddit user ButchersBrain posted a clip that’s about six minutes long on the social platform. Titled Echoes of the Abyss Season 01, the clip follows Detective Quinn as she investigates strange murders in a small fishing town somewhere in a Scandinavian country.
The clip covers several episodes, but only through the “previously on…” recaps that precede episodes in actual TV shows. Still, they’re enough to give us an idea of the story and let us follow the characters.
Yes, the TV show has a set of recurring characters, including the star of the show. That’s a remarkable feat for a text-to-video AI program. Character permanence isn’t the only amazing thing about Echoes of the Abyss. The sets the AI imagined based on descriptions from the creator are incredible. You almost can’t tell you’re looking at an AI-generated clip.
The video isn’t perfect, as expected for AI-generated clips. You’ll notice facial, eye, and mouth inconsistencies, especially during speech. Also, AI still has trouble rendering text. But the result is mind-blowing nonetheless.
As the Redditor explained, this was all done with Google’s Veo 2 video generation tool. The team behind it also worked with Claude on the script and needed plenty of takes to come up with the final shots:
The basic outline of the story was written by myself. Claude3 helped with fine-tuning certain parts and refining dialogue. Every shot in this has its own prompt, and not every shot was a first try and put into edit. Some shots have around 24 generations to get to the final shot.
I’m telling you all this because you must see the clip above before I show you the even crazier behind-the-scenes video the same team created. This clip is just 30 seconds long, and watching it will blow your mind.
The video feels real. It’s made to look as if someone shot it with a phone while walking the various sets. You’ll recognize the settings from the fake TV show, some of the props, and even the leading actress. Then there’s the sound, the camera movement, and people talking in the background. It all feels real.
You’ll find inconsistencies if you inspect it closely, but imagine watching this video on a phone while you’re scrolling social media. You might not even realize it’s an AI-generated video in the first place.
The Redditor said in the comments section that the sound was also generated with AI software. The team used MMAudio for everything except the voice in the last shot, which was made with ElevenLabs AI voice tech.
All of this will only get better in the coming years. As the genAI programs get upgraded and the costs get lower, I wouldn’t be surprised to see actual movies and TV shows made entirely with AI pretty soon. This will also open the door to abuse, as creating fake news with AI video will become even more trivial.