After Dall-E and ChatGPT, OpenAI stunned the world once again with its new Sora AI about a month ago. Sora is a text-to-video generative AI app that can make incredible videos out of simple text prompts just like Dall-E generates images on the spot from a few lines of text.
However, as impressive as the Sora demo might have been, it wasn’t a public launch. OpenAI only showed off the product, saying it would be available to the Red Teaming Network, which is “a community of trusted and experienced experts that can help to inform [the company’s] risk assessment and mitigation efforts.”
That public release is coming soon, with OpenAI's Mira Murati saying in an interview that it will launch at some point this year. While an actual release date wasn't offered, the OpenAI exec seemed certain a 2024 public launch was in the cards for Sora.
Murati explained Sora to The Wall Street Journal's Joanna Stern, and to the rest of us, with the help of new Sora-generated clips that are viewable in the video at the end of this post.
We learned that Sora clips need a few minutes to generate. The demo clips are 20-second videos at HD resolution (720p). The processing cost of generating these clips exceeds that of Dall-E images or ChatGPT responses, but when Sora launches, OpenAI aims to make its videos about as affordable. Murati did not reveal any pricing details for Sora, however, or whether Sora will be available to ChatGPT Plus users.
The OpenAI exec also explained how the company trained Sora. The AI analyzed lots of videos from publicly available sources and learned to identify objects and actions. When analyzing a prompt, it then sketches out the scene based on that knowledge to generate a result. Here's a video showing Sora's capabilities that OpenAI shared a few weeks ago:
Murati named only Shutterstock as a source of the videos that trained the AI, but anything publicly available could have been used to build Sora. That might include videos from Facebook and YouTube, though the exec would not confirm those sources.
The Sora videos aren't perfect, as you'll see in the clip below. The AI might misinterpret prompts, and the clips can have continuity issues. But Sora will keep getting better, probably to the point where some of these videos look almost as good as real footage that someone recorded in the wild.
To that end, OpenAI wants to ensure the videos are clearly marked as AI creations. They'll have an OpenAI watermark and metadata to point that out. I wonder if that will be enough to prevent abuse, like someone purposefully creating clips meant to mislead people.
One protection that OpenAI built into Sora mimics Dall-E: you can't generate images of public figures in Dall-E, and the same will be true for Sora clips. OpenAI might also put additional protections in place for more sensitive prompts, such as those involving nudity.
Will Sora launch publicly before the elections? That's something we'll have to wait and see, especially since this year is rich in elections around the world. As for the US presidential election in November, OpenAI doesn't have a timeline.
The exec did say that misinformation and harmful bias are on the company's radar. She also made it clear that OpenAI will not release anything it's not confident about when it comes to the direct impact on global elections and other issues.
As for the obvious threat to Hollywood, Murati said OpenAI wants Sora to be a tool for "extending creativity." It wants creators to be part of the process and to inform how OpenAI develops and deploys Sora.
You should check out the video below for the full interview and see examples of Sora-generated clips. The animated bull in the china shop certainly takes the cake: