
ChatGPT delayed a feature that people shouldn’t be allowed to use anyway

Published May 9th, 2025 6:38PM EDT
OpenAI debuts ChatGPT o3 and o4-mini models.
Image: OpenAI


ChatGPT has grown a lot in the past few years, with OpenAI releasing several exciting features along the way. ChatGPT can now reason to offer more in-depth answers to questions, and it produces detailed Deep Research reports on any chosen topic. Also impressive is the chatbot’s ability to generate images and edit photos. Then there’s Operator, an AI agent that lets ChatGPT browse the web for you. On top of that, OpenAI has released various models, including preview modes, and further improved the default ChatGPT model that most people use.

But there’s one AI tool that OpenAI hasn’t brought to ChatGPT or released as a separate AI program, despite announcing it more than a year ago. It’s called Voice Engine, a piece of AI software that can clone a voice after listening to a single 15-second audio sample.

Needless to say, that’s an incredibly scary feature to release out in the wild. I warned you about how dangerous it is the minute OpenAI announced it in late March 2024.

Voice cloning has abuse written all over it. I’m not referring just to malicious actors creating fake audio files by cloning the voices of politicians and celebrities, or hackers trying to swindle you. I’m also thinking about the average Joe who might think it’s fun to clone a friend’s voice and have them say god knows what.

More than a year later, OpenAI’s voice cloning tool still isn’t widely available in ChatGPT or as a standalone app. It’s only accessible to a short list of partners, and there’s no telling when OpenAI will release it into the wild.

I’m hoping that happens in a distant future, one where the broader public is AI-savvy enough to tell cloned audio from a real voice, or where OpenAI and other AI firms have developed tech that clearly labels cloned voices as AI-generated.

I’m not saying there aren’t legitimate uses for AI-powered voice-cloning tools. You could use such a tool for dubbing movies and TV shows in other languages while keeping the actor’s original voice. That’s a compelling use for AI-generated audio.

People with speech impediments, or those who lose their voices due to medical conditions, could also use such a ChatGPT tool to keep speaking to others in their own voice.

Similarly, the ability to translate spoken language in real time while preserving the speaker’s voice and tone could be incredibly useful in situations where other translation tools aren’t available or as effective.

But regular people getting access to Voice Engine in ChatGPT or elsewhere will surely abuse it. Just look at what happened with all the deepfake images ChatGPT users created after the 4o image generation tool was released. And remember that OpenAI used laxer safety policies when releasing that tool.

Having Voice Engine out in the wild, with similarly lax safety policies in place, would only make it easier for malicious actors to abuse it for nefarious purposes.

Thankfully, it doesn’t look like OpenAI plans to release Voice Engine widely anytime soon. The AI firm told TechCrunch that it continues to test the feature with a limited set of trusted partners:

[We’re] learning from how [our partners are] using the technology so we can improve the model’s usefulness and safety. We’ve been excited to see the different ways it’s being used, from speech therapy, to language learning, to customer support, to video game characters, to AI avatars.

TechCrunch points out that OpenAI wanted to release Voice Engine to its API on March 7, 2024, as Custom Voices. The original plan was to entrust 100 developers with the feature, as long as they were building apps providing a “social benefit,” or showed “innovative and responsible” uses of the technology. OpenAI even trademarked it and set prices for it.

But Voice Engine never became available. Instead, OpenAI postponed the launch and publicly announced Voice Engine later that month, without opening sign-ups.

I think that was and still is the better move. Again, the massive success of ChatGPT’s new image generation powers is proof that people will abuse AI technology that’s easy to use.

OpenAI isn’t the only AI lab creating voice-cloning tools. We’ve already seen deepfakes involving AI tools that let people clone the voices of celebrities for malicious purposes. We’ve also heard of scams using phone calls in which hackers cloned the voices of other people, including loved ones.

All that happened without ChatGPT offering users a Voice Engine mode to clone voices. But having OpenAI release such a tool could make it even easier for malicious actors to use it for all sorts of schemes.

It would also be incredibly affordable, assuming last year’s prices that TechCrunch reported remain in place. OpenAI wanted to charge $15 per million tokens for standard voices and $30 per million tokens for HD-quality voices. That’s extremely cheap, especially if you want to use the tech to manipulate people with deepfakes or run more sophisticated attacks involving cloned voices.

Thankfully, OpenAI was aware of Voice Engine’s potential for abuse, calling out those risks in last year’s blog post. That likely explains the continued delay: OpenAI may have wanted to avoid controversy in an election year, which could be why Voice Engine didn’t launch last year. But elections will keep coming.

Also, reports have pointed out that AI voice cloning was the third fastest-growing scam of 2024. That’s an even bigger reason to keep Voice Engine out of most people’s hands.

Chris Smith, Senior Writer

Chris Smith has been covering consumer electronics ever since the iPhone revolutionized the industry in 2007. When he’s not writing about the most recent tech news for BGR, he closely follows the events in Marvel’s Cinematic Universe and other blockbuster franchises.

Outside of work, you’ll catch him streaming new movies and TV shows, or training to run his next marathon.