The more you use ChatGPT and similar generative AI services, the better you’ll get at issuing prompts that deliver the results you want. And you’ll discover new features and capabilities that aren’t immediately obvious to new users. But it turns out there’s a brilliant ChatGPT feature that you’re probably not using, and that’s because it’s hidden in plain sight inside the official ChatGPT app for iPhone.
It’s a voice transcription feature so good it’ll completely change how you transcribe audio.
Voice transcription is only available on ChatGPT for iPhone
You don’t use your voice to talk to ChatGPT on a computer. You load the generative AI in a browser and start typing away. But I fully expect to use voice to talk to ChatGPT-like AI products in the future, especially on devices like Apple’s new Vision Pro spatial computer that can benefit from such functionality.
It turns out that OpenAI added voice input to the iPhone app, a feature I hadn’t noticed until stumbling on Insider’s piece describing how surprisingly good the transcription is.
Allow the official app to access your iPhone’s microphone, tap the icon on the right of the text input field, and ChatGPT will record the audio it hears and turn it into text. That’s a great feature if you work with audio recordings like interviews, podcasts, and videos and need to extract text from them.
Insider found that ChatGPT could accurately transcribe someone’s speech down to the punctuation. The difference between ChatGPT on iPhone and a different AI app was “remarkable.” Insider says that “you can almost hear the person speaking” in the ChatGPT version.
How good is it?
Naturally, I went on to test it on a recent one-minute promo clip from Marvel’s Secret Invasion Disney Plus TV show, which you’ll see below.
ChatGPT recorded everything, and then I just sent that transcription as a prompt. Silly ChatGPT offered a response, which I ignored. The point here was to test the transcription feature, which is really good. As a bonus, I now have the same chat on my Mac, where I can export the transcription and edit it.
I will say the transcription isn’t perfect, but that might be my fault. I should have raised the audio volume.
Another thing to note here is that the transcription doesn’t distinguish between different people talking. But that’s because it’s not designed to do so. Therefore, if you plan on using ChatGPT as a voice transcription service, you’ll need to keep track of who is speaking yourself.
The good news in all of this is that OpenAI CEO Sam Altman told Insider that ChatGPT uses another OpenAI tech called Whisper. And Whisper is so good because OpenAI trained it on large amounts of audio from the internet paired with existing transcripts, rather than on a small, carefully hand-labeled dataset.
That means the AI learned to handle all kinds of speech by processing a huge and varied collection of real-world audio.
I’m excited about the future of using voice to talk to AI
With that in mind, there’s clearly a case here for transcription apps built with Whisper that would offer additional features, like recognizing each individual speaker, offering timestamps, and letting you quickly browse through an audio file via prompts. And these apps should do the transcription almost instantaneously, without you actually opening the file.
While I’m just shooting ideas off the top of my head, I’m sure we’ll see such AI services down the road from OpenAI or other companies.
Insider already found an app like that for Mac. Whisper Transcription is available as a free download from the Mac App Store. It’s been there for more than four years, apparently.
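To give the timestamp idea a concrete shape, here’s a minimal Python sketch. It assumes a transcription step has already produced a list of segments, each with `start` and `end` times in seconds plus a `text` field, similar to what the open-source Whisper package reports for its segments. The function names and the sample data are made up for illustration, and this isn’t how ChatGPT or any particular app does it.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as MM:SS (fractions are dropped)."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def timestamped_transcript(segments: list[dict]) -> str:
    """Build one line per segment, prefixed with its start and end time.

    Assumption: every segment is a dict with "start", "end" (seconds)
    and "text" keys.
    """
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return "\n".join(lines)


# Hypothetical sample data standing in for real transcription output.
sample_segments = [
    {"start": 0.0, "end": 4.2, "text": " First sentence of the recording."},
    {"start": 4.2, "end": 65.0, "text": " Second sentence."},
]

print(timestamped_transcript(sample_segments))
```

With timestamps attached to every line, jumping to a specific moment in a long interview becomes a simple text search instead of scrubbing through the audio.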
Also, Whisper tech makes me excited about having voice-based conversations with AI on the Vision Pro. But we’ll cross that bridge when we get there.
That said, here’s the clip that ChatGPT transcribed for me, so you can see for yourself how good the feature is.