© 2025 OnlyTrustedInfo.com . All Rights Reserved.

OpenAI upgrades its transcription and voice-generating AI models

Last updated: March 20, 2025 1:00 pm
Oliver James

OpenAI is bringing new transcription and voice-generating AI models to its API that the company claims improve upon its previous releases.

For OpenAI, the models fit into its broader “agentic” vision: building automated systems that can independently accomplish tasks on behalf of users. The definition of “agent” might be in dispute, but OpenAI Head of Product Olivier Godement described one interpretation as a chatbot that can speak with a business’s customers.

“We’re going to see more and more agents pop up in the coming months,” Godement told TechCrunch during a briefing. “And so the general theme is helping customers and developers leverage agents that are useful, available, and accurate.”

OpenAI claims that its new text-to-speech model, “gpt-4o-mini-tts,” not only delivers more nuanced and realistic-sounding speech but is more “steerable” than its previous-gen speech-synthesizing models. Developers can instruct gpt-4o-mini-tts on how to say things in natural language — for example, “speak like a mad scientist” or “use a serene voice, like a mindfulness teacher.”

Sample clips published with the article include a “true crime-style,” weathered voice and a female “professional” voice; the audio is not reproduced here.

Jeff Harris, a member of the product staff at OpenAI, told TechCrunch that the goal is to let developers tailor both the voice “experience” and “context.”

“In different contexts, you don’t just want a flat, monotonous voice,” Harris continued. “If you’re in a customer support experience and you want the voice to be apologetic because it’s made a mistake, you can actually have the voice have that emotion in it […] Our big belief, here, is that developers and users want to really control not just what is spoken, but how things are spoken.”
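
To make the split between what is spoken and how it is spoken concrete, here is a minimal Python sketch of a steerable speech request. It is an illustration, not OpenAI's reference code: the endpoint URL, the field names (`model`, `voice`, `input`, `instructions`), and the `alloy` voice are assumptions based on OpenAI's public REST API and should be checked against the current API reference. The helper names `build_speech_request` and `synthesize` are hypothetical.

```python
import json
import os
import urllib.request

# Assumed endpoint for speech synthesis; verify against the API reference.
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text: str, style: str, voice: str = "alloy") -> dict:
    """Build the JSON body for a steerable text-to-speech request."""
    return {
        "model": "gpt-4o-mini-tts",
        "voice": voice,          # assumed voice name
        "input": text,           # what is spoken
        "instructions": style,   # how it is spoken, in natural language
    }


def synthesize(payload: dict, out_path: str) -> None:
    """Send the request and write the returned audio bytes to out_path.

    Needs network access and an OPENAI_API_KEY environment variable,
    so it is defined here but not called.
    """
    request = urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())


# The customer-support case Harris describes: same words, apologetic delivery.
payload = build_speech_request(
    "Sorry about that. I've corrected your order.",
    "Sound apologetic and calm, like a support agent who made a mistake.",
)
print(json.dumps(payload, indent=2))
```

Changing only the `instructions` string (for example, to “speak like a mad scientist”) would leave the spoken text untouched while altering the delivery. Calling `synthesize(payload, "apology.mp3")` would send the request, assuming the service returns MP3 audio by default.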

As for OpenAI’s new speech-to-text models, “gpt-4o-transcribe” and “gpt-4o-mini-transcribe,” they effectively replace the company’s long-in-the-tooth Whisper transcription model. Trained on “diverse, high-quality audio datasets,” the new models can better capture accented and varied speech, OpenAI claims, even in chaotic environments.

They’re also less likely to hallucinate, Harris added. Whisper notoriously tended to fabricate words — and even whole passages — in conversations, introducing everything from racial commentary to imagined medical treatments into transcripts.

“[T]hese models are much improved versus Whisper on that front,” Harris said. “Making sure the models are accurate is completely essential to getting a reliable voice experience, and accurate [in this context] means that the models are hearing the words precisely [and] aren’t filling in details that they didn’t hear.”

Your mileage may vary depending on the language being transcribed, however.

According to OpenAI’s internal benchmarks, gpt-4o-transcribe, the more accurate of the two transcription models, has a “word error rate” approaching 30% for Indic and Dravidian languages like Tamil, Telugu, Malayalam, and Kannada. That means roughly three out of every 10 words in the model’s output differ from a reference transcript in those languages.

Figure: OpenAI’s word error rate (WER) results for gpt-4o-transcribe, from the company’s internal speech recognition benchmarks. Image credits: OpenAI
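
For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the model's output into a reference transcript, divided by the number of words in the reference. The Python sketch below shows that standard calculation on a made-up sentence pair. It is not OpenAI's benchmark code, and the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: a 10-word reference and an output with three errors
# ("fox" -> "box", a dropped "the", "today" -> "todays").
reference = "the quick brown fox jumps over the lazy dog today"
hypothesis = "the quick brown box jumps over lazy dog todays"
print(f"WER: {word_error_rate(reference, hypothesis):.0%}")  # prints "WER: 30%"
```

Because insertions count as errors, the rate can exceed 100% when the output contains many words that are not in the reference.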

In a break from tradition, OpenAI doesn’t plan to make its new transcription models openly available. The company historically released new versions of Whisper for commercial use under an MIT license.

Harris said that gpt-4o-transcribe and gpt-4o-mini-transcribe are “much bigger than Whisper” and thus not good candidates for an open release.

“[T]hey’re not the kind of model that you can just run locally on your laptop, like Whisper,” he continued. “[W]e want to make sure that if we’re releasing things in open source, we’re doing it thoughtfully, and we have a model that’s really honed for that specific need. And we think that end-user devices are one of the most interesting cases for open-source models.”
