OpenAI announced Thursday that its API will now include a host of new voice intelligence features designed to help developers create apps that can speak, transcribe, and translate conversations with users.
The company's new GPT-Realtime-2 is a voice model built to hold realistic spoken conversations with users. Unlike its predecessor, GPT-Realtime-1.5, this one is built with GPT-5-class reasoning, and OpenAI says it was created to handle more complex requests from users.
The company is also announcing GPT-Realtime-Translate. As the name suggests, it is designed to provide a real-time translation service that “keeps pace” with the user in a conversational format. It supports more than 70 input languages (the languages the model can understand) and 13 output languages (the languages it can speak back to the listener).
Finally, the company also announced a new transcription feature, GPT-Realtime-Whisper. This provides users with live speech-to-text capabilities that capture interactions as they occur.
“Together, the models we are launching move real-time audio from simple call-and-response to a working voice interface that allows you to listen, reason, translate, transcribe, and take action as the conversation unfolds,” the company said.
Who will benefit from these updates? The obvious target is businesses looking to expand their customer service capabilities. However, OpenAI also says its new features will benefit a wide range of sectors, including education, media, events, and creator platforms.
While these tools may be useful from an enterprise perspective, they can also be easily exploited. The company said it has built guardrails to keep the new features from being misused for spam, fraud, and other forms of online abuse. According to OpenAI, the system has certain triggers built in that can “stop a conversation if it is detected to violate harmful content guidelines.”
All new voice models are included in OpenAI’s Realtime API. Translate and Whisper are charged per minute, while GPT-Realtime-2 is charged based on token consumption.
