Aug 29, 2025 at 4:53 AM

OpenAI updates the Realtime API with gpt-realtime, its most advanced voice AI model yet

OpenAI’s Realtime API is now generally available after first launching in October 2024, and it brings what the company calls its best voice AI model yet: gpt-realtime. This speech-to-speech system processes and generates audio directly, without converting it to text first, which makes interactions faster and more natural. It can interpret nonverbal cues, call functions, switch languages mid-sentence, adjust tone or accent, and generate speech with emotional inflections. Benchmark results highlight its progress: it scores 82.8% on Big Bench Audio, 30.5% on MultiChallenge, and 66.5% on ComplexFuncBench.

Developers also gain new integration options: support for Session Initiation Protocol (SIP) to enable phone calling, and support for remote Model Context Protocol (MCP) servers to connect external tools and services. Additional features include reusable prompts, token limits, and session-trimming controls to manage costs. Image input support lets the model read text from screenshots or photos and answer questions about their content, with permissions configurable by developers.

OpenAI also added two new synthetic voices, Cedar and Marin, alongside updates to existing ones. Pricing has been reduced by 20%, with audio input tokens at $32 per million and cached input tokens at $0.40 per million. For EU users and privacy-sensitive businesses, data can be stored within the European Union under stricter compliance rules. The updated tools are available now through the Playground and the official API documentation.
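To put the quoted rates in perspective, here is a minimal sketch of the arithmetic for the input side of a session. The two rates come from the article; the function name `estimate_input_cost` and the token counts in the usage line are hypothetical, chosen only for illustration. Output tokens are left out because the article does not quote a rate for them.

```python
from decimal import Decimal

# Rates quoted in the article, in US dollars per one million tokens.
AUDIO_INPUT_PER_MILLION = Decimal("32")
CACHED_INPUT_PER_MILLION = Decimal("0.40")
MILLION = Decimal(1_000_000)


def estimate_input_cost(audio_input_tokens: int, cached_input_tokens: int = 0) -> Decimal:
    """Estimate the input-side cost in USD for one session (hypothetical helper).

    Output tokens are not included: the article gives no rate for them.
    """
    fresh = Decimal(audio_input_tokens) * AUDIO_INPUT_PER_MILLION / MILLION
    cached = Decimal(cached_input_tokens) * CACHED_INPUT_PER_MILLION / MILLION
    return fresh + cached


if __name__ == "__main__":
    # Hypothetical session: 50,000 fresh audio input tokens and 200,000 cached ones.
    print(estimate_input_cost(50_000, 200_000))
```

At these rates a cached token costs 1/80 of a fresh audio input token, so sessions that reuse a long shared context are billed mostly for the new audio they add.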

Aug 29, 2025 by Mauricio B. Holguin

External links

Introducing gpt-realtime and Realtime API updates for production voice agents (OpenAI Blog, official source)
OpenAI gives its voice agent superpowers to developers - look for more apps soon (ZDNET)
OpenAI’s real-time API picks up laughter, accents, and switches languages in real time (THE DECODER)
OpenAI and Microsoft debut new voice models (SiliconANGLE)
