OpenAI updates the Realtime API with gpt-realtime, its most advanced voice AI model yet

OpenAI’s Realtime API is now generally available after first launching in October 2024, bringing what the company calls its best voice AI model yet: gpt-realtime. This speech-to-speech system processes and generates audio directly, without an intermediate conversion to text, which delivers faster and more natural interactions. It can interpret nonverbal cues, call functions, switch languages mid-sentence, adjust tone or accent, and generate speech with emotional inflections. Benchmark results highlight its progress: 82.8% on Big Bench Audio, 30.5% on MultiChallenge, and 66.5% on ComplexFuncBench.

Developers also gain enhanced integration options, including support for Session Initiation Protocol (SIP) to enable phone calling and remote Model Context Protocol (MCP) servers for connecting external tools and services. Additional features include reusable prompts, token limits, and session-trimming controls to manage costs. Image input support enables screenshots or photos to be processed for text reading or content-based queries, with permissions configurable by developers.

OpenAI also added two new synthetic voices, Cedar and Marin, alongside updates to existing ones. Pricing has been reduced by 20%, with audio input tokens at $32 per million and cached input tokens at $0.40 per million. For EU users and privacy-sensitive businesses, data can be stored within the European Union under stricter compliance rules. The updated tools are available now through the Playground and official API documentation.
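
To put the quoted prices in perspective, here is a minimal Python sketch of the arithmetic. It uses only the two rates stated above; the function names are illustrative, and the assumption that the 20% cut applies directly to the audio input rate is ours, not something the announcement spells out. Output-token pricing is not covered here because the article does not quote it.

```python
# Rates quoted in the article, in USD per one million tokens.
AUDIO_INPUT_PER_MILLION = 32.00
CACHED_INPUT_PER_MILLION = 0.40


def estimate_input_cost(audio_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate input cost in USD for a mix of fresh audio and cached tokens."""
    total = (
        audio_tokens * AUDIO_INPUT_PER_MILLION
        + cached_tokens * CACHED_INPUT_PER_MILLION
    )
    return total / 1_000_000


def price_before_cut(current_price: float, reduction: float = 0.20) -> float:
    """Back out the earlier rate, assuming the cut applied to this rate directly."""
    return current_price / (1 - reduction)


if __name__ == "__main__":
    # Hypothetical workload: 500k fresh audio input tokens, 2M cached tokens.
    print(f"Estimated input cost: ${estimate_input_cost(500_000, 2_000_000):.2f}")
    print(f"Implied earlier audio rate: ${price_before_cut(AUDIO_INPUT_PER_MILLION):.2f} per million")
```

Under that assumption, a 20% cut landing at $32 implies an earlier audio input rate of about $40 per million tokens.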

by Mauricio B. Holguin
