OpenAI Releases Three Realtime Voice API Models with GPT-5-Class Reasoning

LLM | May 16, 2026 | By Insights AI

OpenAI has released three new real-time voice models in its API and moved the Realtime API from beta to general availability. The models target three kinds of voice application: live reasoning, multilingual translation, and streaming transcription.

The Three New Models

  • GPT-Realtime-2: OpenAI's first voice model with GPT-5-class reasoning, capable of handling complex requests, calling multiple tools simultaneously, and managing interruptions while keeping the conversation flowing naturally. Priced at $32 per million audio input tokens and $64 per million output tokens.
  • GPT-Realtime-Translate: A live translation model that translates from 70+ input languages into 13 output languages, keeping pace with the speaker in real time. Priced at $0.034/minute.
  • GPT-Realtime-Whisper: A streaming speech-to-text model that transcribes as the speaker talks. Priced at $0.017/minute.
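
The two pricing schemes above (per token for GPT-Realtime-2, per minute for the other two models) can be compared with a rough cost estimate. The sketch below uses only the prices quoted in this article. The function names and the token counts in the usage note are hypothetical, and it assumes no other charges (such as text tokens or discounts) apply, which the article does not confirm.

```python
from decimal import Decimal

# Prices as quoted in the article, in USD.
REALTIME_2_INPUT_PER_MILLION = Decimal("32")   # per 1M audio input tokens
REALTIME_2_OUTPUT_PER_MILLION = Decimal("64")  # per 1M output tokens
TRANSLATE_PER_MINUTE = Decimal("0.034")
WHISPER_PER_MINUTE = Decimal("0.017")

ONE_MILLION = Decimal(1_000_000)


def realtime_2_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate GPT-Realtime-2 cost from audio token counts (hypothetical helper)."""
    input_cost = Decimal(input_tokens) * REALTIME_2_INPUT_PER_MILLION / ONE_MILLION
    output_cost = Decimal(output_tokens) * REALTIME_2_OUTPUT_PER_MILLION / ONE_MILLION
    return input_cost + output_cost


def per_minute_cost(minutes: int, rate_per_minute: Decimal) -> Decimal:
    """Estimate cost for a model billed per minute of audio (hypothetical helper)."""
    return Decimal(minutes) * rate_per_minute


if __name__ == "__main__":
    # Illustrative workload only; real token counts depend on the audio.
    print("GPT-Realtime-2:", realtime_2_cost(1_000_000, 500_000))
    print("Translate, 60 min:", per_minute_cost(60, TRANSLATE_PER_MINUTE))
    print("Whisper, 60 min:", per_minute_cost(60, WHISPER_PER_MINUTE))
```

At the quoted rates, an hour of live translation comes to $2.04 and an hour of streaming transcription to $1.02. `Decimal` is used instead of floats so that small per-minute rates add up without rounding drift.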

Realtime API Goes GA

The Realtime API exits beta with this release and is now generally available for production use. Developers can build voice apps that process audio directly in a continuous stream, avoiding the added latency of separate transcription and synthesis stages. Full details are in the OpenAI blog post.
