OpenAI Launches Three Realtime Voice Models with GPT-5-Class Reasoning and 70-Language Live Translation

LLM | May 11, 2026 | By Insights AI

OpenAI launched three specialized Realtime voice models through its Realtime API on May 7, 2026: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. With the launch, the Realtime API exited beta and became generally available, a milestone for developers who had been waiting to build production voice systems on the platform.

GPT-Realtime-2: GPT-5-Class Voice Agent

GPT-Realtime-2 is OpenAI's most capable voice model, bringing GPT-5-class reasoning to real-time conversations. The context window expands from 32,000 to 128,000 tokens, enabling longer multi-step workflows. The model handles interruptions gracefully and supports tool calls within live voice sessions.

GPT-Realtime-Translate: 70-Language Live Translation

GPT-Realtime-Translate streams live translation from 70+ input languages into 13 output languages, at $0.034 per minute. The model is designed for customer support and interpretation applications that previously required dedicated translation infrastructure.

GPT-Realtime-Whisper: Streaming Transcription

GPT-Realtime-Whisper converts speech to text as the speaker talks, with streaming output that updates live. At $0.017 per minute, it is the most affordable of the three. Use cases include live captioning, automated meeting notes, and accessibility tooling.
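To put the per-minute prices in context, here is a rough cost sketch in Python. It uses only the two rates quoted in this article and assumes billing scales linearly with audio minutes, with no minimums, rounding rules, or volume discounts; the actual billing terms are not described here. The function name `estimate_cost` is hypothetical and is not part of any OpenAI SDK.

```python
from decimal import Decimal

# Per-minute prices in USD, as quoted in this article.
PRICE_PER_MINUTE = {
    "GPT-Realtime-Translate": Decimal("0.034"),
    "GPT-Realtime-Whisper": Decimal("0.017"),
}


def estimate_cost(model: str, minutes: float) -> Decimal:
    """Estimate the USD cost of `minutes` of audio for `model`.

    Assumption: cost is simply rate * minutes (linear billing).
    The result is rounded to whole cents.
    """
    rate = PRICE_PER_MINUTE[model]
    return (rate * Decimal(str(minutes))).quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Cost of one hour of audio for each model.
    for model in PRICE_PER_MINUTE:
        print(f"{model}: 60 min = ${estimate_cost(model, 60)}")
```

Under these assumptions, an hour of live translation comes to about $2.04 and an hour of streaming transcription to about $1.02.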

Full announcement: OpenAI blog.
