OpenAI launched GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper — new voice API models covering live reasoning, real-time translation across 70+ languages, and streaming transcription. The Realtime API is now generally available for production use.
OpenAI on May 7 released three new realtime audio models in its API: GPT-Realtime-2 with GPT-5-class reasoning, GPT-Realtime-Translate for live speech translation across 70+ languages, and GPT-Realtime-Whisper for streaming transcription.
Anthropic's Claude Platform is now generally available on AWS, offering full Claude API feature parity with AWS IAM authentication, CloudTrail audit logging, and a single AWS invoice that retires against existing commitments.
OpenAI released GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper on May 7, 2026 via the Realtime API, which simultaneously exited beta. The suite brings GPT-5-class reasoning to live voice agents, real-time translation across 70+ languages, and streaming transcription.
Google has updated the Gemini API File Search tool to support multimodal content including images, audio, and video, making it easier for developers to build efficient, verifiable RAG systems.
OpenAI has launched GPT-Realtime-2 in its API, bringing GPT-5-class reasoning to real-time voice interactions. The release also includes GPT-Realtime-Translate for live multilingual speech translation and GPT-Realtime-Whisper for streaming transcription.
xAI has released Grok 4.3 on its API, claiming top spots on agentic tool calling and instruction-following leaderboards, and ranking #1 in enterprise domains such as case law and corporate finance. It supports a 1M token context window at $1.25/M input and $2.50/M output.
A benchmark comparing vision agents (browser-use) to structured API agents on the same admin panel found vision agents cost roughly 45x more — and failed to complete the task without a 14-step explicit walkthrough.
xAI officially launched Voice Cloning through its API, allowing users to clone a custom voice in under 2 minutes or select from 80+ pre-built voices across 28 languages for voice agents, audiobooks, and game characters.
xAI officially launched Voice Cloning through its API, allowing users to clone a custom voice in under 2 minutes or select from 80+ pre-built voices across 28 languages for voice agents, audiobooks, and game characters.
HN did not greet GPT-5.5 with applause first. The thread went straight to pricing, context tiers, and whether the model actually behaves better once real coding work starts.
Why it matters: API availability is the moment a flagship model becomes something teams can actually wire into products. OpenAI’s developer account says GPT-5.5 brings fewer retries, and the official release page now lists API access with a 1M context window and updated pricing.