xAI Launches Voice Cloning API: Create a Custom Voice in Under 2 Minutes
xAI Launches Voice Cloning via API
xAI officially launched Voice Cloning through its API on May 1, 2026. Users can clone a custom voice in under two minutes from a brief audio recording, or choose from a library of 80+ pre-built voices across 28 languages for use in voice agents, audiobooks, video game characters, and more.
Two-Stage Verification for Voice Ownership
Creating a custom voice requires two-stage verification. First, the user reads a verification phrase aloud, which the speech-to-text engine transcribes and matches in real time. Then, speaker embeddings are computed to confirm that the voice belongs to the same person. Together, the two checks prevent cloning from pre-existing recordings or from someone else's voice.
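The two checks can be sketched in a few lines of Python. This is an illustration only, not xAI's implementation: the function names, the exact-match rule after text normalization, the use of cosine similarity, and the 0.75 threshold are all assumptions, and the embeddings are taken as plain lists of floats produced by some speaker-encoder model that is not shown here.

```python
import math
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace before comparing."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(text.split())


def phrase_matches(expected: str, transcript: str) -> bool:
    """Stage 1 (assumed rule): the transcript must equal the prompted phrase."""
    return normalize(expected) == normalize(transcript)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_voice_ownership(
    expected_phrase: str,
    transcript: str,
    live_embedding: list[float],
    sample_embedding: list[float],
    threshold: float = 0.75,  # hypothetical cutoff, not a documented value
) -> bool:
    """Pass only if the phrase matches AND the two voices look alike."""
    if not phrase_matches(expected_phrase, transcript):
        return False
    # Stage 2 (assumed rule): compare the live reading with the cloning sample.
    return cosine_similarity(live_embedding, sample_embedding) >= threshold
```

Running the phrase check first means a replayed recording that does not contain the prompted phrase fails before any embedding comparison is made. A production system would likely tolerate small transcription errors and tune the threshold on real data.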
80+ Voice Library Across 28 Languages
The Voice Library includes over 80 voices spanning 28 languages. Developers can preview, select, and manage voices directly from the xAI console. Custom voices integrate immediately with Grok Text to Speech and Voice Agent APIs at no additional charge.
Key Use Cases
- Voice Agents: Personalized AI assistants and customer service bots
- Audiobooks: Content narrated in an author's actual voice
- Gaming: Unique character voices at scale
With this launch, xAI expands Grok's audio capabilities and gives developers a personalization toolkit for voice-first applications.
Related Articles
Why it matters: xAI has turned the Grok Voice stack into standalone STT/TTS APIs with batch transcription at $0.10/hour and streaming at $0.20/hour. The post puts 25+ languages, diarization, and word-level timestamps in direct competition with enterprise transcription tools.
xAI said on March 16, 2026 that Grok's Text-to-Speech API is now available. xAI's own voice docs describe a beta API with five voices, inline speech tags, telephony-friendly codecs, and a streaming WebSocket mode for low-latency applications.