LiveKit adds xAI TTS to Inference with 20+ languages and no separate API key

LiveKit announced on X on March 16, 2026, that xAI’s Grok text-to-speech is now available inside LiveKit Inference. The post describes the integration as a low-latency, production-ready path for voice agents, highlighting multilingual support, telephony readiness, and simpler access for developers building real-time voice systems.

The linked LiveKit documentation fills in the implementation details. It says xAI TTS is available through LiveKit Agents via both LiveKit Inference and a direct xAI plugin. For the managed path, developers can use the model xai/tts-1 without provisioning a separate xAI API key, which lowers the setup overhead for teams already running their agents on LiveKit’s stack.

LiveKit also says the model supports more than 20 languages, including English, Japanese, Korean, Chinese, Hindi, Portuguese, Spanish, Turkish, and Vietnamese. The docs show that developers can select a voice directly in an AgentSession and optionally pass language settings and other parameters through the inference TTS class. That positions the integration as a first-class component of the broader LiveKit agent framework rather than a generic wrapper.
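As a rough illustration of the managed path, the sketch below collects the settings the docs describe (model, voice, optional language) and hands them to the inference TTS class. Only the model ID xai/tts-1 comes from the article. The helper names, the voice value "example-voice", and the assumption that inference.TTS accepts model, voice, and language keyword arguments are illustrative and should be checked against the current LiveKit documentation.

```python
from typing import Any, Optional

XAI_TTS_MODEL = "xai/tts-1"  # model ID named in the LiveKit docs, per the article


def build_tts_kwargs(voice: str, language: Optional[str] = None) -> dict[str, Any]:
    """Collect keyword arguments for the inference TTS class (hypothetical helper)."""
    if not voice:
        raise ValueError("voice must be a non-empty string")
    kwargs: dict[str, Any] = {"model": XAI_TTS_MODEL, "voice": voice}
    if language is not None:
        # Language is optional; omit it to fall back to the service default.
        kwargs["language"] = language
    return kwargs


def make_session(voice: str, language: Optional[str] = None):
    """Build an AgentSession that speaks through LiveKit Inference.

    Assumption: inference.TTS takes model/voice/language keyword arguments.
    Requires the livekit-agents package and LiveKit credentials at runtime,
    so the import is deferred until this function is called.
    """
    from livekit.agents import AgentSession, inference

    return AgentSession(tts=inference.TTS(**build_tts_kwargs(voice, language)))


# "example-voice" is a placeholder, not a real xAI voice name.
print(build_tts_kwargs("example-voice", language="ja"))
```

Keeping the settings in a plain dictionary makes it straightforward to switch voices or languages per deployment without touching the session setup. No xAI API key appears anywhere in this path.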

For teams that want direct control, LiveKit also documents a separate plugin path that uses XAI_API_KEY and the livekit-agents[xai] package. That split matters: it gives developers a choice between the convenience of LiveKit Inference and a direct vendor integration when they need their own authentication, billing, or custom deployment setup.
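One way a team might encode that choice is to key it off the environment: use the direct plugin when XAI_API_KEY is present, and fall back to LiveKit Inference otherwise. The sketch below is an assumption about how an application could make the decision, not something the LiveKit docs prescribe. The function name choose_tts_path and its return values are hypothetical.

```python
import os
from typing import Mapping, Optional


def choose_tts_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "plugin" when an xAI key is configured, else "inference".

    Hypothetical policy: a non-empty XAI_API_KEY signals that the team wants
    its own authentication and billing with xAI, so the direct plugin
    (installed via livekit-agents[xai]) is preferred. Otherwise the managed
    LiveKit Inference path is used, which needs no separate xAI key.
    """
    env = os.environ if env is None else env
    key = env.get("XAI_API_KEY", "").strip()
    return "plugin" if key else "inference"


# Passing an explicit mapping keeps the decision easy to test.
print(choose_tts_path({}))                        # inference
print(choose_tts_path({"XAI_API_KEY": "sk-demo"}))  # plugin
```

Treating a blank or whitespace-only key as unset avoids accidentally selecting the plugin path in environments where the variable is defined but empty.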

The significance is broader than one TTS connector. Voice agents are becoming more multimodal, more international, and more tightly integrated with phone systems and real-time application flows. By adding xAI TTS to LiveKit Inference, LiveKit is making it easier for developers to plug another frontier-model vendor into that stack without rebuilding their audio pipeline from scratch.
