xAI is turning voice agents into production software, not a demo. Grok Voice Think Fast 1.0 tops τ-voice Bench, supports 25+ languages, and xAI says the same stack is driving a 20% sales-conversion rate and a 70% support-resolution rate at Starlink.
#voice-agents
Mistral AI said on March 26, 2026 that Voxtral TTS offers expressive speech, support for 9 languages and dialects, low latency, and easy adaptation to new voices. Mistral’s March 23 launch post says the 4B-parameter model can adapt from about three seconds of reference audio, reaches roughly 70ms model latency, supports up to two minutes of native audio generation, and is available by API and as open weights.
Google AI said on March 26, 2026 that Gemini 3.1 Flash Live is launching for developers building real-time voice and vision agents. Google highlighted faster natural dialogue, better task completion in noisy environments, and stronger complex-instruction following, while its Live API docs describe low-latency multimodal streaming with tool use and 70-language support.
OpenAI Developers said on March 30, 2026 that Perplexity has been running voice experiences with the Realtime API in production and published lessons from that work. The post says Perplexity now handles millions of monthly voice sessions and details how the team changed context chunking, standardized audio formats, and tuned turn-taking for noisy real-world environments.
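The audio-format lesson generalizes beyond Perplexity: real-time pipelines typically re-chunk whatever buffer sizes a client produces into fixed-duration frames before streaming. A minimal sketch of that pattern, where the sample rate, frame duration, and zero-padding policy are illustrative assumptions rather than Perplexity's actual values:

```python
# Re-chunk variable-size PCM buffers into fixed-size frames before streaming,
# one pattern behind "standardized audio formats". All constants are assumed.

SAMPLE_RATE = 24_000   # Hz, 16-bit mono PCM (assumed format)
FRAME_MS = 20          # fixed frame duration to send upstream (assumed)
BYTES_PER_FRAME = SAMPLE_RATE * 2 * FRAME_MS // 1000  # 960 bytes

def reframe(buffers):
    """Yield fixed-size frames from an iterable of variable-size PCM buffers."""
    pending = bytearray()
    for buf in buffers:
        pending.extend(buf)
        while len(pending) >= BYTES_PER_FRAME:
            yield bytes(pending[:BYTES_PER_FRAME])
            del pending[:BYTES_PER_FRAME]
    if pending:  # flush the tail, zero-padded to a full frame
        yield bytes(pending) + b"\x00" * (BYTES_PER_FRAME - len(pending))
```

Fixed frames make downstream buffering, VAD windows, and network pacing predictable regardless of how the capture device batches audio.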
Mistral promoted Voxtral TTS on X on March 26, 2026. Mistral's release post describes a 4B-parameter multilingual TTS model with nine-language support, low time-to-first-audio, availability in Mistral Studio and API, open weights on Hugging Face under CC BY-NC 4.0, and pricing at $0.016 per 1,000 characters.
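At the listed rate, per-request cost is simple arithmetic; a quick estimator (the $0.016-per-1,000-characters price is from the post, the helper name is illustrative):

```python
# Cost estimate at Voxtral TTS's listed price of $0.016 per 1,000 characters.
PRICE_PER_1K_CHARS = 0.016

def tts_cost(text: str) -> float:
    """Return the estimated USD cost of synthesizing `text`."""
    return len(text) / 1000 * PRICE_PER_1K_CHARS

cost = tts_cost("x" * 5000)  # a 5,000-character script ≈ $0.08
```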
Google DeepMind said on March 26, 2026 that Gemini 3.1 Flash Live is rolling out in preview via the Live API in Google AI Studio. Google’s blog says the model is designed for real-time voice and vision agents, improves tool triggering in noisy environments, and supports more than 90 languages for multimodal conversations.
LiveKit said on March 19, 2026 that it trained an audio model that can distinguish real user interruptions from backchannels and other noise. The company’s blog says the feature is now generally available in LiveKit Agents, delivers 86% precision and 100% recall at 500 ms of overlapping speech, and is enabled by default in current Python and TypeScript agent SDKs.
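For readers tracking numbers like "86% precision, 100% recall," the metrics reduce to counting true interruptions against segments flagged as interruptions. A toy sketch of the metric math only (the classifier itself is LiveKit's trained audio model and is not shown):

```python
# Precision/recall over per-segment predictions: did the model flag this
# overlap-speech segment as a real interruption, and was it actually one?

def precision_recall(preds, truth):
    """Compute (precision, recall) from parallel lists of booleans."""
    tp = sum(p and t for p, t in zip(preds, truth))          # true interrupts caught
    fp = sum(p and not t for p, t in zip(preds, truth))      # backchannels mis-flagged
    fn = sum(not p and t for p, t in zip(preds, truth))      # interrupts missed
    return tp / (tp + fp), tp / (tp + fn)
```

100% recall means no real interruption is missed; precision then measures how often backchannels ("uh-huh", "right") still cut the agent off.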
LiveKit said on X that xAI’s Grok text-to-speech is now available in LiveKit Inference with low-latency streaming, telephony readiness, and support for more than 20 languages. LiveKit’s docs say developers can access `xai/tts-1` through LiveKit Inference without a separate xAI API key or use the xAI plugin directly with `XAI_API_KEY`.
Together AI said on March 12, 2026 that it is launching a one-cloud stack for real-time voice agents. Its public materials describe co-located STT, LLM, and TTS infrastructure with under-500ms latency, 25+ regions, and separate kernel work that cut time-to-first-64-tokens to 77ms in a voice-agent deployment.
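The sub-500ms claim is easiest to read as a per-stage budget. A back-of-envelope sketch in which only the 77ms time-to-first-64-tokens figure comes from Together's materials; the other stage numbers are illustrative assumptions:

```python
# Latency budget for a co-located STT -> LLM -> TTS voice pipeline.
BUDGET_MS = 500
stages = {
    "stt_final_transcript": 150,  # assumed
    "llm_first_64_tokens": 77,    # Together's reported kernel-optimized figure
    "tts_first_audio": 120,       # assumed
    "network_overhead": 40,       # assumed; co-location keeps this small
}
total = sum(stages.values())      # 387 ms, inside the 500 ms target
assert total <= BUDGET_MS
```

The design point is that co-locating all three stages in one cloud removes cross-provider network hops, which is where much of a distributed pipeline's budget is otherwise spent.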
Running Nvidia PersonaPlex 7B in Swift on Apple Silicon moves local voice agents closer to real time
An HN post on a Swift/MLX port of Nvidia PersonaPlex 7B shows how chunking, buffering, and interrupt handling matter as much as raw model quality for local speech-to-speech agents.
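The interrupt-handling point can be sketched independently of Swift/MLX: whatever TTS audio the agent has buffered must be droppable the instant the user barges in, or the agent keeps talking over them. A minimal Python illustration (class and method names are hypothetical):

```python
from collections import deque

class PlaybackQueue:
    """Buffered TTS chunks awaiting playback, flushable on user interrupt."""

    def __init__(self):
        self._chunks = deque()
        self.interrupted = False

    def push(self, chunk: bytes):
        # Drop late-arriving TTS output once the user has interrupted.
        if not self.interrupted:
            self._chunks.append(chunk)

    def interrupt(self):
        """User barged in: discard everything not yet played."""
        self.interrupted = True
        self._chunks.clear()

    def next_chunk(self):
        return self._chunks.popleft() if self._chunks else None
```

The subtlety the post points at is that an in-flight model may still be emitting audio after the interrupt, so the queue must keep rejecting pushes rather than just clearing once.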
A highly upvoted LocalLLaMA thread highlighted KittenTTS v0.8, with community-shared details on 80M/40M/14M model variants, Apache-2.0 licensing, and an edge-friendly focus on local CPU inference.