Show HN: Building a Sub-500ms Latency Voice Agent from Scratch
Original: Show HN: I built a sub-500ms latency voice agent from scratch
400ms Voice AI: What It Takes
Developer Nick Tikhonov shared a Show HN project (122 upvotes) detailing how he built a voice agent averaging ~400ms end-to-end latency, measured from the moment the caller stops speaking to the agent's first syllable. It runs a complete STT → LLM → TTS pipeline, handles barge-ins cleanly, and uses no precomputed responses.
What Actually Moved the Needle
- Semantic End-of-Turn Detection: Voice activity detection (VAD) alone fails for natural conversation. You need semantic understanding of when someone is truly done speaking.
- Streaming is Non-Negotiable: Sequential pipelines are dead on arrival. STT → LLM → TTS must all stream.
- TTFT Dominates: Groq's ~80ms time-to-first-token (TTFT) was the single biggest performance win.
- Geography Over Prompts: Colocating all components mattered more than any prompt optimization
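To make the streaming point concrete, here is a minimal sketch of chaining stages as async generators so that text-to-speech can start on the first sentence while the LLM is still generating. It is not taken from the project: `fake_llm_tokens`, `sentence_chunks`, `fake_tts`, and `run_pipeline` are hypothetical stand-ins, and sentence-boundary chunking is only one assumed way of deciding when to hand text to TTS.

```python
import asyncio
from typing import AsyncIterator


async def fake_llm_tokens(prompt: str) -> AsyncIterator[str]:
    # Hypothetical stand-in for a streaming LLM client; yields one token at a time.
    for token in ["Sure", ",", " I", " can", " help", ".", " What", " time", "?"]:
        await asyncio.sleep(0.01)  # simulated per-token delay
        yield token


async def sentence_chunks(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    # Forward text as soon as a sentence boundary appears,
    # instead of waiting for the full completion.
    buffer = ""
    async for token in tokens:
        buffer += token
        if buffer.rstrip().endswith((".", "?", "!")):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():
        yield buffer.strip()  # flush any trailing text without punctuation


async def fake_tts(chunks: AsyncIterator[str]) -> AsyncIterator[bytes]:
    # Hypothetical stand-in for a streaming TTS client; one "audio frame" per chunk.
    async for text in chunks:
        yield text.encode("utf-8")


async def run_pipeline(prompt: str) -> list[bytes]:
    frames = []
    async for frame in fake_tts(sentence_chunks(fake_llm_tokens(prompt))):
        frames.append(frame)  # a real agent would write this to the call audio
    return frames


if __name__ == "__main__":
    print(asyncio.run(run_pipeline("hi")))
```

Because each stage consumes its input lazily, the first frame is produced after the first sentence's tokens arrive, not after the whole reply. A sequential pipeline would wait for every token before synthesizing anything.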
The Core Loop
The system reduces to two states — speaking vs. listening — and two critical transitions: cancel instantly on barge-in, respond instantly on end-of-turn. These transitions define the entire user experience. Voice is fundamentally a turn-taking problem, not a transcription problem.
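The two states and two transitions described above can be sketched as a small asyncio state machine. This is an illustrative assumption, not the project's code: `TurnLoop`, `on_end_of_turn`, `on_barge_in`, and the `respond` callback are hypothetical names, and the end-of-turn and barge-in signals are assumed to come from detectors that are not shown.

```python
import asyncio
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()
    SPEAKING = auto()


class TurnLoop:
    """Hypothetical two-state turn-taking loop (illustrative only)."""

    def __init__(self, respond):
        # respond: coroutine function that produces the agent's reply (placeholder).
        self.respond = respond
        self.state = State.LISTENING
        self.reply_task = None

    async def on_end_of_turn(self, transcript):
        # LISTENING -> SPEAKING: start the reply immediately.
        if self.state is State.LISTENING:
            self.state = State.SPEAKING
            self.reply_task = asyncio.create_task(self._run_reply(transcript))

    async def on_barge_in(self):
        # SPEAKING -> LISTENING: cancel the in-flight reply right away.
        if self.state is State.SPEAKING and self.reply_task is not None:
            self.reply_task.cancel()
            try:
                await self.reply_task
            except asyncio.CancelledError:
                pass  # expected: the reply was interrupted by the caller
        self.state = State.LISTENING

    async def _run_reply(self, transcript):
        try:
            await self.respond(transcript)
        finally:
            # Whether the reply finished or was cancelled, go back to listening.
            self.state = State.LISTENING


async def demo():
    spoken = []

    async def fake_respond(transcript):
        # Placeholder reply that "speaks" one word every 100 ms.
        for word in ["one", "two", "three"]:
            spoken.append(word)
            await asyncio.sleep(0.1)

    loop = TurnLoop(fake_respond)
    await loop.on_end_of_turn("hello")
    await asyncio.sleep(0.15)   # caller interrupts partway through the reply
    await loop.on_barge_in()
    return loop, spoken


if __name__ == "__main__":
    loop, spoken = asyncio.run(demo())
    print(loop.state, spoken)
```

Keeping the reply in a single cancellable task is one way to make barge-in a single operation: cancelling the task stops whatever generation and playback are running inside it.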
Open Source
The project is available on GitHub as 'shuo'. For developers building real-time voice AI systems, this implementation offers a practical, battle-tested reference for achieving sub-500ms conversational latency.