Show HN: Building a Sub-500ms Latency Voice Agent from Scratch
Original: Show HN: I built a sub-500ms latency voice agent from scratch
400ms Voice AI: What It Takes
Developer Nick Tikhonov shared a Show HN project (122 upvotes) detailing how he built a voice agent that averages ~400ms end-to-end latency, measured from the moment the caller stops speaking to the first syllable of the reply. It runs a complete STT → LLM → TTS pipeline with clean barge-ins and no precomputed responses.
What Actually Moved the Needle
- Semantic End-of-Turn Detection: Voice activity detection (VAD) alone fails for natural conversation, because a pause is not always the end of a turn. You need semantic understanding of when someone is truly done speaking.
- Streaming is Non-Negotiable: A sequential pipeline, where each stage waits for the previous one to finish, is dead on arrival. STT, LLM, and TTS must all stream.
- TTFT Dominates: Groq's ~80ms time-to-first-token (TTFT) was the single biggest performance win.
- Geography Over Prompts: Colocating all components mattered more than any prompt optimization.
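The streaming point can be illustrated with a minimal sketch. It is not taken from the project: `fake_llm`, `sentences`, `fake_tts`, and `run_turn` are hypothetical stand-ins, and the token list is made up. It only shows the shape of a pipeline in which speech synthesis begins on the first complete sentence while the model is still producing later tokens.

```python
import asyncio
from typing import AsyncIterator, List, Tuple

# Hypothetical stand-in for a streaming LLM; a real one would read from a network stream.
async def fake_llm(prompt: str, events: List[str]) -> AsyncIterator[str]:
    for token in ["Sure, ", "I ", "can ", "help. ", "What ", "time?"]:
        await asyncio.sleep(0)  # yield control, as a network read would
        events.append("token")
        yield token

async def sentences(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    """Group tokens into sentence-sized chunks so synthesis can start early."""
    buf = ""
    async for tok in tokens:
        buf += tok
        if buf.rstrip().endswith((".", "?", "!")):
            yield buf.strip()
            buf = ""
    if buf.strip():
        yield buf.strip()  # flush any trailing partial sentence

# Hypothetical stand-in for streaming TTS; bytes here are just encoded text.
async def fake_tts(chunks: AsyncIterator[str]) -> AsyncIterator[bytes]:
    async for text in chunks:
        yield text.encode("utf-8")

async def run_turn(transcript: str) -> Tuple[List[bytes], List[str]]:
    """Run one turn and record the order in which tokens and audio appear."""
    events: List[str] = []
    audio: List[bytes] = []
    async for frame in fake_tts(sentences(fake_llm(transcript, events))):
        events.append("audio")
        audio.append(frame)
    return audio, events

if __name__ == "__main__":
    audio, events = asyncio.run(run_turn("hello"))
    print(audio)
    print(events)
```

In the recorded event order, the first audio frame appears after four tokens rather than after all six. That is the property a sequential pipeline lacks: time to first audio depends on the first sentence, not the full response.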
The Core Loop
The system reduces to two states, speaking and listening, and two critical transitions: cancel instantly on barge-in, and respond instantly on end-of-turn. These transitions define the entire user experience. Voice is fundamentally a turn-taking problem, not a transcription problem.
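A minimal sketch of that loop as a two-state machine follows. The names (`TurnManager`, `on_user_speech`, `on_end_of_turn`, `on_response_finished`) are illustrative assumptions, not the project's API. Actions are recorded as strings, where a real agent would cancel or start the LLM and TTS streams.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List

class State(Enum):
    LISTENING = "listening"
    SPEAKING = "speaking"

@dataclass
class TurnManager:
    """Hypothetical turn-taking controller with two states and two key transitions."""
    state: State = State.LISTENING
    actions: List[str] = field(default_factory=list)

    def on_user_speech(self) -> None:
        # Caller audio while the agent is talking is a barge-in: cancel at once.
        if self.state is State.SPEAKING:
            self.actions.append("cancel_response")
            self.state = State.LISTENING

    def on_end_of_turn(self) -> None:
        # The end-of-turn detector fired while listening: respond at once.
        if self.state is State.LISTENING:
            self.actions.append("start_response")
            self.state = State.SPEAKING

    def on_response_finished(self) -> None:
        # The agent finished speaking without interruption.
        if self.state is State.SPEAKING:
            self.state = State.LISTENING

if __name__ == "__main__":
    tm = TurnManager()
    tm.on_end_of_turn()   # caller finished, agent starts replying
    tm.on_user_speech()   # caller interrupts mid-reply
    print(tm.state, tm.actions)
```

Keeping the state machine this small means every incoming event has exactly one possible effect, which makes the two latency-critical transitions easy to reason about and test.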
Open Source
The project is available on GitHub as 'shuo'. For developers building real-time voice AI systems, this implementation offers a practical reference for achieving sub-500ms conversational latency.