LocalLLaMA developer shares Whisper silence hallucination fixes from production logs
Original: "We collected 135 phrases Whisper hallucinates during silence — here's what it says when nobody's talking and how we stopped it"
A high-engagement post in r/LocalLLaMA describes a practical failure mode many teams see only after deployment: Whisper generating fluent text when there is no speech. The author says they observed the issue across thousands of production meeting-audio hours and published a blocklist of recurring outputs.
The post reports 135 recurring English phrases, including common outro-like strings such as "Thanks for watching" and repeated loop patterns that can continue for long spans. The author argues this is a decoder behavior, not random garbage text: when audio is silent, Whisper can still produce likely completions from its training distribution.
The mitigation stack shared in the post is operationally specific:
- Silero VAD pre-gating so non-speech audio is filtered before Whisper runs (threshold 0.5, stop after 3 consecutive non-voice frames).
- Setting condition_on_previous_text=False to stop error carryover between windows.
- Using exact-string blocklists by language for known recurring hallucinations.
- Detecting repeated outputs and force-advancing timestamps on loop patterns.
- Using beam_size=1 so silence failures terminate faster than wider beam search.
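The VAD pre-gating step can be sketched in plain Python. This is a minimal illustration of the behavior described in the post, not its actual code: per-frame speech probabilities (as a Silero-style VAD emits) are scanned, a speech region opens on the first frame at or above the 0.5 threshold, and it closes once 3 consecutive frames fall below it. Silent-only audio yields no regions and therefore never reaches Whisper. The function name speech_regions and the (start, end) frame-index output format are assumptions for this sketch.

```python
def speech_regions(probs, threshold=0.5, patience=3):
    """Return (start, end) frame-index pairs of voiced regions.

    probs: per-frame voice probabilities from a VAD (hypothetical input
    format; Silero's actual API differs). A region ends after `patience`
    consecutive frames below `threshold`; `end` is exclusive.
    """
    regions = []
    start, miss = None, 0
    for i, p in enumerate(probs):
        if p >= threshold:
            if start is None:
                start = i          # open a new voiced region
            miss = 0
        elif start is not None:
            miss += 1
            if miss >= patience:   # too many silent frames: close region
                regions.append((start, i - miss + 1))
                start, miss = None, 0
    if start is not None:          # region still open at end of stream
        regions.append((start, len(probs) - miss))
    return regions

# Silence in, nothing out: no audio is forwarded to the ASR model.
print(speech_regions([0.1] * 5))  # → []
```

Only the audio inside returned regions would then be passed to Whisper, which is what removes the silent spans where hallucinations originate.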
The author also cites the FAccT 2024 "Careless Whisper" paper and highlights safety risk in domains like medical transcription, where false fluent text can be more dangerous than blanks. The linked repository includes a publicly shared hallucination list (hallucinations/en.txt), which currently shows 134 text lines plus metadata headers in the raw file.
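The blocklist and loop-detection guardrails can likewise be sketched as a post-processing pass. This is an assumed implementation, not code from the linked repository: decoder segments are matched exactly (after lowercasing and whitespace stripping) against a per-language blocklist like hallucinations/en.txt, and runs of identical consecutive segments beyond a repeat limit are treated as a decoder loop. The post force-advances timestamps on loops; for brevity this sketch simply drops the excess repeats. filter_segments and the (start, end, text) tuple format are hypothetical names.

```python
def filter_segments(segments, blocklist, max_repeats=2):
    """Drop known-hallucination segments and collapse decoder loops.

    segments: list of (start, end, text) tuples from the decoder.
    blocklist: set of known hallucinated phrases, lowercase.
    Consecutive identical texts beyond `max_repeats` are discarded.
    """
    out = []
    prev, run = None, 0
    for start, end, text in segments:
        norm = text.strip().lower()
        if norm in blocklist:      # exact-string hallucination hit
            continue
        run = run + 1 if norm == prev else 1
        prev = norm
        if run > max_repeats:      # loop pattern: drop further repeats
            continue
        out.append((start, end, text))
    return out
```

In production the blocklist would be loaded from the published list file and the repeat limit tuned per workload; exact matching keeps false positives low at the cost of missing near-duplicates.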
This is community-reported evidence rather than a controlled benchmark, but the post is useful because it translates a known model behavior into concrete production guardrails that teams can immediately test.
Community source: r/LocalLLaMA post
Referenced repo: Vexa (open-source)