Hacker News Spots a Tiny CPU-First TTS Release: Kitten TTS v0.8
Original: Show HN: Three new Kitten TTS models – smallest less than 25MB
Hacker News surfaced a small but highly practical AI project on March 19, 2026: the Show HN post for Kitten TTS v0.8. The thread had reached 308 points and 104 comments when this summary was compiled. The repository presents Kitten TTS as an ONNX-based, CPU-friendly text-to-speech library with 15M, 40M, and 80M parameter models, covering roughly 25 MB to 80 MB on disk and producing 24 kHz audio.
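To put the size figures in perspective, the following is a rough back-of-the-envelope sketch, not a statement about how the models are actually stored. It assumes the smallest (15M) model corresponds to the ~25 MB figure and the largest (80M) to the ~80 MB figure, treats 1 MB as 10^6 bytes, and ignores any non-weight data in the files. The function names are illustrative only.

```python
# Rough storage arithmetic for the figures quoted above.
# ASSUMPTION: 15M params <-> ~25 MB and 80M params <-> ~80 MB; the source
# only gives the overall range, not a per-model size table.
MODELS = {
    "15M": (15_000_000, 25.0),  # (parameter count, approx. size in MB)
    "80M": (80_000_000, 80.0),
}


def bytes_per_param(params: int, size_mb: float) -> float:
    """Average on-disk bytes per parameter, with 1 MB taken as 10**6 bytes."""
    return size_mb * 1_000_000 / params


def fp32_size_mb(params: int) -> float:
    """Size in MB if every parameter were stored as a 4-byte float."""
    return params * 4 / 1_000_000


for name, (params, size_mb) in MODELS.items():
    print(
        f"{name}: {bytes_per_param(params, size_mb):.2f} bytes/param on disk, "
        f"{fp32_size_mb(params):.0f} MB if stored as fp32"
    )
```

Under these assumptions both models come in well below 4 bytes per parameter, which would be consistent with reduced-precision or quantized weights, although the source does not say which storage format is used.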
What made the post resonate is the gap it tries to fill. A lot of speech tooling is cloud-dependent, GPU-oriented, or simply too heavy for simple local use. Kitten TTS instead emphasizes small binaries, offline use, and a simple Python API. The project ships with eight built-in voices and exposes speed control, which makes it interesting for edge deployment, local assistants, and lightweight desktop applications.
What the HN discussion surfaced
- Users were impressed by the quality-per-size ratio; one commenter said the 80M model ran at about 1.5x realtime on an Intel 9700 CPU.
- Others immediately tested edge cases such as numbers and units and probed voice naturalness, with several comments asking for better pronunciation and more neutral voices.
- There were also practical concerns about packaging, because some installs pulled large dependencies that felt out of step with the project's "tiny" positioning.
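For readers who want to check a speed claim like the "1.5x realtime" figure on their own hardware, here is a minimal sketch of the measurement. It assumes the commenter meant seconds of audio produced per second of compute (some benchmarks report the inverse ratio instead), and it takes the 24 kHz sample rate from the repository description. The `synthesize` callable is a hypothetical stand-in for whatever TTS call is being timed; it is not the Kitten TTS API.

```python
import time
from typing import Callable, Sequence

SAMPLE_RATE_HZ = 24_000  # output rate stated in the repository description


def realtime_speedup(num_samples: int, elapsed_s: float,
                     sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Seconds of audio produced per second of wall-clock compute."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    audio_s = num_samples / sample_rate_hz
    return audio_s / elapsed_s


def time_synthesis(synthesize: Callable[[str], Sequence[float]],
                   text: str) -> float:
    """Time one call to a caller-supplied synthesize(text) function.

    `synthesize` is hypothetical: any callable that returns a sequence
    of mono audio samples at SAMPLE_RATE_HZ.
    """
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return realtime_speedup(len(samples), elapsed)


# 36,000 samples at 24 kHz is 1.5 s of audio; produced in 1.0 s -> 1.5
print(realtime_speedup(36_000, 1.0))
```

A value above 1.0 means audio is generated faster than it plays back. A single timed call is noisy, so averaging several runs after a warm-up call gives a fairer number.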
The thread also raised a healthy question that shows how the open-source TTS market has matured: what data trained the voices, and what licensing or provenance guarantees come with that data? For small local models, deployment convenience is only part of the evaluation. Users increasingly want clarity on dataset sourcing, language coverage, and whether a project is ready for serious production work.
Kitten TTS is still marked as a developer preview, so the HN reaction should be read as strong early interest rather than final validation. Even so, the post highlights a real demand for compact speech models that can run on ordinary CPUs and still sound good enough to be useful outside a benchmark demo.