Hacker News Spots Kitten TTS Pushing 25 MB-to-80 MB CPU-First Speech Models

Original: Show HN: Three new Kitten TTS models – smallest less than 25MB

AI · Mar 20, 2026 · By Insights AI (HN) · 2 min read · Source

Small speech models were the whole pitch

On March 19, 2026, a Hacker News thread about Kitten TTS had reached 512 points and 172 comments at crawl time. The linked project is an Apache-2.0 open-source text-to-speech library that aims to make local speech synthesis practical on CPUs, not only on GPUs or behind cloud APIs. That framing mattered on Hacker News because the value proposition is operational rather than academic: small artifacts, a standard Python install, an ONNX runtime, and a path to running on Raspberry Pi boards, low-end phones, browsers, or wearables.

The HN submission text says the release adds three models at roughly 80M, 40M, and 14M parameters, with the smallest variant staying under 25 MB when quantized. The GitHub README labels the current release as Kitten TTS v0.8 and lists model variants from 15M to 80M parameters, eight built-in English voices, adjustable speech speed, built-in text preprocessing, and 24 kHz output. The maintainers also say the models are CPU-optimized and distributed through Hugging Face, with one-line Python usage built around a KittenTTS class.
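The claim that a 14M-parameter model stays under 25 MB when quantized is easy to sanity-check with back-of-the-envelope arithmetic: int8 quantization stores roughly one byte per parameter, plus overhead for quantization scales, any layers kept in higher precision, and ONNX graph metadata. The sketch below is not the project's actual packaging math; the `bytes_per_param` and `overhead` figures are assumptions for illustration.

```python
def approx_quantized_size_mb(params_millions: float,
                             bytes_per_param: float = 1.0,
                             overhead: float = 1.2) -> float:
    """Rough on-disk size in MB for a quantized model.

    Assumes int8 weights (one byte per parameter) and a flat 20%
    overhead for scales, higher-precision layers, and graph metadata.
    """
    return params_millions * 1e6 * bytes_per_param * overhead / 1e6

# The three sizes quoted in the HN submission text.
for params in (80, 40, 14):
    print(f"{params}M params -> ~{approx_quantized_size_mb(params):.0f} MB at int8")
```

Even with a generous overhead factor, the 14M model lands well under the 25 MB ceiling, while the 80M variant would be closer to 100 MB at int8, which is consistent with the smallest model being the one singled out for the size claim.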

Why the post landed

The interesting part is not that lightweight TTS exists. It is that the project is pushing for a deployable middle ground between toy embedded demos and cloud-sized voice systems. Most speech stacks still ask teams to make a difficult trade-off: accept weak local quality, or send audio generation back to a server. Kitten TTS is explicitly arguing that a third option is becoming viable for English voice agents, kiosk systems, accessibility tools, and consumer devices where latency, privacy, or offline behavior matter more than absolute studio quality.

There are still limits. The README marks the project as a developer preview, notes reported issues with the smallest int8 model, and says multilingual support is still on the roadmap. So the more accurate interpretation is not "problem solved" but "the packaging and size profile are now credible enough to trigger serious experimentation." That is exactly the kind of edge-AI milestone Hacker News tends to reward: a release that turns a previously awkward deployment story into something engineers can actually try this week.


© 2026 Insights. All rights reserved.