#edge-ai

AI Hacker News Mar 20, 2026 2 min read

Hacker News Spots a Tiny CPU-First TTS Release: Kitten TTS v0.8

Kitten TTS v0.8 drew Hacker News attention by promising ONNX-based speech synthesis in 15M- to 80M-parameter models that run locally on CPUs, while commenters stress-tested its real-world usability.

#tts #onnx #edge-ai

LLM Mar 14, 2026 2 min read

IBM Releases Granite 4.0 1B Speech for Edge-Ready Multilingual ASR and Speech Translation

IBM unveiled Granite 4.0 1B Speech on March 9, 2026, as a compact multilingual speech-language model for ASR and bidirectional speech translation. The company says it improves English transcription accuracy over its predecessor while halving model size and adding Japanese support.

#ibm #granite #speech

LLM Mar 6, 2026 2 min read

Microsoft Research Highlights Tiny Reasoning Models for Faster On-Device AI

Microsoft Research presented new tiny language model (TLM) results focused on reasoning efficiency at edge scale. The post emphasizes BitNet-based small models, 2-bit ternary weights, and reported gains of up to 8x in speed and 4x lower memory use in selected environments.

#microsoft #tiny-language-models #edge-ai
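
As a rough illustration of what "2-bit ternary weights" can mean in storage terms, the sketch below packs weights from {-1, 0, +1} into two bits each, four per byte. The helper names and the bit layout are assumptions made for this example and are not taken from the Microsoft Research post; the post's "4x lower memory" figure may be measured against a different baseline than the 8-bit one implied here.

```python
# Hypothetical helpers for illustration only; not from the Microsoft Research post.
# Each ternary weight (-1, 0, +1) is stored as a 2-bit code, four codes per byte.

def pack_ternary(weights):
    """Pack a list of ternary weights into bytes, 4 weights per byte."""
    packed = bytearray((len(weights) + 3) // 4)
    for i, w in enumerate(weights):
        if w not in (-1, 0, 1):
            raise ValueError(f"not a ternary weight: {w}")
        code = w + 1  # map -1, 0, +1 to the codes 0, 1, 2
        packed[i // 4] |= code << (2 * (i % 4))
    return bytes(packed)


def unpack_ternary(packed, count):
    """Recover the first `count` ternary weights from packed bytes."""
    return [((packed[i // 4] >> (2 * (i % 4))) & 0b11) - 1 for i in range(count)]


weights = [1, -1, 0, 0, 1, 1, -1, 0]
packed = pack_ternary(weights)
restored = unpack_ternary(packed, len(weights))
```

Here eight weights occupy two bytes, compared with eight bytes if each were stored as an 8-bit integer, which is one way a 4x memory reduction could arise.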

AI Hacker News Feb 25, 2026 2 min read

Moonshine Open-Weights STT Gains Traction on HN with Whisper-Large-v3 Claims

A Show HN post spotlighted Moonshine Voice, an open-source speech toolkit claiming strong accuracy and latency across edge and desktop devices. The project positions itself as a practical alternative to larger Whisper deployments for real-time voice apps.

#speech-recognition #asr #edge-ai

AI Reddit Feb 22, 2026 1 min read

Taalas: Etching LLM Weights Directly into Silicon Achieves 16,000 Tokens/Second

Startup Taalas is taking a radical approach to AI inference: etching LLM model weights and architecture directly into a silicon chip. Their Llama 3.1 8B demo achieves 16,000 tokens per second — but the approach bets that model architectures won't change.

#ai-hardware #silicon #llm

AI Hacker News Feb 22, 2026 1 min read

zclaw: A Personal AI Assistant in Under 888 KB, Running on an ESP32

zclaw is an open-source personal AI assistant that fits in under 888 KB and runs on an ESP32 microcontroller. Part of the emerging Claw ecosystem, it demonstrates how far edge AI has come.

#esp32 #embedded #ai-assistant

AI Reddit Feb 21, 2026 2 min read

Reddit Spotlights KittenTTS v0.8: Open Tiny TTS Stack Aimed at CPU and Edge Deployment

A high-upvote LocalLLaMA thread highlighted KittenTTS v0.8, with community-shared details on 80M/40M/14M model variants, Apache-2.0 licensing, and an edge-friendly focus on local CPU inference.

#tts #edge-ai #open-source

LLM Reddit Feb 20, 2026 2 min read

LocalLLaMA spotlights Kitten TTS v0.8 for compact on-device speech

A widely discussed LocalLLaMA post introduces open Kitten TTS v0.8 models (80M/40M/14M), emphasizing CPU-friendly deployment and sub-25MB footprint for the smallest variant.

#tts #localllama #edge-ai

AI Reddit Feb 18, 2026 1 min read

Reddit ML report: same INT8 ONNX model showed major accuracy drift across Snapdragon tiers

An r/MachineLearning discussion reported that a single INT8 ONNX model showed large on-device accuracy differences across five Snapdragon chipsets, ranging from 91.8% down to 71.2%, despite identical weights and export settings.

#edge-ai #quantization #snapdragon