#llm

LLM Hacker News Apr 7, 2026 2 min read

GuppyLM Turns LLM Training into a Readable 8.7M-Parameter Show HN Project

A recent Show HN post highlighted GuppyLM, a tiny education-first language model trained on 60K synthetic conversations with a deliberately simple transformer stack. The project stands out because readers can inspect and run the whole pipeline in Colab or directly in the browser.

#llm #education #pytorch

LLM Reddit Apr 6, 2026 2 min read

LocalLLaMA Showcases PokeClaw, a Fully On-Device Gemma 4 Agent for Android

A LocalLLaMA post drew attention to PokeClaw, an open-source Android prototype that runs Gemma 4 locally through LiteRT-LM and lets the model tap, swipe, type, open apps, send messages, and manage auto-replies without cloud inference.

#llm #android #gemma

LLM Hacker News Apr 6, 2026 2 min read

Hacker News Highlights Nanocode, a JAX/TPU Path to Train a Claude Code-Style Model for About $200

HN picked up Nanocode, an open JAX project that packages tokenizer training, pretraining, synthetic data generation, agentic SFT, and DPO into an end-to-end recipe for building a coding model on TPU infrastructure.

#llm #jax #tpu

LLM Hacker News Apr 6, 2026 2 min read

Hacker News Spots Gemma Gem, a Browser-Embedded Agent That Runs Gemma 4 With No Cloud

A Show HN thread highlighted Gemma Gem, a Chrome extension that runs Gemma 4 locally via WebGPU and exposes page-reading, clicking, typing, scrolling, screenshot, and JavaScript tools without API keys or server-side inference.

#llm #gemma #webgpu

LLM Reddit Apr 6, 2026 2 min read

Reddit showcases Parlor, a real-time local voice-and-vision assistant powered by Gemma 4 E2B

A LocalLLaMA demo pointed to Parlor, which runs speech and vision understanding with Gemma 4 E2B and uses Kokoro for text-to-speech, all on-device. The README reports roughly 2.5 to 3.0 seconds of end-to-end latency and about 83 tokens/sec decode speed on an Apple M3 Pro.

#llm #multimodal #edge-ai

LLM Reddit Apr 6, 2026 2 min read

LocalLLaMA digs into Gemma 4 Per-Layer Embeddings and why the small models behave differently

A LocalLLaMA explainer argues that Gemma 4 E2B/E4B gain their efficiency from Per-Layer Embeddings. The key point is that many of those parameters behave more like large token lookup tables than always-active compute-heavy layers, which changes the inference trade-off.

#llm #gemma #inference

LLM X/Twitter Apr 5, 2026 1 min read

Together Research says LLMs can repair bad database query plans

Together Research says LLMs can patch faulty database query plans instead of regenerating them from scratch, and claims up to 4.78x speedups on some TPC-H and TPC-DS workloads. The tweet points to DBPlanBench, a DataFusion-based harness that exposes a physical operator graph to an LLM and uses iterative search to refine plan edits.

#together-ai #dbplanbench #query-optimization

LLM Reddit Apr 5, 2026 2 min read

LocalLLaMA debates Gemma 4 31B's surprising FoodTruck Bench result

A LocalLLaMA thread highlighted Gemma 4 31B's unexpectedly strong FoodTruck Bench showing, and the discussion quickly turned to long-horizon planning quality and benchmark reliability.

#llm #gemma #benchmarks

LLM Hacker News Apr 5, 2026 2 min read

HN discusses Anthropic's claim that emotion concepts inside an LLM can shape behavior

Anthropic's new interpretability paper argues that emotion-related internal representations in Claude Sonnet 4.5 causally shape behavior, especially under stress.

#llm #interpretability #anthropic

LLM Hacker News Apr 5, 2026 2 min read

HN thread spotlights a simple self-distillation recipe for stronger code generation

A high-ranking Hacker News thread amplified Apple's paper on simple self-distillation for code generation, a training recipe that improves pass@1 without verifier models or reinforcement learning.

#llm #code-generation #self-distillation

LLM Reddit Apr 3, 2026 2 min read

Reddit Spotlights Stanford's Open CS25 Transformers Course for Spring 2026

Stanford's public CS25 course is again operating as an open lecture stream for Transformer research, with Zoom access, recordings, and a community layer that extends beyond campus.

#transformers #stanford #education

LLM Hacker News Apr 3, 2026 1 min read

Hacker News Highlights Lemonade as a Local AI Server for GPUs and NPUs

Lemonade packages local AI inference behind an OpenAI-compatible server that targets GPUs and NPUs, aiming to make open models easier to deploy on everyday PCs.

#local-ai #llm #gpu