#llm

LLM Reddit Apr 11, 2026 2 min read

Dante-2B pitches an Italian-first open model instead of an English-first fine-tune

A developer on r/MachineLearning shared phase-one details for Dante-2B, a 2.1B Italian/English model trained from scratch with a tokenizer tuned for Italian morphology and token efficiency.

#llm #tokenizer #multilingual

LLM Hacker News Apr 8, 2026 2 min read

Hacker News Tracks Claude Mythos Preview's Cybersecurity Leap

Anthropic's April 7, 2026 security write-up for Claude Mythos Preview argues that frontier LLM gains are now translating into real exploit-development capability. Hacker News is treating the post as a sign that defensive tooling and offensive risk are accelerating together.

#anthropic #cybersecurity #llm

LLM Hacker News Apr 7, 2026 2 min read

GuppyLM Turns LLM Training into a Readable 8.7M-Parameter Show HN Project

A recent Show HN post highlighted GuppyLM, a tiny education-first language model trained on 60K synthetic conversations with a deliberately simple transformer stack. The project stands out because readers can inspect and run the whole pipeline in Colab or directly in the browser.

#llm #education #pytorch

LLM Reddit Apr 6, 2026 2 min read

LocalLLaMA Showcases PokeClaw, a Fully On-Device Gemma 4 Agent for Android

A LocalLLaMA post drew attention to PokeClaw, an open-source Android prototype that runs Gemma 4 locally through LiteRT-LM and lets the model tap, swipe, type, open apps, send messages, and manage auto-replies without cloud inference.

#llm #android #gemma

LLM Hacker News Apr 6, 2026 2 min read

Hacker News Highlights Nanocode, a JAX/TPU Path to Train a Claude Code-Style Model for About $200

HN picked up Nanocode, an open JAX project that packages tokenizer training, pretraining, synthetic data generation, agentic SFT, and DPO into an end-to-end recipe for building a coding model on TPU infrastructure.

#llm #jax #tpu

LLM Hacker News Apr 6, 2026 2 min read

Hacker News Spots Gemma Gem, a Browser-Embedded Agent That Runs Gemma 4 With No Cloud

A Show HN thread highlighted Gemma Gem, a Chrome extension that runs Gemma 4 locally via WebGPU and exposes page-reading, clicking, typing, scrolling, screenshot, and JavaScript tools without API keys or server-side inference.

#llm #gemma #webgpu

LLM Reddit Apr 6, 2026 2 min read

Reddit showcases Parlor, a real-time local voice-and-vision assistant powered by Gemma 4 E2B

A LocalLLaMA demo pointed to Parlor, which runs speech and vision understanding with Gemma 4 E2B and uses Kokoro for text-to-speech, all on-device. The README reports roughly 2.5 to 3.0 seconds of end-to-end latency and a decode speed of about 83 tokens/sec on an Apple M3 Pro.

#llm #multimodal #edge-ai

LLM Reddit Apr 6, 2026 2 min read

LocalLLaMA digs into Gemma 4 Per-Layer Embeddings and why the small models behave differently

A LocalLLaMA explainer argues that Gemma 4 E2B/E4B gain their efficiency from Per-Layer Embeddings. The key point is that many of those parameters behave more like large token lookup tables than always-active compute-heavy layers, which changes the inference trade-off.

#llm #gemma #inference

LLM Hacker News Apr 6, 2026 2 min read

Hacker News spots GuppyLM, an 8.7M-parameter teaching LLM you can train in minutes

A Show HN thread highlighted GuppyLM, a tiny 8.7M-parameter transformer with a 60K synthetic conversation dataset and Colab notebooks. The point is not state-of-the-art performance, but making the full LLM pipeline inspectable from data generation to inference.

#llm #education #open-source

LLM Twitter Apr 5, 2026 1 min read

Together Research says LLMs can repair bad database query plans

Together Research says LLMs can patch faulty database query plans instead of regenerating them from scratch, and claims up to 4.78x speedups on some TPC-H and TPC-DS workloads. The tweet points to DBPlanBench, a DataFusion-based harness that exposes a physical operator graph to an LLM and uses iterative search to refine plan edits.

#together-ai #dbplanbench #query-optimization

LLM Reddit Apr 5, 2026 2 min read

LocalLLaMA debates Gemma 4 31B's surprising FoodTruck Bench result

A LocalLLaMA thread highlighted Gemma 4 31B's unexpectedly strong FoodTruck Bench showing, and the discussion quickly turned to long-horizon planning quality and benchmark reliability.

#llm #gemma #benchmarks

LLM Hacker News Apr 5, 2026 2 min read

HN discusses Anthropic's claim that emotion concepts inside an LLM can shape behavior

Anthropic's new interpretability paper argues that emotion-related internal representations in Claude Sonnet 4.5 causally shape behavior, especially under stress.

#llm #interpretability #anthropic