A developer on r/MachineLearning shared phase-one details for Dante-2B, a 2.1B Italian/English model trained from scratch with a tokenizer tuned for Italian morphology and token efficiency.
#llm
Anthropic's April 7, 2026 security write-up for Claude Mythos Preview argues that frontier LLM gains are now translating into real exploit-development capability. Hacker News is treating the post as a sign that defensive tooling and offensive risk are accelerating together.
A recent Show HN post highlighted GuppyLM, a tiny education-first language model trained on 60K synthetic conversations with a deliberately simple transformer stack. The project stands out because readers can inspect and run the whole pipeline in Colab or directly in the browser.
A LocalLLaMA post drew attention to PokeClaw, an open-source Android prototype that runs Gemma 4 locally through LiteRT-LM and lets the model tap, swipe, type, open apps, send messages, and manage auto-replies without cloud inference.
HN picked up Nanocode, an open JAX project that packages tokenizer training, pretraining, synthetic data generation, agentic SFT, and DPO into an end-to-end recipe for building a coding model on TPU infrastructure.
A Show HN thread highlighted Gemma Gem, a Chrome extension that runs Gemma 4 locally via WebGPU and exposes page-reading, clicking, typing, scrolling, screenshot, and JavaScript tools without API keys or server-side inference.
A LocalLLaMA demo pointed to Parlor, which runs speech and vision understanding with Gemma 4 E2B and uses Kokoro for text-to-speech, all on-device. The README reports roughly 2.5-3.0 seconds end-to-end latency and about 83 tokens/sec decode speed on an Apple M3 Pro.
A LocalLLaMA explainer argues that Gemma 4 E2B/E4B gain their efficiency from Per-Layer Embeddings. The key point is that many of those parameters behave more like large token lookup tables than always-active compute-heavy layers, which changes the inference trade-off.
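A minimal sketch of that per-layer-embedding idea, assuming the general mechanism rather than Gemma's actual implementation: each layer owns its own embedding table indexed by token id, and the gathered vector is folded into the hidden state. Because those tables are only ever read via gathers, they act like lookup storage instead of always-active matmul weights. All class and parameter names below are illustrative.

```python
# Hypothetical sketch of a transformer block with a per-layer embedding table.
# Not Gemma code; the names and mixing scheme are assumptions for illustration.
import torch
import torch.nn as nn

class PerLayerEmbeddingBlock(nn.Module):
    def __init__(self, vocab_size: int, hidden_dim: int, ple_dim: int):
        super().__init__()
        # Per-layer table: large in parameter count, but only the gathered
        # rows for the current tokens are ever touched at inference time.
        self.ple_table = nn.Embedding(vocab_size, ple_dim)
        self.ple_proj = nn.Linear(ple_dim, hidden_dim, bias=False)
        # Stand-in for the layer's ordinary compute (attention omitted).
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, 4 * hidden_dim),
            nn.GELU(),
            nn.Linear(4 * hidden_dim, hidden_dim),
        )

    def forward(self, hidden: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
        # Gather each token's per-layer embedding and mix it into the stream.
        ple = self.ple_proj(self.ple_table(token_ids))
        return hidden + self.mlp(hidden + ple)

# Usage: token_ids of shape (batch, seq), hidden of shape (batch, seq, hidden_dim).
block = PerLayerEmbeddingBlock(vocab_size=32_000, hidden_dim=512, ple_dim=256)
out = block(torch.zeros(1, 8, 512), torch.randint(0, 32_000, (1, 8)))
```

The trade-off the explainer points at follows from the gather: lookup-table parameters are memory-bound and per-token, so they don't add the per-token FLOPs that an always-active dense layer of the same size would.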
A Show HN thread highlighted GuppyLM, a tiny 8.7M-parameter transformer with a 60K synthetic conversation dataset and Colab notebooks. The point is not state-of-the-art performance, but making the full LLM pipeline inspectable from data generation to inference.
Together Research says LLMs can patch faulty database query plans instead of regenerating them from scratch, and claims up to 4.78x speedups on some TPC-H and TPC-DS workloads. The tweet points to DBPlanBench, a DataFusion-based harness that exposes a physical operator graph to an LLM and uses iterative search to refine plan edits.
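A hedged sketch of that iterative plan-patching loop, under heavy assumptions: the plan representation, the propose_edits() model call, and measure_latency() below are toy stand-ins, not the DBPlanBench or DataFusion APIs.

```python
# Illustrative loop: an LLM-like proposer suggests small edits to a physical
# operator graph, each candidate is re-measured, and improving edits are kept.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class JoinNode:
    left: str
    right: str
    algorithm: str  # e.g. "hash_join" or "sort_merge_join"

def measure_latency(plan: list[JoinNode]) -> float:
    """Stand-in for executing the physical plan and timing it."""
    # Toy cost model: pretend hash joins are cheaper on this workload.
    return sum(1.0 if n.algorithm == "hash_join" else 2.5 for n in plan)

def propose_edits(plan: list[JoinNode]) -> list[list[JoinNode]]:
    """Stand-in for the LLM: suggest small, local edits to the operator graph."""
    candidates = []
    for i, node in enumerate(plan):
        flipped = "hash_join" if node.algorithm != "hash_join" else "sort_merge_join"
        candidates.append(plan[:i] + [replace(node, algorithm=flipped)] + plan[i + 1:])
    return candidates

def patch_plan(plan: list[JoinNode], rounds: int = 3) -> list[JoinNode]:
    best, best_latency = plan, measure_latency(plan)
    for _ in range(rounds):
        improved = False
        for candidate in propose_edits(best):
            latency = measure_latency(candidate)
            if latency < best_latency:  # keep only edits that actually help
                best, best_latency, improved = candidate, latency, True
        if not improved:
            break
    return best

initial = [JoinNode("orders", "lineitem", "sort_merge_join"),
           JoinNode("customer", "orders", "hash_join")]
print(measure_latency(initial), measure_latency(patch_plan(initial)))
```

The point of patching rather than regenerating is that each edit stays local to the existing operator graph, so the search only has to validate small deltas instead of a whole new plan.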
A LocalLLaMA thread highlighted Gemma 4 31B's unexpectedly strong FoodTruck Bench showing, and the discussion quickly turned to long-horizon planning quality and benchmark reliability.
Anthropic's new interpretability paper argues that emotion-related internal representations in Claude Sonnet 4.5 causally shape behavior, especially under stress.