A Docker guide on running NanoClaw inside a Shell Sandbox reached 102 points on Hacker News, highlighting a practical pattern for isolating the agent's runtime, limiting filesystem exposure, and keeping API keys out of the guest environment.
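The pattern described above can be sketched with standard Docker flags. This is a minimal illustration, not the guide's actual invocation: the image name `nanoclaw/agent:latest`, the mount path, and the host-side key-injecting proxy are all hypothetical.

```shell
# Locked-down container for an agent runtime:
# - read-only root filesystem, dropped capabilities, no privilege escalation
# - only the project workspace is mounted, and only read-only
# - no API key is passed into the guest; instead the container is pointed
#   at a proxy on the host that attaches the credential to outbound requests
docker run --rm \
  --read-only \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --tmpfs /tmp \
  -v "$PWD/workspace:/workspace:ro" \
  -e API_BASE_URL=http://host.docker.internal:8080 \
  nanoclaw/agent:latest
```

The key point is the last two flags: the secret never enters the container's environment, so even a fully compromised agent process cannot exfiltrate it.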
LLM
A high-ranking r/singularity post shared Google’s Gemini 3 Deep Think update. The announcement includes benchmark claims such as 48.4% on Humanity’s Last Exam (without tools), 84.6% on ARC-AGI-2, and Codeforces Elo 3455, plus Gemini API early access.
A Hacker News post highlighted the SkillsBench paper, which evaluates agent skills across 86 tasks and 11 domains. Curated skills improved average pass rate substantially, while self-generated skills showed no average gain.
Google DeepMind announced Gemma Scope 2, extending open interpretability tooling to the full Gemma 3 family from 270M to 27B parameters. The company says the release involved roughly 110 petabytes of stored data and over 1 trillion total trained parameters.
A high-engagement r/LocalLLaMA thread tracked the MiniMax-M2.5 release on Hugging Face. The model card emphasizes agentic coding/search benchmarks, runtime speedups, and aggressive cost positioning.
A Show HN post introduced Off Grid, an open-source Android/iOS app that runs chat, image generation, vision, and speech transcription entirely on-device, with no cloud data transfer.
OpenAI reports that, across more than one million ChatGPT conversations, the share of difficult interactions exceeding a human baseline increased roughly fourfold from September 2024 to January 2026. The company also shows large gains in case-interview and puzzle-style open tasks.
A Reddit thread amplified an Ars Technica report that Google detected a 100,000+ prompt extraction campaign against Gemini, reopening questions about distillation, defense, and IP boundaries.
A widely discussed Hacker News post compares Anthropic and OpenAI fast modes and argues that LLM speed gains are increasingly driven by serving architecture, not just model quality.
NIST’s CAISI released draft guidance NIST AI 800-2 for automated language-model benchmark evaluations and opened comments through March 31, 2026. The draft focuses on objective setting, execution methodology, and analysis/reporting quality.
OpenAI said on January 29, 2026 that ChatGPT would stop offering GPT-4o and older model options from February 13, 2026. GPT-4o, GPT-4.5, and o4-mini are being replaced by GPT-5, GPT-5 thinking, and o5-mini respectively.
A high-signal r/LocalLLaMA post (score 456, 84 comments) introduced KaniTTS2, an open-source 400M-parameter TTS model with voice cloning that runs in 3GB of VRAM, with pretraining code included. The discussion highlights practical checks to run before adoption.