The Reddit thread zeroed in on a hard lesson for AI-written kernels: verifier success can miss optimizer- and data-dependent numerical failures.
#training
The post promised a zero-state optimizer with low VRAM overhead, and r/MachineLearning answered the way that community usually does: show the rule, show more seeds, and bring harder tasks.
Why it matters: model launches live or die on serving and training support, not just weights. LMSYS says its Day-0 stack reached 199 tok/s on B200 and 266 tok/s on H200, while staying strong out to 900K context.
GitHub said that starting April 24, 2026, interaction data from Copilot Free, Pro, and Pro+ users will be used to train and improve AI models unless users opt out. Business and Enterprise plans are excluded, but the change materially expands how individual-tier Copilot usage can feed back into model development.
A March 17, 2026 r/MachineLearning post about Clip to Grok reached 56 points and 20 comments at crawl time. The authors report that per-row L2 clipping after each optimizer step cut grokking delay by 18x to 66x on modular arithmetic benchmarks.
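For readers unfamiliar with the idea, the following is a minimal sketch of what "per-row L2 clipping after each optimizer step" could look like. It is not the authors' code: the function names, the plain SGD update, the learning rate, and the `max_norm` threshold are all assumptions made for illustration, and the weights are plain Python lists rather than framework tensors.

```python
import math


def clip_rows_l2(weights, max_norm=1.0):
    """Return a copy of `weights` where every row has L2 norm <= max_norm.

    Rows already within the limit are left unchanged; longer rows are
    rescaled onto the ball of radius max_norm. (Hypothetical helper.)
    """
    clipped = []
    for row in weights:
        norm = math.sqrt(sum(x * x for x in row))
        scale = max_norm / norm if norm > max_norm else 1.0
        clipped.append([x * scale for x in row])
    return clipped


def sgd_step_then_clip(weights, grads, lr=0.1, max_norm=1.0):
    """One plain SGD update followed by per-row clipping of the weights.

    SGD is an assumption here; the summary does not say which optimizer
    the authors used.
    """
    updated = [
        [w - lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, grads)
    ]
    return clip_rows_l2(updated, max_norm)
```

Note that the clipping is applied to the weights themselves after the update, which is different from gradient clipping, where the gradients are rescaled before the update.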
A March 19, 2026 Hacker News post about NanoGPT Slowrun reached 162 points and 43 comments at crawl time. Q Labs says an ensemble of 1.8B-parameter models trained on 100M tokens matched a baseline that would normally require 1B tokens.
Q Labs says 100M tokens and an 18B-parameter ensemble can match a 1B-token baseline, and Hacker News immediately focused on whether that gain survives serving and deployment.
SkyPilot says Claude Code ran about 910 autoresearch experiments in 8 hours, and Hacker News focused on whether the real breakthrough was agent strategy, infrastructure, or both.
Google introduced AI Works for Europe, adding $30 million to the Google.org European AI Opportunity Fund and expanding AI training resources. The initiative combines worker training, university partnerships, and a new certificate rollout in ten European languages.
A March 15, 2026 r/MachineLearning post introduced preflight, a new PyTorch-oriented CLI that runs 10 pre-training checks such as label leakage, NaN detection, gradient checks, and VRAM estimation before a job starts.
OpenAI CEO Sam Altman responded to criticism over AI training energy costs by drawing a parallel to human education: a person also needs about 20 years, and all the food energy consumed in that time, to become intelligent.
A high-engagement Reddit post summarized 2025 ML competition patterns across major platforms. The author reports tracking roughly 400 contests and first-place solution details for 73, highlighting shifts in tooling, model choices, and compute budgets.