A FutureSearch incident transcript moved quickly through Hacker News because it showed, minute by minute, how a poisoned LiteLLM package reached a workstation and was isolated within 72 minutes.
LLM
GitHub now lets users mention <code>@copilot</code> in a pull request to request changes on that same PR. The company says Copilot coding agent handles the work in a cloud development environment, runs tests and linting, then pushes updates; pull requests from forks are not yet supported.
Vercel introduced a rebuilt v0 positioned for production apps and agents rather than demo-only prototyping. The release adds repo import into a sandbox runtime, git-native branch and pull-request workflows, secure Snowflake and AWS database integrations, and enterprise-grade security controls.
Cloudflare said on March 24, 2026 that Dynamic Workers let developers execute AI-generated code inside secure, lightweight isolates and that the approach is 100 times faster than traditional containers. Cloudflare’s blog says the feature is now in open beta for paid Workers users and can block direct outbound internet access with <code>globalOutbound: null</code>.
Google DeepMind said on March 26, 2026 that Gemini 3.1 Flash Live is rolling out in preview via the Live API in Google AI Studio. Google’s blog says the model is designed for real-time voice and vision agents, improves tool triggering in noisy environments, and supports more than 90 languages for multimodal conversations.
An r/LocalLLaMA thread spread reports that NVIDIA could spend $26 billion over five years on open-weight AI models, but the real discussion centered on strategy rather than the headline alone. NVIDIA’s March 2026 Nemotron 3 Super release gives the clearest evidence that the company wants open models, tooling, and Blackwell-optimized deployment to move together.
Vercel said on March 25, 2026 that its Custom Reporting API for AI Gateway is now in beta for Pro and Enterprise plans. Vercel's blog says teams can query cost, token usage, and request volume across AI Gateway traffic, including BYOK requests, and break results down by model, provider, user ID, tags, and credential type.
Anthropic said on March 25, 2026 that Claude Code auto mode uses classifiers to replace many permission prompts while remaining safer than fully skipping approvals. Anthropic's engineering post says the system combines a prompt-injection probe with a two-stage transcript classifier and reports a 0.4% false-positive rate on real traffic in its end-to-end pipeline.
A LocalLLaMA post claiming that Liquid AI’s LFM2-24B-A2B can run at roughly 50 tokens per second in a browser on an M4 Max reached 79 points and 11 comments. Community interest centered on sparse MoE architecture, ONNX packaging, and whether WebGPU can make the browser a credible local AI deployment target.
An independent Claude Code dashboard says its since-launch view now covers more than 20.8 million observed commits, over 1.08 million active repositories, and 114,785 new original repositories in the last seven days. Hacker News drove the link to 274 points and 164 comments as users debated what metrics can actually capture AI coding adoption.
ngrok’s March 25, 2026 explainer lays out how quantization can make LLMs roughly 4x smaller and 2x faster, and what the real 4-bit versus 8-bit tradeoff looks like. Hacker News drove the post to 247 points and 46 comments, reopening the discussion around memory bottlenecks and the economics of local inference.
GitHub said on March 25, 2026 that Copilot Free, Pro, and Pro+ interaction data will be used for model training from April 24 unless users opt out. Hacker News pushed the post to 303 points and 143 comments, focusing attention on privacy, defaults, and the split between individual and business plans.