OpenAI said on March 19, 2026 that it will acquire Astral, the company behind uv, Ruff, and ty. The move is meant to push Codex from code generation toward the broader Python development workflow.
A llama.cpp comparison on r/LocalLLaMA reached 55 upvotes and 81 comments. By testing RTX 5090, DGX Spark, AMD AI395, and single or dual R9700 setups under the same parameters, the post offers a practical view of local inference trade-offs that vendor slides usually hide.
A LocalLLaMA thread about Intel’s Arc Pro B70 and B65 reached 213 upvotes and 133 comments. Intel says the B70 is available from March 25, 2026 with a suggested starting price of $949, while the B65 follows in mid-April.
Google Research introduced TurboQuant on March 24, 2026 as a compression approach for KV cache and vector search bottlenecks. Hacker News pushed the post to 491 points and 129 comments, reflecting how central memory efficiency has become for long-context inference.
AWS and Cerebras said on March 13, 2026 that they are building a high-speed inference offering for Amazon Bedrock. The design routes prefill work to AWS Trainium and decode work to Cerebras CS-3 systems.
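Splitting prefill and decode across different hardware works because the two phases stress different resources: prefill is a compute-bound pass over the whole prompt that builds the KV cache, while decode is a memory-bandwidth-bound loop emitting one token at a time. A toy sketch of that handoff, with all function names and the list-based "KV cache" purely illustrative (this is not AWS or Cerebras code, just the general disaggregated-inference pattern):

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: list[int]

def prefill(tokens: list[int]) -> list[tuple[str, int]]:
    # Compute-bound phase: process the entire prompt once and
    # build the KV cache (a list of tuples stands in for real attention state).
    return [("kv", t) for t in tokens]

def decode(kv_cache: list[tuple[str, int]], max_new: int = 4) -> list[int]:
    # Memory-bandwidth-bound phase: generate one token per step,
    # appending to the cache each time. len(kv_cache) stands in for sampling.
    out = []
    for _ in range(max_new):
        nxt = len(kv_cache)
        out.append(nxt)
        kv_cache.append(("kv", nxt))
    return out

def disaggregated_generate(req: Request) -> list[int]:
    cache = prefill(req.prompt_tokens)  # would run on the prefill tier (e.g. Trainium)
    return decode(cache)                # cache is handed off to the decode tier (e.g. CS-3)
```

The interesting engineering cost in a real system is the arrow between the two calls: the KV cache must be shipped from the prefill tier to the decode tier fast enough that the split still wins.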
NVIDIA said on March 25, 2026 that Nemotron Nano 12B v2 VL delivers on-prem video understanding and, in NVIDIA's telling, performs near 30B-class alternatives on the MediaPerf benchmark at less than half the footprint. NVIDIA's model card describes it as a commercially usable multimodal model for multi-image reasoning, video understanding, visual Q&A, and summarization.
r/LocalLLaMA responded strongly to GigaChat 3.1 because the release spans a local-friendly 10B A1.8B MoE and a 702B frontier-scale MoE, both under MIT terms and both presented as trained from scratch.
Ente's Ensu announcement gained traction on Hacker News because it treats local LLM software as a privacy and ownership product: offline chat across major platforms, an open-source core, and planned encrypted sync.
Microsoft Research has open-sourced AgentRx, a framework for pinpointing the first critical failure in long AI-agent trajectories. It ships with a 115-trajectory benchmark and reports gains in both failure localization and root-cause attribution.
Cohere said on March 25, 2026 that it is partnering with RWS to bring its frontier AI models to Language Weaver Pro. RWS describes Language Weaver Pro as a 100B+ parameter translation system built in collaboration with Cohere and designed for secure, sensitive enterprise environments.
Anthropic published an Engineering Blog post on March 24, 2026 explaining how it used a multi-agent harness to improve Claude on frontend design and long-running autonomous software engineering. The write-up separates planning, generation, and evaluation roles and reports clear gains over simpler solo-agent runs.
r/artificial focused on ATLAS because it shows how planning, verification, and repair infrastructure can push a frozen 14B local model far closer to frontier coding performance.