Why it matters: AI infrastructure is moving from single-accelerator rentals to managed clusters that resemble supercomputers. Google Cloud said A4X Max bare-metal instances support clusters of up to 50,000 GPUs and twice the network bandwidth of earlier generations.
AWS said on March 16, 2026 that it is expanding its NVIDIA collaboration from chips and networking to software, data movement, and Amazon Bedrock model services. The companies plan to deploy more than 1 million GPUs across AWS regions beginning in 2026 and are adding new Blackwell, Nemotron, and NIXL integrations aimed at production AI workloads.
Meta said on February 24, 2026 that it had signed a long-term AI infrastructure agreement with AMD covering up to 6GW of AMD Instinct GPUs. The deal also aligns product roadmaps across chips, systems, and software, signaling a deeper attempt to diversify Meta’s AI compute stack.
At KubeCon Europe, NVIDIA donated its GPU Dynamic Resource Allocation (DRA) driver to the CNCF and the upstream Kubernetes ecosystem. The company also tied the donation to confidential containers support, KAI Scheduler progress, and new tools for large-scale AI cluster orchestration.
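For context on what DRA changes in practice: instead of requesting GPUs through opaque extended resources (`nvidia.com/gpu: 1`), workloads declare a ResourceClaim that a DRA driver satisfies. A minimal sketch, assuming a cluster with DRA enabled and NVIDIA's driver installed; the exact API group and version (`resource.k8s.io/v1beta1` here) and the `gpu.nvidia.com` device class vary by Kubernetes release and driver install, and the pod/claim names are hypothetical:

```yaml
# A claim for one GPU from the NVIDIA DRA driver's device class.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: single-gpu
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: gpu.nvidia.com
---
# A pod that consumes the claim instead of an extended resource.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
  - name: ctr
    image: nvidia/cuda:12.4.1-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      claims:
      - name: gpu      # references the entry in resourceClaims below
  resourceClaims:
  - name: gpu
    resourceClaimName: single-gpu
```

The structured-parameters model is what makes upstreaming attractive: schedulers can reason about device attributes in the claim rather than treating each accelerator as an unlabeled counter.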
SkyPilot says Claude Code ran about 910 autonomous research experiments in 8 hours, and the Hacker News discussion focused on whether the real breakthrough was agent strategy, infrastructure, or both.
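The infrastructure side of that workflow is a SkyPilot task definition that an agent can launch repeatedly. A minimal sketch, assuming an experiment script and requirements file (`run_experiment.py`, `requirements.txt` are hypothetical names) and whatever accelerator the account can provision:

```yaml
# task.yaml -- one experiment trial as a SkyPilot task
resources:
  accelerators: H100:1   # hypothetical choice; any available GPU works
num_nodes: 1
setup: |
  pip install -r requirements.txt
run: |
  python run_experiment.py --trial "$SKYPILOT_TASK_ID"
```

Launched with `sky launch task.yaml`, each invocation provisions (or reuses) a cloud VM, runs setup once, and executes the trial, which is what lets an agent fan out hundreds of experiments without hand-managing instances.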
Meta said its long-term AMD agreement will provide up to 6GW of AMD Instinct GPU capacity for AI infrastructure. First shipments are planned for the second half of 2026 on Helios rack-scale systems.