A LocalLLaMA thread spotlights FlashAttention-4, which reports up to 1,605 TFLOPS in BF16 on the B200 and introduces pipeline and memory-layout changes tuned to Blackwell's constraints.
#blackwell
In a February 12, 2026 post, NVIDIA said major inference providers are cutting token costs by serving open-source frontier models on Blackwell. The article includes partner-reported gains across healthcare, gaming, and enterprise-support workloads.
NVIDIA’s February 18, 2026 update outlines how it is supporting IndiaAI Mission priorities through GPU infrastructure expansion, sovereign model development, and research and startup programs. The post ties government policy goals to specific cloud, model, and financing collaborations.
NVIDIA announced on February 17, 2026 that Meta is scaling AI infrastructure using GB300 NVL72 systems, RTX PRO servers, Spectrum-X Ethernet, and Mission Control software. The move extends Meta’s large Hopper footprint into a broader Blackwell-era operating model.
NVIDIA’s February 16, 2026 update cites SemiAnalysis InferenceX data indicating major efficiency gains for GB300 NVL72 versus Hopper in agentic AI inference. The company also said Microsoft, CoreWeave, and OCI are deploying GB300 NVL72 for low-latency and long-context workloads.