NVIDIA and Emerald AI said on March 23, 2026 that they are working with AES, Constellation, Invenergy, NextEra Energy, Nscale Energy & Power, and Vistra on power-flexible AI factories. The concept combines Vera Rubin DSX infrastructure with DSX Flex so AI campuses can connect faster and behave more like grid assets than passive loads.
#nvidia
NVIDIA unveiled Vera CPU on March 23, 2026. The company says it is the first CPU purpose-built for the age of agentic AI and reinforcement learning, delivering 50% faster results and twice the efficiency of traditional rack-scale CPUs.
On March 16, 2026, NVIDIA launched the Nemotron Coalition, an open-model collaboration with Black Forest Labs, Cursor, LangChain, Mistral AI, Perplexity, Reflection AI, Sarvam, and Thinking Machines Lab. The first coalition model will be trained on NVIDIA DGX Cloud and serve as the basis for the upcoming Nemotron 4 family.
On March 16, 2026, Microsoft used NVIDIA GTC to expand Foundry Agent Service and observability, add NVIDIA Nemotron models, outline Azure infrastructure built for inference-heavy reasoning workloads, and introduce an Azure Physical AI Toolchain. The announcement is notable because it connects agent operations, hyperscale AI infrastructure, and physical-world systems in one stack.
A new r/LocalLLaMA thread argues that NVIDIA's Nemotron-Cascade-2-30B-A3B deserves more attention after quick local coding evals came in stronger than expected. The post is interesting because it lines up community measurements with NVIDIA's own push for a reasoning-oriented open MoE model that keeps activated parameters low.
NVIDIA and Oracle said on March 16, 2026 that they will build the U.S. Department of Energy's largest AI supercomputer at Argonne National Laboratory. The Solstice and Equinox systems combine 110,000 Blackwell GPUs and a stated 2,200 exaflops of AI performance for scientific discovery.
NVIDIA said on March 12, 2026 that TensorRT Edge-LLM now supports MoE models, Nemotron 2 Nano, Qwen3-TTS/ASR, and Cosmos Reason 2 on Jetson and DRIVE platforms. The company is positioning the runtime as a low-latency edge reasoning layer for robotics and autonomous vehicles.
NVIDIA said on March 20, 2026 that its Cosmos world foundation models have advanced again with Transfer 2.5, Predict 2.5, and Reason 2. The linked NVIDIA Technical Blog frames the update around higher-quality synthetic data, stronger long-tail scenario generation, and richer reasoning for robots and autonomous vehicles.
Ollama said on March 20, 2026 that NVIDIA’s Nemotron-Cascade-2 can now run through its local model stack. The official model page positions it as an open 30B MoE model with 3B activated parameters, thinking and instruct modes, and built-in paths into agent tools such as OpenClaw, Codex, and Claude.
NVIDIA used GTC 2026 to describe how telecom operators are turning distributed network assets into AI grids. The pitch is that inference for low-latency, edge-heavy workloads should move closer to users, devices, and data.
NVIDIA announced SOL-ExecBench on March 20, 2026, a benchmark for real-world GPU kernels that scores optimized CUDA and PyTorch code against Speed-of-Light hardware bounds on NVIDIA B200 systems. The release packages 235 kernel optimization problems drawn from 124 AI models across BF16, FP8, and NVFP4 workloads.
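To make the idea of scoring against a Speed-of-Light bound concrete, here is a minimal sketch of a roofline-style calculation. It is not SOL-ExecBench's actual scoring method, which the summary above does not describe. The function names, the roofline formula as the choice of bound, and every number in the example are illustrative assumptions, and the peak figures are placeholders rather than B200 specifications.

```python
# Hypothetical roofline-style "speed of light" score for a single kernel.
# All names and numbers are illustrative assumptions, not SOL-ExecBench's
# scoring code and not real B200 hardware figures.

def sol_time_s(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s):
    """Lower bound on runtime: the slower of the compute and memory limits."""
    compute_limit = flops / peak_flops_per_s
    memory_limit = bytes_moved / peak_bytes_per_s
    return max(compute_limit, memory_limit)


def sol_fraction(measured_s, flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s):
    """Fraction of the bound achieved: 1.0 means the kernel runs at the bound."""
    bound = sol_time_s(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s)
    return bound / measured_s


if __name__ == "__main__":
    # Placeholder kernel: 2e12 FLOPs and 4e9 bytes of memory traffic,
    # on a placeholder device with 1e15 FLOP/s and 4e12 bytes/s peaks.
    score = sol_fraction(
        measured_s=4e-3,
        flops=2e12,
        bytes_moved=4e9,
        peak_flops_per_s=1e15,
        peak_bytes_per_s=4e12,
    )
    print(f"fraction of speed-of-light bound: {score:.2f}")
```

With these placeholder inputs the compute limit is the binding one, so a kernel that takes twice as long as that limit scores about half of the bound. A real benchmark would also need to decide how to count FLOPs and memory traffic for each precision.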
NVIDIAAIDev said on X that Andrej Karpathy’s lab has received the first DGX Station GB300 system. NVIDIA’s GTC coverage says the deskside machine pairs the GB300 architecture with 748GB of coherent memory, up to 20 petaflops of FP4 performance, and support for models up to 1 trillion parameters.
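A quick back-of-the-envelope check shows how a 1 trillion parameter model could relate to 748GB of memory. The sketch below counts weight storage only and assumes 4-bit weights, which the summary does not state (it mentions FP4 only for compute performance). The helper name `weights_gb` is hypothetical, and KV cache, activations, and runtime overhead are ignored.

```python
# Weights-only memory estimate. Assumption: weights are stored at a uniform
# bit width; KV cache, activations, and runtime overhead are not counted.

COHERENT_MEMORY_GB = 748  # figure quoted in the summary above


def weights_gb(num_params, bits_per_param):
    """Decimal gigabytes needed to store the weights alone."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    params = 1e12  # 1 trillion parameters
    for bits in (4, 8, 16):
        size = weights_gb(params, bits)
        verdict = "fits" if size <= COHERENT_MEMORY_GB else "does not fit"
        print(f"{bits}-bit weights: {size:.0f} GB, {verdict} in {COHERENT_MEMORY_GB} GB")
```

Under these assumptions only the 4-bit case stays below 748GB, which is consistent with the 1 trillion parameter figure being tied to low-precision formats.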