NVIDIA Vera Rubin Platform Launches with 75% GPU Reduction for MoE, 10x Inference Cost Cut

Vera Rubin Unveiled at CES 2026

NVIDIA announced its next-generation AI platform, Vera Rubin, at CES 2026. The Vera Rubin superchip combines one Vera CPU and two Rubin GPUs in a single processor and serves as the core of the six-chip Rubin platform.

Revolutionary Performance Gains

NVIDIA reports that the Rubin platform delivers the following improvements over Blackwell systems:

MoE Model Training: one quarter as many GPUs needed to train the same model (a 4x reduction, or 75% fewer GPUs)
Inference Token Costs: 10x lower cost per token

The platform is particularly optimized for large-scale Mixture-of-Experts (MoE) models like GPT-4, Llama 4 Maverick, and DeepSeek V4.
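
The two headline figures above are "Nx reduction" factors, and it can help to see how they map to percentages and absolute numbers. The following is a minimal sketch of that arithmetic only; the baseline GPU count and per-token cost are hypothetical values chosen for illustration, not figures from NVIDIA.

```python
def percent_reduction(factor: float) -> float:
    """Convert an 'Nx reduction' into the equivalent percentage decrease."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (1 - 1 / factor) * 100


# Hypothetical Blackwell baseline (illustrative assumptions, not reported data).
baseline_gpus = 1000
baseline_cost_per_million_tokens = 1.00  # arbitrary currency units

# Apply the reduction factors NVIDIA reports for Rubin.
rubin_gpus = baseline_gpus / 4
rubin_cost_per_million_tokens = baseline_cost_per_million_tokens / 10

print(f"GPU reduction: {percent_reduction(4):.0f}%")
print(f"Token cost reduction: {percent_reduction(10):.0f}%")
print(f"GPUs needed: {baseline_gpus} -> {rubin_gpus:.0f}")
```

Note that a 10x cost cut corresponds to a 90% decrease, not a 100% one: each further multiple of the factor removes a smaller slice of the original cost.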

Targeting Agentic AI and Reasoning Models

NVIDIA framed the Rubin platform as ideal for agentic AI, advanced reasoning models, and MoE models, reflecting the core trends in the AI industry for 2026.

Release Timeline and Partners

The Rubin platform is in full production, and Rubin-based products will be available from partners in the second half of 2026. Major cloud providers (AWS, Google Cloud, Microsoft Azure) and server manufacturers are preparing Rubin-based offerings.

Gaming GPU Hiatus in 2026

Meanwhile, NVIDIA reportedly does not plan to release a new gaming graphics chip in 2026, which would mark the first time in 30 years that the company goes a full calendar year without a significant GeForce refresh. The reported reason is a deepening global memory shortage, which is pushing NVIDIA to prioritize its limited memory supply for AI accelerators.

VibeTensor Open Source Release

NVIDIA also released VibeTensor, a PyTorch-style deep learning runtime whose implementation was generated by LLM coding agents. It is open-sourced under the Apache 2.0 license and targets Linux x86_64, with an NVIDIA GPU and CUDA as hard requirements.
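
For readers who want to know whether a machine matches the stated requirements, the check below is a rough sketch using only the Python standard library. It does not use any VibeTensor API; the function name is hypothetical, and looking for `nvidia-smi` and `nvcc` on the PATH is an assumed heuristic for "NVIDIA GPU driver present" and "CUDA toolkit present", not the project's own detection logic.

```python
import platform
import shutil


def check_stated_requirements() -> dict:
    """Heuristically check the OS, architecture, and NVIDIA tooling."""
    return {
        # The article lists Linux x86_64 as the target platform.
        "linux": platform.system() == "Linux",
        "x86_64": platform.machine() == "x86_64",
        # Assumption: nvidia-smi on PATH implies an installed NVIDIA driver.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        # Assumption: nvcc on PATH implies an installed CUDA toolkit.
        "nvcc_found": shutil.which("nvcc") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_stated_requirements().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

A machine that reports "no" for any entry is unlikely to satisfy the hard requirements, though a "yes" on every line still does not guarantee a compatible driver or CUDA version.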
