NVIDIA Unveils Next-Gen AI Platform Rubin — Six Chips and AI Supercomputer
Rubin Platform Launch
NVIDIA unveiled its next-generation AI platform, Rubin, at CES 2026 in January. The platform comprises six new chips designed to work together as an AI supercomputer.
The core of the Rubin platform is the Vera Rubin superchip, which combines one Vera CPU and two Rubin GPUs in a single package and delivers 5x the inference performance of NVIDIA's GB200 NVL72 rack systems.
Performance Metrics
Specific performance figures include:
- Inference performance per chip: 50 petaflops (NVFP4 precision)
- Inference performance per rack: 3.6 exaflops (NVFP4 precision)
- Performance improvement over GB200 NVL72: 5x
These performance improvements enable real-time inference for large language models and generative AI applications, significantly improving the cost-efficiency of AI services.
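The per-chip and per-rack figures above are mutually consistent if a rack holds 72 superchips. A minimal sanity-check sketch, assuming the chip count implied by the "NVL72" rack designation (an illustration, not an official NVIDIA breakdown):

```python
# Sanity check: relate the per-chip and per-rack NVFP4 inference figures.
# CHIPS_PER_RACK = 72 is an assumption taken from the "NVL72" product name.

PF_PER_CHIP = 50          # petaflops (NVFP4) per Vera Rubin superchip
CHIPS_PER_RACK = 72       # assumed from the NVL72 rack designation

rack_pf = PF_PER_CHIP * CHIPS_PER_RACK   # 3600 petaflops
rack_ef = rack_pf / 1000                 # convert petaflops to exaflops

print(f"Rack throughput: {rack_ef} EF NVFP4")  # → Rack throughput: 3.6 EF NVFP4
```

The arithmetic (72 × 50 PF = 3,600 PF = 3.6 EF) matches the quoted per-rack figure exactly, which supports reading "per chip" as "per superchip."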
Major Adopters
The Rubin platform has already secured adoption from major AI industry players:
Cloud providers: AWS, Microsoft Azure, Google Cloud, CoreWeave, Lambda
AI labs and startups: OpenAI, Anthropic, Mistral AI, Cohere, Black Forest Labs, Harvey, Cursor
Enterprise companies: Meta, Cisco
Hardware manufacturers: Dell Technologies, HPE, Lenovo
Notably, Microsoft Azure will deploy NVIDIA Vera Rubin NVL72 rack-scale systems scaling to hundreds of thousands of NVIDIA Vera Rubin Superchips in its next-generation Fairwater AI superfactories.
Microsoft's Strategic Integration
Microsoft plans to deeply integrate the Rubin platform into its AI infrastructure. Azure will offer a tightly optimized platform enabling customers to accelerate innovation.
Interestingly, even as Microsoft develops its own AI chip (Maia 200), it continues its partnership with NVIDIA. This is interpreted as a diversification strategy in the AI infrastructure market.
Intensifying AI Infrastructure Competition
The Rubin platform announcement demonstrates increasingly fierce competition in the AI infrastructure market. While NVIDIA remains the dominant player, Microsoft (Maia), Google (TPU), and Amazon (Trainium/Inferentia) are accelerating their own chip development.
Nevertheless, NVIDIA's CUDA ecosystem and software stack continue to provide a strong competitive advantage, which the Rubin platform is expected to further solidify.
Market Outlook
AI infrastructure investment is expected to peak in 2026. Alphabet recently issued $20 billion in bonds to fund data centers, custom silicon, and AI service scaling, a sign of how much capital Big Tech is committing to the AI infrastructure buildout.
The Rubin platform is projected to be a key beneficiary of this massive investment.