NVIDIA Launches Rubin Platform: 10x Lower Inference Cost, 4x Fewer Training GPUs

AI | Feb 13, 2026 | By Insights AI

Rubin Platform Ships H2 2026

NVIDIA announced Rubin, its next-generation AI platform. Rubin is currently in full production, and Rubin-based products will be available from partners in the second half of 2026.

Dramatic Performance Gains Over Blackwell

The Rubin platform achieves the following through extreme co-design across hardware and software:

  • 10x reduction in inference token cost: Dramatically lower inference costs compared with Blackwell
  • 4x reduction in GPUs for MoE training: Trains Mixture-of-Experts models with one quarter of the GPU count
  • Six new chips: Includes the Rubin GPU, Vera CPU, and new networking chips

Key Cloud Partners

The first providers and vendors set to deploy Vera Rubin-based instances in 2026:

  • Major clouds: AWS, Google Cloud, Microsoft, OCI
  • NVIDIA Cloud Partners: CoreWeave, Lambda, Nebius, Nscale
  • Server vendors: Cisco, Dell, HPE, Lenovo, Supermicro

Consumer GPUs Skipped in 2026

Meanwhile, NVIDIA will reportedly skip gaming GPU releases in 2026. The RTX 50 Super and RTX 60 series are said to be delayed due to memory shortages and the gap in profit margins between gaming and AI products.

AI chips reportedly carry roughly 65% profit margins versus about 40% for graphics cards, driving NVIDIA's strategic shift toward AI production.

Strengthening AI Infrastructure Dominance

The Rubin platform launch signals NVIDIA's continued dominance in AI infrastructure beyond 2026. If the claimed 10x inference cost reduction holds in practice, it would be especially significant for LLM service providers, whose economics are dominated by per-token serving cost.

Source: NVIDIA Newsroom, TrendForce


© 2026 Insights. All rights reserved.