
NVIDIA Launches Rubin Platform: 10x Lower Inference Cost, 4x Fewer Training GPUs


AI Feb 13, 2026 By Insights AI


Rubin Platform Arrives in the Second Half of 2026

NVIDIA has announced Rubin, its next-generation AI platform. Rubin-based products are scheduled to ship through partners starting in the second half of 2026, and the platform is currently in full production.

Dramatic Performance Gains over Blackwell

Through extreme codesign of hardware and software, NVIDIA says the Rubin platform achieves the following:

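10x lower inference token cost: up to a 10x reduction in cost per inference token compared with Blackwell
4x fewer GPUs for MoE training: cuts the number of GPUs needed to train Mixture-of-Experts models to one quarter
Six new chips: including the Rubin GPU, the Vera CPU, and networking chips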
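
The "Nx" figures above can be restated as percentage savings, which is how some coverage reports them (a 4x reduction is the same as 75% fewer GPUs). The following is a minimal sketch of that arithmetic only. The function name `fraction_saved` and the baseline of 1,000 GPUs are hypothetical and chosen for illustration; they are not figures from NVIDIA.

```python
def fraction_saved(reduction_factor: float) -> float:
    """Fraction saved when a quantity shrinks by a factor of `reduction_factor`."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return 1.0 - 1.0 / reduction_factor


# Factors quoted in the article: 4x fewer GPUs, 10x lower token cost.
gpu_saving = fraction_saved(4)
cost_saving = fraction_saved(10)

# Hypothetical baseline cluster size, for illustration only.
baseline_gpus = 1000
reduced_gpus = baseline_gpus / 4

print(f"GPU count reduced by {gpu_saving:.0%}")
print(f"Token cost reduced by {cost_saving:.0%}")
print(f"{baseline_gpus} GPUs would become {reduced_gpus:.0f}")
```

Note that a 10x reduction corresponds to 90% savings, not 1000%: the saving approaches 100% as the factor grows but never reaches it.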

Key Cloud Partners

Cloud providers set to be first to deploy Vera Rubin-based instances in 2026, along with server makers building Rubin-based systems:

Hyperscalers: AWS, Google Cloud, Microsoft, OCI
NVIDIA Cloud Partners: CoreWeave, Lambda, Nebius, Nscale
Server makers: Cisco, Dell, HPE, Lenovo, Supermicro

No New Consumer GPUs in 2026

Meanwhile, NVIDIA is reportedly skipping new gaming GPU launches in 2026. The RTX 50 Super and RTX 60 series are said to be delayed because of memory shortages and the gap in profitability.

AI chips reportedly carry a margin of 65%, compared with only 40% for graphics cards, so NVIDIA has made a strategic shift to concentrate on AI production.

Strengthening Leadership in the AI Infrastructure Market

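The Rubin launch signals that NVIDIA intends to hold its dominant position in the AI infrastructure market beyond 2026. The lower inference cost in particular is expected to be a game changer for LLM service providers.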

Sources: NVIDIA Newsroom, TrendForce

#nvidia #rubin #gpu #ai-hardware #inference
