NVIDIA Releases Star Elastic: One Checkpoint, Three Model Sizes With Zero-Shot Slicing

AI | May 10, 2026 | By Insights AI (Reddit) | 1 min read

What Is Star Elastic

NVIDIA AI's Star Elastic is a novel model architecture that contains 30B, 23B, and 12B reasoning models within a single checkpoint file. Think of it as nested models - a larger model with smaller models embedded inside, like Russian dolls. Users download one file and gain access to all three scales.

Zero-Shot Slicing

The defining capability is zero-shot slicing: the ability to switch from the full 30B model to the 12B model without any additional fine-tuning or downloading. Since the models share their KV cache, this opens up novel hybrid inference workflows - using the 30B model to explore a reasoning path, dropping to 12B to expand on it at higher speed, then scaling back to 30B to evaluate the output.
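The workflow above can be sketched as a toy simulation. All class, method, and tier-depth values here are invented for illustration - the article does not describe Star Elastic's actual API or layer counts - but the sketch shows why a shared KV cache makes tier switching cheap: slicing only changes how many layers run, so cache entries built at one tier stay valid at another.

```python
# Hypothetical sketch: explore at 30B, expand fast at 12B, verify at 30B,
# all against one shared KV cache. Names and layer depths are assumptions.

class ElasticModel:
    DEPTHS = {"30B": 48, "23B": 36, "12B": 20}  # assumed layer counts per tier

    def __init__(self):
        self.kv_cache = {}                 # layer index -> cached entries
        self.active = self.DEPTHS["30B"]   # start at the full model

    def slice(self, tier):
        # Zero-shot slice: change the number of active layers in place.
        # No reload, no fine-tuning; the cache built so far is untouched.
        self.active = self.DEPTHS[tier]

    def step(self, token):
        # Each active layer appends to its own slot in the shared cache.
        for i in range(self.active):
            self.kv_cache.setdefault(i, []).append(token)
        return token

model = ElasticModel()
model.step("plan")     # explore a reasoning path at 30B
model.slice("12B")
model.step("draft")    # expand it quickly at 12B
model.slice("30B")
model.step("check")    # score the result back at 30B
```

Note that layer 0 sees all three tokens, while a deep layer (active only at 30B) sees just the exploration and verification steps - the cache degrades gracefully rather than being invalidated on each switch.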

A Middle Ground Between Dense and MoE

The r/LocalLLaMA community has likened Star Elastic to a hybrid between dense models and Mixture-of-Experts (MoE). Rather than routing tokens to expert sub-networks, the architecture dynamically strips layers to reduce scale - similar to how scalable video coding can produce UHD, HD, or SD streams from a single encoded bitstream.
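The nesting property can be stated concretely: each smaller tier uses a strict subset of the full checkpoint's weights, much as SVC's SD and HD streams are subsets of one UHD bitstream. A minimal sketch, assuming (purely for illustration) that slicing keeps a prefix of the layer stack:

```python
# Hypothetical sketch of nested tiers as prefix slices of one layer stack.
# The 48/36/20 depths are assumptions, not published Star Elastic figures.
FULL_LAYERS = [f"layer_{i}" for i in range(48)]

def slice_layers(tier):
    """Return the layers a given tier would run (assumed prefix slicing)."""
    depth = {"30B": 48, "23B": 36, "12B": 20}[tier]
    return FULL_LAYERS[:depth]

# Russian-doll property: every smaller tier is contained in the larger ones.
nested = (set(slice_layers("12B"))
          <= set(slice_layers("23B"))
          <= set(slice_layers("30B")))
```

This is what distinguishes the design from MoE: there is no router choosing among parallel experts, only a decision about how much of one shared stack to execute.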

Local Deployment

NVIDIA designed Star Elastic with local deployment in mind. The 12B mode is accessible on consumer-grade GPUs, while higher-VRAM setups can take advantage of the full 30B capacity. The shared checkpoint design also simplifies storage - one download covers all three tiers.
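For local deployment, the practical question is which tier fits a given GPU. A rough sketch of that decision, using the common rule of thumb of about 2 GB per billion parameters in bf16 plus headroom for KV cache and activations - the footprint numbers are assumptions, not NVIDIA's published requirements:

```python
def pick_tier(vram_gb):
    """Pick the largest tier whose assumed bf16 footprint fits in VRAM."""
    # ~2 GB per billion params in bf16 (assumed), largest tier first.
    footprints_gb = {"30B": 60, "23B": 46, "12B": 24}
    for tier, need in sorted(footprints_gb.items(), key=lambda kv: -kv[1]):
        if vram_gb >= need * 1.2:   # 20% headroom for cache/activations
            return tier
    return None  # below 12B requirements: quantize or offload instead
```

In practice a quantized 12B slice would fit well under these numbers, which is what makes the smallest tier reachable on consumer cards.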
