NVIDIA Releases Star Elastic: One Checkpoint, Three Model Sizes With Zero-Shot Slicing
Original: NVIDIA AI Releases Star Elastic: One Checkpoint that Contains 30B, 23B, and 12B Reasoning Models with Zero-Shot Slicing
What Is Star Elastic
NVIDIA AI's Star Elastic is a novel model architecture that contains 30B, 23B, and 12B reasoning models within a single checkpoint file. Think of it as nested models - a larger model with smaller models embedded inside, like Russian dolls. Users download one file and gain access to all three scales.
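The nested "Russian doll" idea can be sketched as a checkpoint whose smaller variants are subsets of the full model's weights. Everything below is illustrative: the layer counts, the `SLICE_CONFIGS` table, and the `slice_checkpoint` helper are made-up names, not NVIDIA's actual format or API.

```python
# Hypothetical sketch of a nested checkpoint: the smaller models' weights are
# subsets of the full model's weights, so one file serves all three scales.
# Layer counts below are invented for illustration.

SLICE_CONFIGS = {
    "30b": list(range(48)),  # full model: all 48 (hypothetical) layers
    "23b": list(range(36)),  # hypothetical 36-layer subset
    "12b": list(range(20)),  # hypothetical 20-layer subset
}

def slice_checkpoint(checkpoint: dict, size: str) -> dict:
    """Return a sub-model's weights from the single shared checkpoint."""
    keep = set(SLICE_CONFIGS[size])
    return {
        name: weights
        for name, weights in checkpoint.items()
        if not name.startswith("layers.") or int(name.split(".")[1]) in keep
    }

# Toy checkpoint: embedding + 48 layers + output head (weights stubbed).
full = {"embed": "W_embed", "head": "W_head"}
full.update({f"layers.{i}.w": f"W_{i}" for i in range(48)})

small = slice_checkpoint(full, "12b")  # 20 layers + embed + head
```

The point of the sketch is that no separate download or conversion step is needed: each tier is a view over the same stored weights.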
Zero-Shot Slicing
The defining capability is zero-shot slicing: the ability to switch from the full 30B model down to the 23B or 12B model without any additional fine-tuning or downloading. Because the models share a KV cache, this opens up novel hybrid inference workflows - using the 30B model to explore a reasoning path, dropping to 12B to expand it quickly, then scaling back to 30B to evaluate the output.
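The explore-expand-evaluate loop above can be mocked up in a few lines. This is a conceptual sketch only: the `ElasticModel` class, its `slice_to` method, and the cache handling are invented stand-ins, not NVIDIA's inference API.

```python
# Mock of the hybrid workflow: 30B explores, 12B expands fast, 30B verifies.
# All names and mechanics here are hypothetical illustrations.

class ElasticModel:
    def __init__(self):
        self.kv_cache = []  # shared across slices, so no re-prefill on switch
        self.size = "30b"

    def slice_to(self, size: str) -> None:
        # Zero-shot slice: no reload, no fine-tune; the KV cache is reused.
        self.size = size

    def generate(self, prompt: str) -> str:
        self.kv_cache.append((self.size, prompt))  # stand-in for attention state
        return f"[{self.size}] continuation of: {prompt}"

model = ElasticModel()
plan = model.generate("Outline a proof strategy")       # 30B: careful exploration
model.slice_to("12b")
draft = model.generate(plan)                            # 12B: fast expansion
model.slice_to("30b")
verdict = model.generate(f"Check this draft: {draft}")  # 30B: final evaluation
```

The shared KV cache is what makes the switch cheap: the smaller slice can continue from the larger slice's context instead of re-encoding the prompt from scratch.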
A Middle Ground Between Dense and MoE
The r/LocalLLaMA community has likened Star Elastic to a hybrid between dense models and Mixture-of-Experts (MoE). Rather than routing tokens to expert sub-networks, the architecture dynamically strips layers to reduce scale - similar to how scalable video coding can produce UHD, HD, or SD streams from a single encoded bitstream.
Local Deployment
NVIDIA designed Star Elastic with local deployment in mind. The 12B mode is accessible on consumer-grade GPUs, while higher-VRAM setups can take advantage of the full 30B capacity. The shared checkpoint design also simplifies storage - one download covers all three tiers.