NVIDIA Releases Nemotron 3 Nano Omni: Open 30B Multimodal Model With 9x Higher Throughput
Open Multimodal AI for Agents
NVIDIA launched Nemotron 3 Nano Omni on April 28, 2026, available immediately via Hugging Face, OpenRouter, build.nvidia.com, and over 25 partner platforms.
Technical Specifications
- Architecture: 30B-A3B hybrid MoE with Conv3D and EVS
- Context: 256K tokens
- Modalities: Video, audio, image, and text in a single model
- Throughput: 9x higher than comparable open omni models
Designed for Multimodal Agents
Traditional multimodal pipelines require separate systems for vision, speech, and language — introducing latency and complexity. Nemotron 3 Nano Omni integrates these into one model, suited for agents that need to process multiple input types simultaneously without switching between systems.
Early Adoption
Early adopters include Aible, Applied Scientific Intelligence, Eka Care, Foxconn, H Company, Palantir, and Pyler. Dell Technologies, Docusign, Infosys, Oracle, and Zefr are evaluating the model.
Source: NVIDIA Blog
Related Articles
HN treated Ghostty’s GitHub exit as more than a forge move. What hit people was the subtext: when even a maintainer with deep GitHub history decides the relationship is no longer worth it, reliability and focus stop sounding like background complaints.
The HN reaction was closer to “what exactly got opened?” than “nice, another voice model.” VibeVoice brings long-form ASR and realtime TTS back into view, but the thread focused first on prior code removals and the difference between a repo, a paper, and something you can actually run.
HN read Zig's anti-AI contribution rule as a maintainer-time policy: review is for growing trusted humans, and LLM-shaped PRs break that loop.
Comments (0)
No comments yet. Be the first to comment!