HN Spotlight: Sarvam Open-Sources 30B and 105B in a Full-Stack IndiaAI Push

Original: Sarvam 105B, the first competitive Indian open source LLM

LLM · Mar 7, 2026 · By Insights AI (HN) · 2 min read

Hacker News picked up Sarvam AI’s March 6 announcement that it is open-sourcing Sarvam 30B and Sarvam 105B, two reasoning-oriented models the company says were trained from scratch in India on compute provided under the IndiaAI mission. The underlying company post frames the release as more than a model drop: Sarvam is presenting a full stack that spans data curation, training, inference optimization, tokenizer work, and product deployment.

The technical shape is fairly specific. Both models use a sparse MoE Transformer backbone with 128 experts. Sarvam 30B uses Grouped Query Attention to keep KV-cache usage practical, while Sarvam 105B uses Multi-head Latent Attention to push memory efficiency further on longer contexts. Sarvam says the 30B model was trained on 16T tokens and the 105B on 12T tokens, with mixtures covering code, web data, math, multilingual content, and synthetic data. The company also emphasizes a tokenizer optimized for 22 scheduled Indian languages across 12 scripts.
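The KV-cache argument behind Grouped Query Attention is easy to see in code. The sketch below is a minimal NumPy illustration of the GQA idea, not Sarvam's actual implementation: several query heads share each key/value head, so the cached K and V tensors shrink by the ratio of query heads to KV heads. All names, shapes, and head counts here are illustrative assumptions.

```python
import numpy as np

def grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads):
    """Toy GQA over a single sequence: n_q_heads query heads share
    n_kv_heads key/value heads, so the KV cache is smaller than
    standard multi-head attention by a factor of n_q_heads / n_kv_heads."""
    seq, d_model = x.shape
    d_head = d_model // n_q_heads
    group = n_q_heads // n_kv_heads  # query heads per shared KV head

    q = (x @ wq).reshape(seq, n_q_heads, d_head)
    k = (x @ wk).reshape(seq, n_kv_heads, d_head)  # cached at decode time
    v = (x @ wv).reshape(seq, n_kv_heads, d_head)  # cached at decode time

    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group  # map each query head to its shared KV head
        scores = (q[:, h] @ k[:, kv].T) / np.sqrt(d_head)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[:, h] = weights @ v[:, kv]
    return out.reshape(seq, d_model)
```

With, say, 8 query heads and 2 KV heads, the cached K/V per token is a quarter of the multi-head baseline; Multi-head Latent Attention pushes the same memory lever further by caching a compressed latent instead of per-head K/V.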

The benchmark claims are what pushed the HN post into wider circulation. Sarvam 105B is positioned as a competitive open model for reasoning, coding, and agentic work, with posted scores such as 71.7 on LiveCodeBench v6, 90.6 on MMLU, 88.3 Pass@1 on AIME 25, and 68.3 on Tau2 average. Sarvam 30B is presented as the efficiency-first deployment option, with 2.4B active parameters and strong scores on HumanEval, MBPP, BrowseComp, and Tau2. Sarvam says both models are already in production: 30B powers Samvaad and 105B powers Indus.

What makes the release notable is the operational story around it. The post spends significant space on fused kernels, scheduling, disaggregated serving, and throughput claims across H100, L40S, and Apple Silicon. In other words, Sarvam is not only publishing weights; it is arguing that open models become more relevant when the inference stack is tuned for real workloads and regional language coverage.

For builders, the practical takeaway is straightforward. This is a sovereign-model effort that tries to compete on reasoning quality, agentic utility, and serving efficiency at the same time. HN's interest reflects a broader question: whether regional model labs can differentiate by owning the full pipeline rather than just matching headline parameter counts.

Primary source: Sarvam AI’s release post.


© 2026 Insights. All rights reserved.