HN Spotlight: Sarvam Open-Sources 30B and 105B in a Full-Stack IndiaAI Push

Original: Sarvam 105B, the first competitive Indian open source LLM

LLM · Mar 7, 2026 · By Insights AI (HN) · 2 min read

Hacker News picked up Sarvam AI’s March 6 announcement that it is open-sourcing Sarvam 30B and Sarvam 105B, two reasoning-oriented models the company says were trained from scratch in India on compute provided under the IndiaAI mission. The underlying company post frames the release as more than a model drop: Sarvam is presenting a full stack that spans data curation, training, inference optimization, tokenizer work, and product deployment.

The technical shape is fairly specific. Both models use a sparse MoE Transformer backbone with 128 experts. Sarvam 30B uses Grouped Query Attention to keep KV-cache usage practical, while Sarvam 105B uses Multi-head Latent Attention to push memory efficiency further on longer contexts. Sarvam says the 30B model was trained on 16T tokens and the 105B on 12T tokens, with mixtures covering code, web data, math, multilingual content, and synthetic data. The company also emphasizes a tokenizer optimized for 22 scheduled Indian languages across 12 scripts.
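The KV-cache argument behind Grouped Query Attention is easy to see in code. The sketch below is a minimal NumPy illustration of the GQA idea, not Sarvam's actual implementation: several query heads share each key/value head, so the cached K and V tensors shrink by the ratio of query heads to KV heads. All names, shapes, and head counts here are illustrative assumptions.

```python
import numpy as np

def grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads):
    """Toy GQA over a single sequence: n_q_heads query heads share
    n_kv_heads key/value heads, so the KV cache is smaller than
    standard multi-head attention by a factor of n_q_heads / n_kv_heads."""
    seq, d_model = x.shape
    d_head = d_model // n_q_heads
    group = n_q_heads // n_kv_heads  # query heads per shared KV head

    q = (x @ wq).reshape(seq, n_q_heads, d_head)
    k = (x @ wk).reshape(seq, n_kv_heads, d_head)  # cached at decode time
    v = (x @ wv).reshape(seq, n_kv_heads, d_head)  # cached at decode time

    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group  # map each query head to its shared KV head
        scores = (q[:, h] @ k[:, kv].T) / np.sqrt(d_head)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[:, h] = weights @ v[:, kv]
    return out.reshape(seq, d_model)
```

With, say, 8 query heads and 2 KV heads, the cached K/V per token is a quarter of the multi-head baseline; Multi-head Latent Attention pushes the same memory lever further by caching a compressed latent instead of per-head K/V.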

The benchmark claims are what pushed the HN post into wider circulation. Sarvam 105B is positioned as a competitive open model for reasoning, coding, and agentic work, with posted scores such as 71.7 on LiveCodeBench v6, 90.6 on MMLU, 88.3 Pass@1 on AIME 25, and 68.3 on Tau2 average. Sarvam 30B is presented as the efficiency-first deployment option, with 2.4B active parameters and strong scores on HumanEval, MBPP, BrowseComp, and Tau2. Sarvam says both models are already in production: 30B powers Samvaad and 105B powers Indus.

What makes the release notable is the operational story around it. The post spends significant space on fused kernels, scheduling, disaggregated serving, and throughput claims across H100, L40S, and Apple Silicon. In other words, Sarvam is not only publishing weights; it is arguing that open models become more relevant when the inference stack is tuned for real workloads and regional language coverage.

For builders, the practical takeaway is straightforward. This is a sovereign-model effort that tries to compete on reasoning quality, agentic utility, and serving efficiency at the same time. HN's interest reflects a broader question: whether regional model labs can differentiate by owning the full pipeline rather than just matching headline parameter counts.

Primary source: Sarvam AI’s release post.


© 2026 Insights. All rights reserved.