MiniMax M3 weights hit Hugging Face with 428B total parameters

MiniMax M3 is now a concrete open-weight release rather than only a benchmark post. The MiniMax official account posted on June 12, 2026 at 14:11 UTC that the weights were live on Hugging Face and linked the MiniMax Sparse Attention paper.

The tweet’s key figures were roughly 428B total parameters and roughly 23B activated parameters. FxTwitter showed more than 528,000 views, 2,485 likes, and 301 reposts. An earlier post, quoted in the tweet, supplies the benchmark context: 59.0% on SWE-Bench Pro, 66.0% on Terminal Bench 2.1, 34.8% on SWE-fficiency, 28.8% on KernelBench Hard, and 74.2% on MCP Atlas.
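To put the two parameter counts in perspective, the short Python sketch below works out the fraction of weights active per token and a rough weight-only memory footprint. The 428B and 23B figures come from the tweet. The storage precisions and the helper names (`active_fraction`, `weight_footprint_gb`) are illustrative assumptions, not details from MiniMax, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Back-of-the-envelope figures for a sparse (mixture-of-experts style) model.
TOTAL_PARAMS = 428e9   # ~428B total parameters, per the announcement
ACTIVE_PARAMS = 23e9   # ~23B activated parameters per token, per the announcement

# Assumption: bytes per parameter for common storage precisions. The source
# does not say which precisions are actually published for M3.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters used for any single token."""
    return active / total


def weight_footprint_gb(total: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return total * bytes_per_param / 1e9


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active share per token: {share:.1%}")
    for precision, nbytes in BYTES_PER_PARAM.items():
        size = weight_footprint_gb(TOTAL_PARAMS, nbytes)
        print(f"{precision}: about {size:,.0f} GB of weights")
```

Under these assumptions, only about one parameter in nineteen is used per token, yet every parameter still has to be held in memory, which is why serving cost remains an open question even for a sparse model.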

The Hugging Face model card describes MiniMax-M3 as a native multimodal model with a 1M-token context window. It says MiniMax Sparse Attention improves long-context efficiency, with 9x prefill and 15x decode speedups over M2 at 1M context and per-token compute reduced to 1/20. The card also points users to SGLang, vLLM, and Transformers deployment paths.

MiniMax’s official account usually publishes model, API, and agent product updates. This post matters because it changes access: researchers and builders can now inspect the weights and try the supported serving stacks. The next things to check are license constraints, real serving cost, independent long-context quality tests, and whether the coding-agent benchmark claims survive third-party evaluation. NVIDIA AI’s same-day note about a free GPU-accelerated endpoint may also broaden early testing.
