Mistral introduces Mistral Small 4, a unified open-source reasoning and multimodal model
Mistral announced Mistral Small 4 on March 16, 2026, as the first model in the Mistral Small family to combine the company's reasoning, multimodal, and agentic coding capabilities in one open release. The practical pitch is simple: developers and enterprises no longer need to switch between separate models for chat, coding, and image-aware reasoning.
According to the launch post, the model uses a Mixture of Experts architecture with 128 experts, 4 active per token, 119B total parameters, and 6B active parameters per token. Mistral also highlights a 256k context window and native support for both text and image input, which positions the model for long-document analysis, multimodal assistants, and more complex agent workflows.
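As a quick sanity check on those figures, the sparsity implied by the spec can be computed directly. This is a back-of-the-envelope sketch; the parameter and expert counts are taken from the launch post as quoted above.

```python
# Back-of-the-envelope check of the MoE activation figures from the launch post.
total_params_b = 119   # total parameters, in billions
active_params_b = 6    # parameters active per token, in billions
experts_total = 128
experts_active = 4

# Fraction of the full parameter count exercised on each forward pass.
active_fraction = active_params_b / total_params_b
print(f"~{active_fraction:.1%} of weights active per token")   # ~5.0%

# Fraction of experts routed per token (attention and any shared
# weights sit outside this expert count).
print(f"{experts_active / experts_total:.1%} of experts routed per token")  # 3.1%
```

In other words, each token touches roughly 5% of the weights, which is the mechanism behind the latency and throughput claims in the list below.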
- Released under an Apache 2.0 license
- New reasoning_effort control for trading off latency and deeper step-by-step reasoning
- Mistral claims a 40% reduction in end-to-end completion time and 3x higher request throughput versus Mistral Small 3
- Available through Mistral API, AI Studio, Hugging Face, and NVIDIA NIM on day one
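To make the reasoning_effort control concrete, here is a minimal sketch of a chat-completion request that sets it. The field name comes from the launch post, but the model identifier, accepted values, and endpoint details are assumptions for illustration, not a documented API contract.

```python
import json

# Hypothetical request body for a chat-completion call to Mistral Small 4.
# "reasoning_effort" is the control named in the launch post; the model id
# ("mistral-small-4") and the value "low" are assumed for this example.
payload = {
    "model": "mistral-small-4",
    "reasoning_effort": "low",  # favor latency over deeper step-by-step reasoning
    "messages": [
        {"role": "user", "content": "Summarize this contract clause."}
    ],
}

# Serialize as it would be sent in an HTTP POST body.
body = json.dumps(payload)
print(body)
```

The idea is that the same deployed model serves both quick, cheap completions and slower, deeper reasoning, with the tradeoff selected per request rather than by routing to a different model.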
The launch matters because open-model buyers increasingly want one deployable system that can cover multiple workloads without a complex routing layer. A long context window, multimodal input, and controllable reasoning usually mean making tradeoffs across different models. Mistral is arguing that Small 4 can collapse those tradeoffs into a single adaptable model while keeping deployment open and customizable.
Mistral also frames efficiency as a core differentiator. In its published benchmarks, the company says Small 4 with reasoning matches or surpasses GPT-OSS 120B on the cited tests while producing shorter outputs, which would translate into lower latency and reduced inference cost if it holds up in real deployments. That makes Mistral Small 4 one of the more important open-model launches of March for teams that care about reasoning, coding, and multimodal work in the same stack.
Related Articles
A Show HN post points to llm-circuit-finder, a toolkit that duplicates selected transformer layers inside GGUF models and claims sizable reasoning gains without changing weights or running fine-tuning. The strongest benchmark numbers come from the project author’s own evaluations rather than independent validation.
A well-received HN post highlighted Sarvam AI’s decision to open-source Sarvam 30B and 105B, two reasoning-focused MoE models trained in India under the IndiaAI mission. The announcement matters because it pairs open weights with concrete product deployment, inference optimization, and unusually strong Indian-language benchmarks.
OpenCode drew 1,238 points and 614 comments on Hacker News, highlighting an open-source AI coding agent that spans terminal, IDE, and desktop clients. The project site emphasizes broad provider support, LSP integration, multi-session workflows, and a privacy-first posture.