Mistral launches Leanstral, an open-source code agent for Lean 4
Original: Leanstral: Open-Source foundation for trustworthy vibe-coding View original →
Mistral says Leanstral is the first open-source code agent designed specifically for Lean 4, the proof assistant used in formal verification and theorem proving. Announced on March 16, 2026, the model is presented as infrastructure for trustworthy vibe-coding rather than a general assistant retrofitted to formal methods after the fact.
The company’s pitch is centered on efficiency and repository realism. Mistral says Leanstral runs with 6B active parameters and was trained to operate in realistic formal repositories, not only on isolated contest-style math tasks. It also said it will publish a technical report and release FLTEval, a benchmark suite meant to evaluate formal reasoning beyond competition math.
- Purpose-built for Lean 4 workflows
- Released under an Apache 2.0 license
- Available inside Mistral Vibe and through the
labs-leanstral-2603API endpoint - Downloadable weights for self-hosted use
Mistral’s blog argues that Leanstral delivers better efficiency than much larger open-source peers on FLTEval. In the published comparison, the company says Leanstral reaches competitive scores with fewer passes and continues scaling as additional passes are allowed. That matters because formal methods tooling is usually constrained not just by raw capability, but by latency, compute cost, and the need to behave predictably inside structured repositories.
The broader significance is that formal verification is moving closer to mainstream developer tooling. By combining an open license, direct product integration, and a benchmark focused on real formal workloads, Mistral is signaling that verified code generation and proof-oriented programming are becoming a practical product category, not just a research demo. For teams working with theorem proving, protocol verification, or safety-critical software, Leanstral is a notable attempt to make that stack cheaper and easier to deploy.
Related Articles
Semble is an open-source code search library for AI agents that reduces token usage by 98% compared to grep+read, while achieving 99% of transformer model quality. It runs entirely on CPU with no external dependencies and integrates directly with Claude Code, Cursor, and Codex via MCP.
Alibaba's Qwen team has released Qwen3.7-Max, an agent-focused frontier LLM. It ranks 5th on Artificial Analysis's Intelligence Index, nearly matching GPT 5.4, and is available as both an API and open weights.
Forge is a new open-source Python framework that applies structured guardrails to self-hosted LLMs. The best config — Ministral-3 8B Q8 — jumps from a 53% baseline to 86.5% on the 26-scenario eval suite, with 99% achievable on agentic tasks.