Open-Weight LLM Race (May 2026): Mistral, Poolside, Gemma, and Grok

4 articles Updated May 8, 2026 #benchmark #open-source #product-launch #api

Current state

A burst of model releases landed between late April and the first week of May 2026: Mistral Medium 3.5 (a 128B dense open-weight model unifying chat, reasoning, and coding), Poolside's Laguna XS.2 (68.2% on SWE-bench Verified, deployable on a single consumer GPU), Gemma 4 Multi-Token Prediction drafters for faster inference, and xAI's Grok 4.3, launched on the API with claimed top spots on agentic tool-calling benchmarks.

What changed recently

xAI Launches Grok 4.3 on API: Tops Agentic Tool Calling Benchmarks
Google Releases Multi-Token Prediction Drafters for Gemma 4: Up to 3x Speedup
Poolside Releases Laguna XS.2: First Open-Weight Coding Model That Runs on a Single GPU

Key tensions

Optimistic case: these releases compound. A coding model scoring 68.2% on SWE-bench Verified fits on a single consumer GPU, speculative decoding speeds up Gemma 4 inference by up to 3x, and one 128B model replaces three separate ones.

Skeptical case: questions of reliability, cost, and control remain unresolved across these releases.

Signals to watch

Momentum and new coverage around “benchmark”
Momentum and new coverage around “open-source”
Momentum and new coverage around “product-launch”

Timeline

Latest

LLM X/Twitter May 8, 2026 1 min read

xAI Launches Grok 4.3 on API: Tops Agentic Tool Calling Benchmarks

xAI has released Grok 4.3 on its API, claiming top spots on agentic tool-calling and instruction-following leaderboards, and the #1 ranking in enterprise domains such as case law and corporate finance. It supports a 1M-token context window, priced at $1.25 per million input tokens and $2.50 per million output tokens.

#xai #grok #grok-4.3
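To make the quoted Grok 4.3 prices concrete, here is a minimal cost-estimate sketch. It assumes flat per-token billing at the two rates in the summary, with no caching discounts or long-context tiers (the summary mentions none); the function name `estimate_cost_usd` is hypothetical and not part of any xAI SDK.

```python
from decimal import Decimal

# Prices quoted in the summary above, in USD per million tokens.
INPUT_USD_PER_M = Decimal("1.25")
OUTPUT_USD_PER_M = Decimal("2.50")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request, assuming flat per-token pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_USD_PER_M
             + Decimal(output_tokens) * OUTPUT_USD_PER_M)
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # A 200K-token prompt with a 4K-token reply.
    print(estimate_cost_usd(200_000, 4_000))
    # Filling the entire 1M-token context window with input.
    print(estimate_cost_usd(1_000_000, 0))
```

Under these assumptions, a prompt that fills the whole 1M-token window costs $1.25 in input tokens before any output is generated.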

Recent development

LLM Reddit May 6, 2026 1 min read

Google Releases Multi-Token Prediction Drafters for Gemma 4: Up to 3x Speedup

Google has released Multi-Token Prediction (MTP) draft models for the Gemma 4 family, achieving up to 3x inference speedup through speculative decoding without any loss in output quality.

#gemma #google #mtp
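The "no loss in output quality" claim follows from how speculative decoding works: a cheap draft model proposes several tokens, and the full model keeps only those it would have produced itself. The sketch below illustrates the generic greedy variant with toy stand-in models. It is not Google's MTP drafter implementation; `toy_target`, `toy_draft`, and `speculative_decode` are hypothetical names, and a real system verifies all drafted tokens in one batched forward pass rather than in a Python loop.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function


def greedy_decode(next_token: NextToken, prompt: List[Token], n: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n):
        out.append(next_token(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[Token],
                       max_new_tokens: int, k: int = 4) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns (tokens, verification rounds)."""
    out = list(prompt)
    final_len = len(prompt) + max_new_tokens
    rounds = 0
    while len(out) < final_len:
        rounds += 1
        # 1. The draft model cheaply proposes up to k tokens.
        budget = min(k, final_len - len(out))
        proposal: List[Token] = []
        for _ in range(budget):
            proposal.append(draft(out + proposal))
        # 2. The target checks each proposed position. Every appended token
        #    is the target's own choice, so the output matches plain greedy
        #    decoding with the target.
        for tok in proposal:
            expected = target(out)
            out.append(expected)
            if expected != tok:
                break  # first mismatch: discard the rest of the draft
    return out, rounds


def toy_target(ctx: List[Token]) -> Token:
    return (3 * ctx[-1] + 1) % 7


def toy_draft(ctx: List[Token]) -> Token:
    # Agrees with the target except after token 5, where it guesses wrong.
    return 0 if ctx[-1] == 5 else toy_target(ctx)


if __name__ == "__main__":
    tokens, rounds = speculative_decode(toy_target, toy_draft, [1], 12, k=4)
    print(tokens == greedy_decode(toy_target, [1], 12), rounds)
```

Each round stands in for one pass of the large model, so the speedup comes from needing fewer rounds than generated tokens; how close a deployment gets to the quoted 3x depends on how often the drafter agrees with the full model.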

Recent development

LLM May 5, 2026 1 min read

Poolside Releases Laguna XS.2: First Open-Weight Coding Model That Runs on a Single GPU

Poolside AI released Laguna XS.2 on April 28, 2026 under Apache 2.0. It is a mixture-of-experts (MoE) model with 33B total and 3B active parameters, purpose-built for agentic coding, that scores 68.2% on SWE-bench Verified and is deployable on a single consumer GPU.

#open-source #coding #benchmark
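A back-of-the-envelope estimate helps put the single-GPU claim in context. The sketch below computes weight storage for 33B total parameters at a few precisions. The precisions are assumptions (the summary does not say which one the single-GPU deployment uses), and the estimate ignores the KV cache, activations, and runtime overhead. It uses the total parameter count because an MoE model typically keeps all experts in memory even though only 3B parameters are active per token.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # 33B total parameters, as quoted for Laguna XS.2.
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_memory_gb(33, bits):.1f} GB")
```

The weights alone come to roughly 66 GB at 16 bits, 33 GB at 8 bits, and 16.5 GB at 4 bits, so whether the model fits on a given card depends heavily on the precision chosen.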

Recent development

LLM May 5, 2026 1 min read

Mistral Medium 3.5: A Single 128B Open-Weight Model That Replaces Three Separate Models

Released on April 29, 2026 under a Modified MIT license, Mistral Medium 3.5 consolidates the company's chat, reasoning, and coding models into one 128B dense open-weight model with a 256K context window, scoring 77.6% on SWE-bench Verified.

#mistral #open-source #benchmark
