Google launches Gemma 4 open models with Apache 2.0 licensing and up to 256K context

Original: Today, we’re launching Gemma 4, our most intelligent open models to date. Built with the same breakthrough technology as Gemini 3, Gemma 4 brings advanced reasoning to your personal hardware and devices. Here’s what Gemma 4 unlocks for developers:

  • Intelligence-per-parameter: Our 31B (Dense) and 26B (MoE) models deliver state-of-the-art performance for their size, outcompeting models 20x their size on @arena
  • Commercial flexibility: Released under a permissive Apache 2.0 license for complete developer flexibility and digital sovereignty
  • Agentic workflows: Native support for function-calling and structured JSON output allows you to build reliable, autonomous agents
  • Multimodal edge AI: The E2B and E4B models bring native vision, audio, and low latency to mobile and IoT devices
  • Long-context reasoning: Up to 256K context windows allow you to process entire repositories or large documents in a single prompt

Whether you’re building global applications in 140+ languages or local-first AI code assistants, Gemma 4 is built to be your foundation. Explore in @GoogleAIStudio or download the weights on @HuggingFace, @Kaggle, and @Ollama.

LLM · Apr 2, 2026 · By Insights AI · 2 min read

What Google launched

On April 2, 2026, Google introduced Gemma 4, describing it as its most capable open model family so far. The company says Gemma 4 is built from the same research and technology base as Gemini 3, but packaged for developers who want advanced reasoning and agentic workflows on their own hardware rather than only through hosted proprietary systems.

The launch is notable because Google is not positioning Gemma 4 as a small experimental side branch. It is explicitly presenting the family as a serious open-model platform with both edge and workstation targets, plus a commercially permissive Apache 2.0 license. That combination matters for teams that care about both model capability and deployment control.

What the Gemma 4 family includes

Google says it is releasing four sizes: E2B, E4B, 26B Mixture of Experts, and 31B Dense. The company argues the larger models deliver frontier-level performance for their size, while the smaller edge-oriented models emphasize multimodality, low latency, and practical on-device use.

  • Google says the 31B model ranks as the #3 open model on the Arena AI text leaderboard and the 26B model ranks #6, with Gemma 4 outperforming models 20x its size.
  • The company says the family supports function-calling, structured JSON output, and native system instructions for agentic workflows; a minimal sketch of what that could look like follows this list.
  • Context windows reach 128K on the edge models and up to 256K on the larger models.
  • Google says Gemma 4 is natively trained on 140+ languages.
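To make the agentic-workflow claims concrete, here is a minimal sketch of what function-calling against a locally served Gemma model could look like through the Ollama Python client. The gemma4 model tag, the get_weather tool, and the conversation flow are illustrative assumptions, not details confirmed by the announcement.

```python
# Hedged sketch: tool use with a locally served Gemma model via the Ollama Python client.
# "gemma4" is a hypothetical model tag; substitute whatever tag Ollama actually publishes.
import json

import ollama


def get_weather(city: str) -> str:
    """Toy tool the model is allowed to call."""
    return json.dumps({"city": city, "forecast": "sunny", "temp_c": 22})


tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Seoul right now?"}]
response = ollama.chat(model="gemma4", messages=messages, tools=tools)

# If the model emitted a tool call, run the tool and feed the result back for a final answer.
for call in response.message.tool_calls or []:
    if call.function.name == "get_weather":
        result = get_weather(**call.function.arguments)
        messages += [response.message, {"role": "tool", "content": result}]
        final = ollama.chat(model="gemma4", messages=messages)
        print(final.message.content)
```

The same chat call also accepts a format argument (for example format="json" or a JSON schema) for constrained output, which is one way to exercise the structured JSON output feature listed above.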

Why the release matters beyond benchmarks

Google also attached ecosystem and adoption signals to the release. The company said developers have downloaded previous Gemma models over 400 million times and built more than 100,000 variants in the broader Gemmaverse. By tying Gemma 4 to that distribution base, Google is signaling that this is not just a research checkpoint. It is trying to reinforce Gemma as a durable open-model stack with day-one support across AI Studio, Hugging Face, Ollama, NVIDIA NIM, llama.cpp, vLLM, and other popular tooling.
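As a rough illustration of what that day-one support could mean in practice, the sketch below loads a checkpoint with vLLM's offline Python API and feeds it a repository-sized prompt, touching the long-context claim from the list above. The Hugging Face model id google/gemma-4-31b and the 262,144-token max_model_len are assumptions inferred from the announcement, not confirmed release details.

```python
# Hedged sketch: long-context offline inference with vLLM.
# "google/gemma-4-31b" is a hypothetical model id; use the id actually published on Hugging Face.
from pathlib import Path

from vllm import LLM, SamplingParams

llm = LLM(
    model="google/gemma-4-31b",  # assumed id, not confirmed by the announcement
    max_model_len=262144,        # up to 256K tokens per the launch claims, hardware permitting
)

# Pack a small repository into a single prompt to exercise the long context window.
repo_files = sorted(Path("my_project").rglob("*.py"))
repo_dump = "\n\n".join(f"# {path}\n{path.read_text()}" for path in repo_files)

params = SamplingParams(temperature=0.2, max_tokens=512)
outputs = llm.generate(
    [f"Summarize the architecture of this codebase:\n\n{repo_dump}"],
    params,
)
print(outputs[0].outputs[0].text)
```

Running a 31B-class model at the full 256K window takes substantial accelerator memory, so in practice teams may lower max_model_len or quantize the weights; the sketch only shows the shape of the workflow.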

A reasonable inference from the launch is that Google wants to compete more aggressively in the layer between closed frontier APIs and fully self-managed local deployment. Gemma 4 is being marketed as something developers can tune, ship, and run across phones, laptops, workstations, and accelerators while still getting modern agent features like tool use and long context. That is a stronger statement than simply releasing another small open model for hobbyist experimentation.

Where the real test will be

The clear caveat is that the strongest performance framing comes from Google’s own launch materials and leaderboard references. Real-world developer adoption will depend on how well Gemma 4 performs across downstream tasks, hardware budgets, and local-serving stacks. Even so, the release is high-signal because it combines an open license, serious model sizes, long context, agentic workflow features, and broad deployment flexibility in one coordinated launch. For teams deciding whether open models can handle more production-grade work, that package is difficult to ignore.

Sources: Google AI X post · Google blog
