Google Releases Gemini 3.1 Pro: 77.1% ARC-AGI-2, Doubled Reasoning Performance
Google DeepMind Launches Gemini 3.1 Pro
Google DeepMind officially launched Gemini 3.1 Pro on February 19, 2026, marking a significant advancement in multimodal reasoning and agentic AI capabilities. The model is the latest in the Gemini 3 series.
Benchmark Results
The model achieves 77.1% on ARC-AGI-2—more than doubling its predecessor Gemini 3 Pro's score of 31.1%. On SWE-Bench Verified, it scores 80.6%, and on GPQA Diamond (graduate-level science reasoning), it reaches 94.3%. Terminal-Bench 2.0 performance stands at 68.5%, and Humanity's Last Exam (with tools) scores 51.4%.
Technical Specifications
The model supports a 1 million token input context window, enabling processing of over 1,500 pages of text or entire code repositories in a single prompt. Maximum output is 65,536 tokens. Gemini 3.1 Pro natively handles text, images, audio, video, and code in unified multimodal workflows.
Key Improvements
A new granular thinking mode lets users select reasoning depth from minimal to high, balancing speed and accuracy. Hallucination rates dropped significantly from 88% to 50% on the AA-Omniscience benchmark. Agentic reliability has been improved for autonomous multi-step workflows.
Pricing and Access
Pricing remains unchanged: $2/1M tokens (input up to 200K), $4/1M (input beyond 200K), $12/1M (output up to 200K), $18/1M (output beyond 200K). Available through Google AI Studio, Vertex AI, Gemini API, NotebookLM, and Microsoft Foundry. Deep Think mode is available to Google AI Ultra subscribers.
Related Articles
Google's Gemini 3.1 Pro achieves 77.1% on ARC-AGI-2—more than doubling the previous Gemini 3 Pro's score. The mid-cycle upgrade brings Deep Think-level reasoning capabilities to all users and developers.
Google AI Developers has released Android Bench, an official leaderboard for LLMs on Android development tasks. In the first results, Gemini 3.1 Pro ranks first, and Google is also publishing the benchmark, dataset, and test harness.
Google DeepMind said on March 3, 2026 that Gemini 3.1 Flash-Lite delivers faster performance at a lower price than Gemini 2.5 Flash. Google is rolling the model out in preview via Google AI Studio and Vertex AI for high-volume, latency-sensitive workloads.
Comments (0)
No comments yet. Be the first to comment!