Gemini 3 Deep Think Expands From Benchmarks to Science and Engineering Workflows
Original: Gemini 3 Deep Think
What Google announced
On February 12, 2026, Google published a major update to Gemini 3 Deep Think, its specialized reasoning mode. The company positions this release as a model update aimed at harder science, research, and engineering tasks rather than only consumer chatbot use. According to the post, the rollout starts in the Gemini app for Google AI Ultra subscribers, and selected organizations can request early access through the Gemini API program.
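For readers curious what Gemini API access looks like in practice, here is a minimal sketch of building a `generateContent`-style REST request. The request-body shape (`contents` → `parts` → `text`) follows the public Gemini API convention, but the model identifier `gemini-3-deep-think` is an assumption: the post does not name an API model string for Deep Think.

```python
import json

# v1beta generateContent endpoint pattern used by the Gemini REST API.
API_URL_TEMPLATE = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "{model}:generateContent?key={api_key}"
)


def build_request(model: str, prompt: str) -> tuple[str, str]:
    """Return (url, json_body) for a generateContent call.

    The model id passed in is hypothetical; substitute whatever
    identifier the early-access program actually exposes.
    """
    url = API_URL_TEMPLATE.format(model=model, api_key="YOUR_API_KEY")
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
    return url, body


url, body = build_request(
    "gemini-3-deep-think",  # assumed model name, not confirmed by the post
    "Check this proof for logical gaps.",
)
print(body)
```

Actually sending the request (e.g. via `urllib.request` or the `google-genai` SDK) would require a valid API key and early-access entitlement, which is the gate the post describes.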
Claimed benchmark gains
Google reported several headline numbers for the updated Deep Think mode: 48.4% on Humanity’s Last Exam (without tools), 84.6% on ARC-AGI-2 (stated as verified by the ARC Prize Foundation), Codeforces Elo 3455, and gold-medal-level performance on International Math Olympiad 2025 tasks. The article also claims stronger science performance, including gold-medal-level results on written sections of the 2025 International Physics Olympiad and Chemistry Olympiad, plus 50.5% on CMT-Benchmark for theoretical physics.
Early tester examples
The post includes concrete pilot stories: Rutgers mathematician Lisa Carbone reportedly used Deep Think to identify a subtle logical flaw in a technical math paper; Duke University’s Wang Lab used it to design crystal-growth recipes and reached thin-film targets above 100 μm; and a Google hardware R&D lead tested the system on physical component design tasks. These are all vendor-reported examples, but they show where Google is trying to position Deep Think: a mix of scientific reasoning and practical engineering output.
Why the HN community reacted strongly
The Hacker News thread for this item had passed one thousand points and hundreds of comments at crawl time, with discussion centering on reproducibility, benchmark validity, and access controls for advanced reasoning models. The core takeaway is that Google is now pairing benchmark signaling with workflow distribution through app and API channels, a combination likely to shape competition for enterprise and research adoption in 2026.
Related Articles
Google AI Developers announced that Gemini 3.1 Flash-Lite is rolling out in preview via the Gemini API and Google AI Studio. The post positions it as the fastest and most cost-efficient model in the Gemini 3 line, now adding dynamic thinking for task-adaptive reasoning.
Why it matters: this is one of the first external benchmark reads to land right after the GPT-5.5 launch. Artificial Analysis said GPT-5.5 pulled 3 points ahead on its Intelligence Index, though running the full index became roughly 20% more expensive.
Sakana AI is trying to sell orchestration itself as a model product, not just a prompt hack around other APIs. In its beta table, fugu-ultra posts 54.2 on SWEPro and 95.1 on GPQAD while shipping behind an OpenAI-compatible API.