Google DeepMind Extends Gemini Deep Think into Scientific Research Workflows

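In a research article dated February 11, 2026, Google DeepMind said that Gemini Deep Think is moving beyond success on Olympiad-level benchmarks and beginning to enter professional research workflows. The article opens by stating that, under the guidance of expert mathematicians and scientists, Gemini Deep Think is solving research problems in mathematics, physics, and computer science. This marks a stage that is less a demonstration of abstract reasoning ability than an attempt to tackle real bottlenecks in scientific research.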

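As background, Google DeepMind recounts that in summer 2025 an advanced version of Gemini Deep Think achieved gold-medal standard at the International Mathematical Olympiad, and later obtained a result at the same level at the International Collegiate Programming Contest. The new article describes the next stage: drawing on two recently published papers, the work extends to more open-ended problems in science, engineering, and enterprise workflows.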

Aletheia and the verification-first research loop

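Especially important is Aletheia, a math research agent built on Gemini Deep Think. According to Google DeepMind, Aletheia uses a natural-language verifier to find flaws in candidate solutions, and it cycles through generation, revision, and re-verification. More important still, it can admit failure when it cannot solve a problem. Because avoiding false confidence is itself highly valuable in research, this design choice has real practical significance. In addition, Aletheia uses Google Search and web browsing, and is designed to reduce spurious citations and computational inaccuracies during literature review.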
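The generate, verify, revise cycle described above can be illustrated with a minimal Python sketch. This is not Aletheia's implementation, which the article does not detail. The names `research_loop`, `Outcome`, `max_rounds`, and the toy `generate`/`verify`/`revise` callables are all hypothetical stand-ins for a model and a natural-language verifier. The only behavior taken from the article is the shape of the loop and the fact that it reports failure instead of returning an unverified answer.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Outcome:
    solved: bool             # True only if the verifier found no flaws
    solution: Optional[str]  # None when the loop gives up
    rounds: int              # verification rounds actually used


def research_loop(
    problem: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], List[str]],
    revise: Callable[[str, str, List[str]], str],
    max_rounds: int = 3,     # hypothetical budget, not from the article
) -> Outcome:
    """Generate a candidate, then alternate verification and revision."""
    candidate = generate(problem)
    for round_no in range(1, max_rounds + 1):
        flaws = verify(problem, candidate)
        if not flaws:
            # The verifier accepts the candidate: report success.
            return Outcome(True, candidate, round_no)
        if round_no < max_rounds:
            # Feed the verifier's objections back into a revision step.
            candidate = revise(problem, candidate, flaws)
    # Budget exhausted: admit failure rather than return an unverified answer.
    return Outcome(False, None, max_rounds)


# Toy stand-ins (assumptions for illustration only).
def toy_generate(problem: str) -> str:
    return "2 + 2 = 5"


def toy_verify(problem: str, candidate: str) -> List[str]:
    return ["arithmetic error"] if candidate.endswith("5") else []


def toy_revise(problem: str, candidate: str, flaws: List[str]) -> str:
    return candidate[:-1] + "4"


result = research_loop("What is 2 + 2?", toy_generate, toy_verify, toy_revise)
print(result)
```

Returning an explicit failed `Outcome` with no solution keeps a caller from mistaking an unverified draft for a result, which is the property the article singles out.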

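On performance, Google DeepMind reports that the January 2026 version of Gemini Deep Think reached up to 90% on IMO-ProofBench Advanced as inference-time compute increased. It adds that Aletheia showed higher reasoning quality at lower inference-time compute than the base model alone. The article states explicitly that these results were graded by human experts, so they do not rely on automated evaluation alone.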

Range of applications seen across 18 research problems

The article says the team collaborated with experts on 18 research problems spanning algorithms, machine learning and combinatorial optimization, information theory, economics, and other areas. One concrete example comes from physics: to handle singularities that arise when calculating gravitational radiation from cosmic strings, Gemini used Gegenbauer polynomials to collapse an infinite series into a closed-form finite sum. According to Google DeepMind, roughly half of the results are aimed at strong conferences, including an ICLR '26 acceptance.

The broader direction this announcement points to is that, by combining agentic reasoning with human verification, a general foundation model is becoming a practical collaborator in scientific workflows. Google DeepMind says Gemini can act as a "force multiplier": it handles knowledge retrieval and rigorous verification so that human researchers can concentrate more on conceptual depth and creative direction. This is not fully autonomous science, but it is a noteworthy, concrete step in moving AI from a generic chat interface toward a disciplined research companion.
