Google DeepMind's Aletheia Autonomously Solves 6 Research-Level Math Problems

Beyond Math Competitions

Google DeepMind's Aletheia AI agent is tackling genuine open problems in mathematics research, not just competition problems. A Reddit post in r/singularity (score: 291) highlighting the achievement sparked significant discussion about whether AI is approaching real mathematical research capability.

Key Achievements

FirstProof Challenge: Aletheia autonomously solved 6 of the 10 open research-level math problems, according to majority expert assessment
Bloom's Erdős Conjectures: In a semi-autonomous evaluation of 700 open problems from this collection, Aletheia solved 4 open questions
Autonomous research paper: Generated a fully AI-authored paper calculating eigenweight structure constants in arithmetic geometry

How Aletheia Works

Aletheia is built on Gemini Deep Think and uses a three-part agentic harness: a Generator that proposes candidate solutions, a Verifier that checks them for flaws, and a Reviser that corrects the errors the Verifier finds. Performance improves as more inference-time compute is spent. Gemini Deep Think, which reached IMO gold-medal level in July 2025, now scores up to 90% on IMO-ProofBench Advanced.
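To make the generate, verify, revise cycle concrete, here is a minimal Python sketch of such a loop. It is an illustration only, not DeepMind's implementation: the names `solve`, `Verdict`, `generate`, `verify`, `revise`, and `max_rounds` are all hypothetical, and the three roles are played by toy functions searching for an integer square root rather than by a language model. The `max_rounds` budget stands in, very loosely, for inference-time compute.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Result of one verification pass (hypothetical structure)."""
    ok: bool
    feedback: str = ""


def solve(
    problem: int,
    generate: Callable[[int], int],
    verify: Callable[[int, int], Verdict],
    revise: Callable[[int, int, str], int],
    max_rounds: int = 5,
) -> Optional[int]:
    """Propose a candidate, then alternate verify/revise for up to max_rounds checks."""
    candidate = generate(problem)
    for _ in range(max_rounds):
        verdict = verify(problem, candidate)
        if verdict.ok:
            return candidate
        candidate = revise(problem, candidate, verdict.feedback)
    return None  # budget exhausted without a verified answer


# Toy stand-ins for the three roles: find x such that x * x == n.
def generate(n: int) -> int:
    # Crude first guess; a real generator would be a model call.
    return max(1, n // 2)


def verify(n: int, x: int) -> Verdict:
    # Exact check, plus a short note on what went wrong.
    if x * x == n:
        return Verdict(True)
    return Verdict(False, "too high" if x * x > n else "too low")


def revise(n: int, x: int, feedback: str) -> int:
    # Integer Newton step; this toy reviser ignores the feedback text.
    return (x + n // x) // 2


if __name__ == "__main__":
    print(solve(49, generate, verify, revise))                # 7
    print(solve(49, generate, verify, revise, max_rounds=2))  # None
    print(solve(50, generate, verify, revise))                # None
```

With the default budget the loop reaches 7 for the input 49, while a budget of two rounds runs out first and returns None. The input 50 has no integer square root, so the loop never produces a verified answer, a small reminder that a verifier only helps when a correct candidate is reachable.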

Mathematical Community Recognition

Fields Medalist Terence Tao and other leading mathematicians have recognized the significance of these results, describing Aletheia as a 'valuable research collaborator.' While Aletheia still struggles with many problems, the successes represent a qualitative leap in AI-assisted research.

AI Mar 19, 2026 2 min read

Google DeepMind proposes a cognitive framework for measuring AGI progress

Google DeepMind said on March 17, 2026 that it has published a new cognitive-science framework for evaluating progress toward AGI and launched a Kaggle hackathon to turn that framework into practical benchmarks. The proposal defines 10 cognitive abilities, recommends comparison against human baselines, and puts $200,000 behind community-built evaluations.

#google-deepmind #agi #evaluation

AI X/Twitter Mar 26, 2026 2 min read

Google DeepMind releases a real-world toolkit to measure harmful AI manipulation

Google DeepMind said on March 26, 2026 that it is releasing research on how conversational AI might exploit emotions or manipulate people into harmful choices. The company says it built the first empirically validated toolkit to measure harmful AI manipulation, based on nine studies with more than 10,000 participants across the UK, the US, and India.

#google-deepmind #ai-safety #manipulation


AI Reddit Feb 23, 2026 1 min read

Demis Hassabis Proposes Definitive AGI Test: Could AI Discover General Relativity?

DeepMind CEO Demis Hassabis proposed a concrete AGI benchmark: train an AI with a knowledge cutoff of 1911, then see if it can independently derive general relativity as Einstein did in 1915. This test targets genuine scientific discovery rather than pattern matching.

#agi #deepmind #hassabis
