Google DeepMind's Aletheia Autonomously Solves 6 Research-Level Math Problems

Beyond Math Competitions

Google DeepMind's Aletheia AI agent has shown it can tackle genuine open problems in mathematics research, not just competition problems. A Reddit post in r/singularity (score: 291) highlighting the achievement sparked significant discussion about whether AI is approaching real mathematical research capability.

Key Achievements

FirstProof Challenge: Aletheia autonomously solved 6 out of 10 open research-level math problems according to majority expert assessment
Bloom's Erdős Conjectures: In a semi-autonomous evaluation of 700 open problems, Aletheia solved 4 open questions
Autonomous research paper: Generated a fully AI-authored paper calculating eigenweight structure constants in arithmetic geometry

How Aletheia Works

Aletheia is built on Gemini Deep Think and uses a three-part agentic harness: a Generator that proposes candidate solutions, a Verifier that checks them for flaws, and a Reviser that corrects the errors the Verifier finds. The approach improves with more inference-time compute: Gemini Deep Think now scores up to 90% on IMO-ProofBench Advanced, having reached IMO gold-medal level in July 2025.
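
The generate, verify, revise loop described above can be illustrated with a minimal sketch. This is not DeepMind's implementation: the names `solve`, `Verdict`, `toy_generate`, `toy_verify`, and `toy_revise` are hypothetical, and the "problem" is a toy (finding an integer square root) standing in for a proof search. The `max_rounds` parameter is an assumed stand-in for the inference-time compute budget.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Result of one verification pass (hypothetical structure)."""
    ok: bool
    feedback: str = ""


def solve(problem: int,
          generate: Callable[[int], int],
          verify: Callable[[int, int], Verdict],
          revise: Callable[[int, int, str], int],
          max_rounds: int) -> Optional[int]:
    """Return a candidate that passes verification, or None if the budget runs out."""
    candidate = generate(problem)              # Generator: propose a first candidate
    for _ in range(max_rounds):
        verdict = verify(problem, candidate)   # Verifier: look for flaws
        if verdict.ok:
            return candidate
        # Reviser: repair the candidate using the verifier's feedback
        candidate = revise(problem, candidate, verdict.feedback)
    return None


# Toy stand-ins: find an integer g such that g * g == n.
def toy_generate(n: int) -> int:
    return 1  # deliberately naive first guess


def toy_verify(n: int, guess: int) -> Verdict:
    if guess * guess == n:
        return Verdict(ok=True)
    return Verdict(ok=False, feedback="too low" if guess * guess < n else "too high")


def toy_revise(n: int, guess: int, feedback: str) -> int:
    return guess + 1 if feedback == "too low" else guess - 1


if __name__ == "__main__":
    print(solve(49, toy_generate, toy_verify, toy_revise, max_rounds=5))   # None
    print(solve(49, toy_generate, toy_verify, toy_revise, max_rounds=10))  # 7
```

In this toy, the same loop fails with a budget of 5 rounds and succeeds with 10, which mirrors in miniature the claim that the harness benefits from more inference-time compute. In a real system each of the three roles would presumably be a model call rather than a one-line function.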

Mathematical Community Recognition

Fields Medalist Terence Tao and other leading mathematicians have recognized the significance of these results, describing Aletheia as a 'valuable research collaborator.' While Aletheia still struggles with many problems, the successes represent a qualitative leap in AI-assisted research.
