Google DeepMind AI Co-Mathematician Breaks Five Ramsey Number Records That Stood for Up to 20 Years
Google DeepMind has published its AI Co-Mathematician system, a multi-agent framework built on Gemini that improved lower bounds on five Ramsey numbers, some of which had stood for up to 20 years, and set a new high score on the FrontierMath Tier 4 benchmark.
FrontierMath Tier 4: 48%, a New Record
The AI Co-Mathematician scored 48% on FrontierMath Tier 4, a benchmark of research-level mathematical problems designed to remain beyond AI capabilities for decades. That score is higher than the result of any previously published AI system on the benchmark.
Five Ramsey Number Records Broken
AlphaEvolve improved lower bounds for five classical Ramsey numbers in a single run:
- R(3,13): 60 → 61 (previous record held for 11 years)
- R(3,18): 99 → 100 (previous record held for 20 years)
- R(4,13): 138 → 139
- R(4,14): 147 → 148
- R(4,15): 158 → 159
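To make concrete what a Ramsey lower bound is: showing R(s,t) > n comes down to exhibiting one graph on n vertices that contains neither a clique of size s nor an independent set of size t. The sketch below checks such a witness by brute force on a textbook case, the 5-cycle, which shows R(3,3) > 5. The function names are illustrative and not taken from the paper, and exhaustive checking like this is only practical for very small graphs, not for the record-sized cases listed above.

```python
from itertools import combinations


def has_clique(adj, k):
    """Return True if the graph contains k mutually adjacent vertices."""
    n = len(adj)
    return any(
        all(adj[u][v] for u, v in combinations(group, 2))
        for group in combinations(range(n), k)
    )


def complement(adj):
    """Return the complement graph (edges become non-edges and vice versa)."""
    n = len(adj)
    return [[i != j and not adj[i][j] for j in range(n)] for i in range(n)]


def is_ramsey_witness(adj, s, t):
    """True if the graph has no s-clique and no independent set of size t.

    An independent set in a graph is a clique in its complement.
    """
    return not has_clique(adj, s) and not has_clique(complement(adj), t)


def cycle_graph(n):
    """Adjacency matrix of the cycle on n vertices (hypothetical helper)."""
    adj = [[False] * n for _ in range(n)]
    for i in range(n):
        j = (i + 1) % n
        adj[i][j] = adj[j][i] = True
    return adj


# The 5-cycle has no triangle and no independent set of size 3,
# so it certifies R(3,3) > 5.
print(is_ramsey_witness(cycle_graph(5), 3, 3))
# The 6-cycle fails: vertices 0, 2, 4 form an independent set of size 3.
print(is_ramsey_witness(cycle_graph(6), 3, 3))
```

A new lower bound is established once a single valid witness is found, which is why a search procedure that only has to produce one good construction suits this kind of problem.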
How It Works
Unlike standard LLM-based math solvers, AlphaEvolve generates and iteratively improves algorithms using an evolutionary search strategy. It maintains a population of candidate programs and uses language models to mutate the most promising solutions. Across more than 50 open mathematical problems, it rediscovered state-of-the-art solutions 75% of the time and improved on them in 20% of cases. The paper is available at arXiv:2605.06651.
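The population-and-mutation loop described above can be illustrated with a deliberately small toy. This is a sketch under strong assumptions, not AlphaEvolve's implementation: candidates here are two-colorings of the edges of a small complete graph rather than programs, the `mutate` function flips a random edge as a stand-in for the language-model mutation step, and the names, population size, and selection rule are all invented for illustration.

```python
import random
from itertools import combinations


def score(coloring, n):
    """Count monochromatic triangles in a 2-coloring of K_n (lower is better)."""
    bad = 0
    for a, b, c in combinations(range(n), 3):
        if coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]:
            bad += 1
    return bad


def mutate(coloring, rng):
    """Stand-in for the language-model mutation: flip one random edge color."""
    child = dict(coloring)
    edge = rng.choice(sorted(child))
    child[edge] = 1 - child[edge]
    return child


def evolve(n=5, population_size=20, generations=200, seed=0):
    """Keep the best quarter of the population, refill it with mutated copies."""
    rng = random.Random(seed)
    edges = list(combinations(range(n), 2))
    population = [
        {e: rng.randint(0, 1) for e in edges} for _ in range(population_size)
    ]
    for _ in range(generations):
        population.sort(key=lambda c: score(c, n))
        if score(population[0], n) == 0:
            break  # a coloring with no monochromatic triangle was found
        survivors = population[: population_size // 4]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return min(population, key=lambda c: score(c, n))


best = evolve()
print("monochromatic triangles in best candidate:", score(best, 5))
```

Because the survivors are carried over unchanged each generation, the best score in the population never gets worse. In the system the article describes, the candidates are programs and the mutation is proposed by a language model rather than a random flip.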