A high-scoring r/singularity post pointed readers to Donald Knuth’s note <em>Claude’s Cycles</em>, where he says Claude Opus 4.6 helped solve an open combinatorics problem that arose while he was preparing a future TAOCP volume.
Google DeepMind said on February 11, 2026 that Gemini Deep Think is now helping tackle professional problems in mathematics, physics, and computer science under expert supervision. The company tied the claim to two fresh papers, a research agent called Aletheia, and examples ranging from autonomous math results to work on algorithms, optimization, economics, and cosmic-string physics.
Google DeepMind's Aletheia AI research agent solved 6 of the 10 open research-level math problems in the FirstProof Challenge, as judged by expert mathematicians. The system also generated a fully autonomous research paper and solved 4 open conjectures from Bloom's Erdős database.
Anthropic's Claude Opus 4.6 independently solved a directed Hamiltonian cycle decomposition problem that computer science legend Donald Knuth had spent weeks working on. Knuth documented the achievement in a formal Stanford paper, marking one of the first times a top-tier computer scientist has formally credited an LLM with solving a genuine research problem.
A Hacker News thread highlighted arXiv 2602.10177, where DeepMind researchers introduce Aletheia, an agent workflow for mathematics research. The paper claims progress from Olympiad-style reasoning toward PhD-level tasks and semi-autonomous open-problem exploration.
Google DeepMind published new results on February 11, 2026 showing Gemini Deep Think workflows for mathematics, physics, and computer science research. The post outlines two new papers, evaluation benchmarks, and agent-assisted verification methods.
Leading mathematicians launched 'First Proof,' an exam testing AI on unpublished problems. It's academia's skeptical response to AI companies' inflated claims about mathematical breakthroughs.
Mathematicians have launched a new proof challenge for AI systems that tests whether AI can not only provide answers but also clearly demonstrate the proof process.