Mathematicians Challenge AI: Show Us Your Proof Work
The Challenge
Leading mathematicians have launched an unprecedented exam called "First Proof" to test whether artificial intelligence can solve genuine, previously unsolved mathematical problems. The initiative addresses growing concerns about unverified claims from AI companies regarding mathematical breakthroughs.
Why This Matters
The mathematical community has grown skeptical of recent claims of AI achievements, so the exam is designed to rule out memorization. Andrew Sutherland of MIT noted: "These are brand-new problems that cannot be found in any LLM's training data." Because the problems have never appeared anywhere, an AI cannot simply retrieve existing solutions from its training materials.
Past AI accomplishments have raised red flags. One startup's celebrated proof turned out to be an existing result turned up by a literature search and misrepresented as new. In addition, most published papers on AI mathematics come from the companies building the AI systems themselves, which looks more like self-promotion than independent verification.
The Test Structure
Eleven mathematical experts, including a Fields Medal winner, contributed unsolved problems from their research. The exam focuses on "lemmas"—small theorems that mathematicians prove while working toward larger results—which represent more realistic applications of AI in daily mathematical work.
Crucially, encrypted proofs were submitted beforehand, with decryption scheduled for February 13, ensuring answers cannot be fabricated after the fact. The participating AI systems have one week to solve these problems.
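The precommitment step can be illustrated with a simple commit-and-reveal pattern. The sketch below is a hypothetical Python illustration using a hash-based commitment; the actual First Proof setup reportedly relies on encrypted proof files with a later decryption date, and the function names here are illustrative only.

```python
import hashlib
import secrets

def commit(proof_text: str) -> tuple[str, str]:
    """Commit to a proof before the deadline by publishing only its hash.

    Returns (commitment, nonce); the random nonce prevents anyone from
    guessing the committed text from the hash alone.
    """
    nonce = secrets.token_hex(16)
    digest = hashlib.sha256((nonce + proof_text).encode()).hexdigest()
    return digest, nonce

def reveal(proof_text: str, nonce: str, commitment: str) -> bool:
    """On the reveal date, anyone can check that the published proof
    matches the earlier commitment, so it cannot have been written
    after the fact."""
    digest = hashlib.sha256((nonce + proof_text).encode()).hexdigest()
    return digest == commitment

# Illustrative usage: commit before the challenge starts...
commitment, nonce = commit("Lemma 3.2: ... full proof text ...")
# ...then publish the proof and nonce on the scheduled reveal date.
assert reveal("Lemma 3.2: ... full proof text ...", nonce, commitment)
```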
Future Potential
Rather than expecting AI to crack landmark open problems, mathematicians see its near-term value in speeding up the tedious parts of research, potentially making mathematical investigation more efficient across the field.
Related Articles
Google DeepMind's Aletheia AI research agent solved 6 out of 10 open research-level math problems in the FirstProof Challenge as judged by expert mathematicians. The system also generated a fully autonomous research paper and solved 4 open conjectures from Bloom's Erdős database.
Anthropic said Claude Opus 4.6 found 22 Firefox vulnerabilities during a two-week collaboration with Mozilla. Mozilla classified 14 as high severity and shipped fixes in Firefox 148.0.
A new paper discussed in r/MachineLearning argues that unofficial model-access providers can quietly substitute models and distort both research and production results.