Claude Opus 4.6 Solves Don Knuth's Open Math Problem
Original: Claude's Cycles: Claude Opus 4.6 solves a problem posed by Don Knuth
AI Collaborates with a Computer Science Icon
Donald Knuth, Stanford professor emeritus and author of The Art of Computer Programming, published a formal paper on March 2, 2026, documenting how Anthropic's Claude Opus 4.6 solved an open mathematical problem he had been working on for weeks.
The Problem
While writing a future volume of TAOCP, Knuth encountered a problem about decomposing a digraph with m³ vertices into directed Hamiltonian cycles. His friend Filip Stappers had found solutions empirically for 4 ≤ m ≤ 16, which strongly suggested that a solution exists for every m, but no general construction or proof had been established.
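To make the problem statement concrete, here is a minimal Python sketch of what checking a candidate decomposition could look like. The article does not define the digraph, so the arc rule below is an assumption: vertices are triples (i, j, k) with coordinates taken modulo m, and each vertex has three outgoing arcs, one per coordinate, each increasing that coordinate by 1. The function names (`is_hamiltonian_cycle`, `bump`, `is_decomposition`) are illustrative only and do not come from Knuth's paper, and the sketch only verifies a proposed answer; it does not reproduce Claude's construction.

```python
from itertools import product


def is_hamiltonian_cycle(successor, vertices):
    """True if following `successor` from the first vertex visits every
    vertex exactly once and then returns to the start."""
    vertices = list(vertices)
    start = vertices[0]
    seen = set()
    v = start
    while v not in seen:
        seen.add(v)
        v = successor(v)
    return v == start and len(seen) == len(vertices)


def bump(v, axis, m):
    """Follow the arc that adds 1 (mod m) to coordinate `axis` of v.
    This arc rule is an assumption, not taken from the article."""
    w = list(v)
    w[axis] = (w[axis] + 1) % m
    return tuple(w)


def is_decomposition(direction, m):
    """`direction(v)` returns a permutation (d0, d1, d2) of (0, 1, 2):
    cycle c leaves vertex v along axis d_c. Because every vertex hands
    each outgoing arc to exactly one cycle, the arcs are partitioned."""
    vertices = list(product(range(m), repeat=3))
    if any(sorted(direction(v)) != [0, 1, 2] for v in vertices):
        return False
    return all(
        is_hamiltonian_cycle(lambda v, c=c: bump(v, direction(v)[c], m), vertices)
        for c in range(3)
    )


# The naive rule "cycle c always moves along axis c" fails: each walk
# closes after m steps instead of visiting all m**3 vertices.
print(is_decomposition(lambda v: (0, 1, 2), 3))  # False
```

The failing naive rule shows why the problem is not trivial: a valid answer has to vary the choice of axis from vertex to vertex so that all three walks close only after m³ steps.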
Claude's Solution
Knuth presented the problem to Claude Opus 4.6. The model reasoned about where a solution was likely to be found and then produced one. As Knuth notes in the paper, Claude's approach was just one of hundreds of valid solutions, yet the model arrived at it through logical reasoning rather than brute force.
Why It Matters
The paper, drafted February 28 and revised March 2, 2026, is available on Knuth's Stanford faculty page. It represents one of the clearest endorsements yet of an LLM's mathematical reasoning capability by a leading figure in computer science. A Hacker News post about the paper reached a score of 188 and sparked debate about whether results like this constitute true mathematical understanding or sophisticated pattern matching.
Either way, Knuth's decision to formally document Claude's contribution signals that leading researchers are beginning to treat AI as a legitimate research collaborator.