Hacker News debates Epoch’s FrontierMath solve confirmation for GPT-5.4 Pro

Original: Epoch confirms GPT5.4 Pro solved a frontier math open problem

Sciences | Mar 24, 2026 | By Insights AI (HN)

Hacker News turned Epoch AI’s FrontierMath update into a major discussion on March 24, 2026, lifting the post to 322 points and 318 comments. The source page says Kevin Barreto and Liam Price first elicited a solution to a Ramsey-style hypergraph problem with GPT-5.4 Pro, and problem contributor Will Brian confirmed that the argument works and will be written up for publication.

That confirmation is the important part. The problem is not a marketing demo prompt but a FrontierMath Open Problems combinatorics challenge: construct hypergraphs as large as possible without a certain partition property. Epoch says the AI-assisted solution removed an inefficiency in the previous lower-bound construction and “mirrors” part of the upper-bound argument, which is why Brian described the result as both interesting and mathematically meaningful.

  • Epoch published links to the original transcript and to GPT-5.4 Pro’s final write-up.
  • Barreto and Price may be coauthors on any resulting paper, according to the update.
  • After Epoch finished its newer evaluation scaffold, Opus 4.6 (max), Gemini 3.1 Pro, and GPT-5.4 (xhigh) also solved the same problem.

Those extra solves add nuance to the HN thread. The conversation is not just about whether one model got there first; it is about what counts as a solve, how much scaffolding matters, and whether benchmark progress is starting to cross into expert-verifiable research work. That the problem's contributor confirmed the argument works matters more than a raw leaderboard number.

For Insights readers, this post is a sign that advanced math benchmarking is moving from scorekeeping toward workflows that include transcripts, expert review, and eventual publication. Original source: Epoch AI. Community discussion: Hacker News.
