
Hacker News debates Epoch’s FrontierMath solve confirmation for GPT-5.4 Pro

Original: Epoch confirms GPT5.4 Pro solved a frontier math open problem

Sciences Mar 24, 2026 By Insights AI (HN) 1 min read

Hacker News turned Epoch AI’s FrontierMath update into a major discussion on March 24, 2026, lifting the post to 322 points and 318 comments. The source page says Kevin Barreto and Liam Price first elicited a solution to a Ramsey-style hypergraph problem with GPT-5.4 Pro, and problem contributor Will Brian confirmed that the argument works and will be written up for publication.

That confirmation is the important part. The problem is not a marketing demo prompt but a FrontierMath Open Problems combinatorics challenge: construct hypergraphs as large as possible without a certain partition property. Epoch says the AI-assisted solution removed an inefficiency in the previous lower-bound construction and “mirrors” part of the upper-bound argument, which is why Brian described the result as both interesting and mathematically meaningful.

  • Epoch published links to the original transcript and to GPT-5.4 Pro’s final write-up.
  • Barreto and Price may be coauthors on any resulting paper, according to the update.
  • After Epoch finished its newer evaluation scaffold, Opus 4.6 (max), Gemini 3.1 Pro, and GPT-5.4 (xhigh) also solved the same problem.

Those extra solves add nuance to the HN thread. The conversation is not just about whether one model got there first; it is about what counts as a solve, how much scaffolding matters, and whether benchmark progress is starting to cross into expert-verifiable research work. That the problem's contributor confirmed the argument works matters more than a raw leaderboard number.

For Insights readers, this post is a sign that advanced math benchmarking is moving from scorekeeping toward workflows that include transcripts, expert review, and eventual publication. Original source: Epoch AI. Community discussion: Hacker News.

