Hacker News debates Epoch’s FrontierMath solve confirmation for GPT-5.4 Pro

Hacker News turned Epoch AI’s FrontierMath update into a major discussion on March 24, 2026, lifting the post to 322 points and 318 comments. Epoch’s page says Kevin Barreto and Liam Price first elicited a solution to a Ramsey-style hypergraph problem with GPT-5.4 Pro, and that problem contributor Will Brian has confirmed the argument works. The result will be written up for publication.

That confirmation is the important part. The problem is not a marketing demo prompt but a FrontierMath Open Problems combinatorics challenge: construct hypergraphs as large as possible without a certain partition property. Epoch says the AI-assisted solution removed an inefficiency in the previous lower-bound construction and “mirrors” part of the upper-bound argument, which is why Brian described the result as both interesting and mathematically meaningful.

Epoch published links to the original transcript and to GPT-5.4 Pro’s final write-up.
Barreto and Price may be coauthors on any resulting paper, according to the update.
After Epoch finished its newer evaluation scaffold, Opus 4.6 (max), Gemini 3.1 Pro, and GPT-5.4 (xhigh) also solved the same problem.

Those extra solves add nuance to the HN thread. The conversation is not just about whether one model got there first; it is about what counts as a solve, how much scaffolding matters, and whether benchmark progress is starting to cross into expert-verifiable research work. That the problem’s contributor confirmed the argument matters more than a raw leaderboard number.

For Insights readers, this post is a sign that advanced math benchmarking is moving from scorekeeping toward workflows that include transcripts, expert review, and eventual publication. Original source: Epoch AI. Community discussion: Hacker News.

