Sciences Hacker News 1d ago 2 min read
HN did not treat the Erdős headline as proof of autonomous math genius; the thread kept circling back to expert cleanup, problem selection, and whether the new method generalizes.
OpenAI released proof attempts for all 10 First Proof problems and said expert feedback suggests at least five may be correct. The company positioned the result as a test of long-horizon reasoning beyond standard benchmarks.
OpenAI published five model-generated submissions to the First Proof math challenge. None were accepted as valid solutions, but the release gives researchers direct evidence of where frontier reasoning systems succeed and fail.