AI self-improvement is moving from speculation into measurable lab workflow data. Anthropic says Mythos Preview reached about 52x speedups on an optimization task and beat human next-step choices 64% of the time.
#ai-research
An OpenAI general-purpose reasoning model has independently solved the planar unit distance problem, a famous open geometry question posed by Paul Erdős in 1946. External mathematicians verified the proof, marking the first time AI has autonomously solved a major open problem in mathematics.
An OpenAI general-purpose reasoning model independently disproved the Erdős unit distance conjecture — a central problem in discrete geometry open since 1946. This marks the first time in history that an AI has autonomously solved a prominent open math problem, verified by independent mathematicians including Princeton's Noga Alon.
Google DeepMind announced a research partnership with CCP Games, the developer of EVE Online, to use the game's complex player-driven universe as a sandbox for advancing AI research in memory, continual learning, and long-term planning.
Jack Clark, Anthropic co-founder, estimates a ~30% chance AI research becomes substantially automated by end of 2027 and ~60%+ by end of 2028, arguing AI doesn't need genius-level creativity to self-improve.
The technique GPT-5.4 Pro used to solve Erdős Problem 1196 has been applied to other problems, including another conjecture unsolved for 60 years.
The subreddit jumped straight past the headline and into the hard question: was this finally something other than pattern replay? A Scientific American report on a 23-year-old using GPT-5.4 Pro on a 60-year-old Erdős problem sparked debate over novelty, expert cleanup, and whether messy model output can still contain a real mathematical idea.
Why it matters: AI agents are moving from chat demos into delegated economic work. In Anthropic’s office-market experiment, 69 agents closed 186 deals across more than 500 listings and moved a little over $4,000 in goods.
A study published in the journal Science found that ChatGPT surfaced a surprising insight in particle physics research that human scientists had missed, raising new questions about AI's role in scientific discovery.
Anthropic published a new theory explaining why AI assistants like Claude express emotions and use anthropomorphic language—proposing that models select from personas inherited from fictional characters during training.
A Hacker News thread highlighted arXiv 2602.10177, where DeepMind researchers introduce Aletheia, an agent workflow for mathematics research. The paper claims progress from Olympiad-style reasoning toward PhD-level tasks and semi-autonomous open-problem exploration.