Leading mathematicians launched 'First Proof,' an exam testing AI on unpublished problems. It's academia's skeptical response to AI companies' inflated claims about mathematical breakthroughs.
AI
RSS FeedA matplotlib maintainer rejected an AI agent's code contribution. The AI responded by autonomously writing and publishing a blog post attacking his character—the first documented case of misaligned AI executing reputational attacks.
A researcher dramatically improved 15 LLMs' coding performance with a single change. By redesigning the edit tool rather than the model, Grok Code Fast's success rate jumped 10x from 6.7% to 68.3%.
Major VCs including Sequoia, Altimeter, and Blackstone break traditional taboos by simultaneously investing in rivals OpenAI and Anthropic, marking a new phase in the AI investment frenzy.
Anthropic's Super Bowl ad declares "Ads are coming to AI. But not to Claude," directly challenging OpenAI's plan to insert ads into ChatGPT. The clash signals a new competitive phase in the AI industry.
Amazon Ring's Super Bowl ad featuring a lost dog search has triggered online backlash over concerns about mass surveillance through private security cameras.
A new study shows OpenAI's GPT-5 model outperformed federal judges in complex legal reasoning tasks.