A top Hacker News discussion tracked Google’s Gemini 3.1 Pro rollout. Google positions it as a stronger reasoning baseline, highlighting a 77.1% ARC-AGI-2 score and broad preview availability across developer, enterprise, and consumer channels.
#reasoning
LLM Hacker News Feb 20, 2026 2 min read
LLM Feb 16, 2026 2 min read
OpenAI reports that, across more than one million ChatGPT conversations, the share of difficult interactions exceeding a human baseline increased roughly fourfold from September 2024 to January 2026. The company also shows large gains in case-interview and puzzle-style open tasks.
LLM Hacker News Feb 14, 2026 1 min read
Google announced a major Gemini 3 Deep Think upgrade with stronger reasoning benchmarks and early API access for researchers and enterprises.
AI Reddit Feb 12, 2026 2 min read
Mathematicians have launched a new proof challenge for AI systems, testing whether a model can not only give the answer but also lay out the proof clearly.
AI Hacker News Feb 12, 2026 1 min read
A new study shows OpenAI's GPT-5 model outperformed federal judges in complex legal reasoning tasks.