#reasoning

LLM Hacker News Feb 20, 2026 2 min read

Gemini 3.1 Pro Launches as Google Targets Complex Reasoning Work

A top Hacker News discussion tracked Google’s Gemini 3.1 Pro rollout. Google positions it as a stronger reasoning baseline, highlighting a 77.1% ARC-AGI-2 score and broad preview availability across developer, enterprise, and consumer channels.

#gemini #google #llm

LLM Feb 16, 2026 2 min read

OpenAI: High-Difficulty ChatGPT Reasoning Interactions Rose 4x in 16 Months

OpenAI reports that, across more than one million ChatGPT conversations, the share of difficult interactions exceeding a human baseline increased roughly fourfold from September 2024 to January 2026. The company also reports large gains on case-interview and puzzle-style open tasks.

#openai #chatgpt #reasoning

LLM Hacker News Feb 14, 2026 1 min read

Gemini 3 Deep Think Expands From Benchmarks to Science and Engineering Workflows

Google announced a major Gemini 3 Deep Think upgrade with stronger reasoning benchmark results and early API access for researchers and enterprises.

#gemini #google #reasoning

AI Reddit Feb 12, 2026 2 min read

Mathematicians Issue a Major Challenge to AI—Show Us Your Work

Mathematicians have launched a new proof challenge for AI systems, testing whether they can not only provide answers but also clearly show the steps of the proof.

#mathematics #ai #reasoning

AI Hacker News Feb 12, 2026 1 min read

GPT-5 Outperforms Federal Judges in Legal Reasoning Experiment

A new study shows OpenAI's GPT-5 model outperformed federal judges in complex legal reasoning tasks.

#gpt-5 #openai #legal-ai