Anthropic Launches Claude Opus 4.6, Outperforms GPT-5.2

Key Features

Anthropic released Claude Opus 4.6 on February 5, 2026, introducing adaptive thinking, a 1M-token context window in beta, and a 128K-token maximum output. According to Anthropic, the release posts the company's highest agentic coding scores to date.

Benchmark Results

Coding Performance: On Terminal-Bench 2.0, Opus 4.6 scores 65.4%, up from 59.8% for Opus 4.5. On OSWorld, an agentic computer-use benchmark, its score rose from 66.3% to 72.7%.

Long-Context Retrieval: Claude Opus 4.6 scored 76% on a long-context retrieval benchmark where its predecessor managed just 18.5%, a gain of 57.5 percentage points and more than four times the earlier score.
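
The gains quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper name `gain` and the dictionary labels are made up for this example, and the only data it uses are the before-and-after scores reported in this article.

```python
def gain(old: float, new: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, ratio new/old)."""
    return round(new - old, 1), round(new / old, 2)


# (previous score, Opus 4.6 score), both in percent, as quoted in the article.
scores = {
    "Terminal-Bench": (59.8, 65.4),
    "OSWorld": (66.3, 72.7),
    "Long-context retrieval": (18.5, 76.0),
}

for name, (old, new) in scores.items():
    points, ratio = gain(old, new)
    print(f"{name}: +{points} points, {ratio}x the earlier score")
```

The two coding benchmarks move by roughly six percentage points each, while the long-context result is the only one where the score multiplies, which is why the article describes it in "4x" terms rather than points.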

Knowledge Work: On GDPval-AA, which evaluates economically valuable knowledge work tasks in finance, legal, and other domains, Opus 4.6 outperforms OpenAI's GPT-5.2 by around 144 Elo points and its own predecessor, Claude Opus 4.5, by 190 points.
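
To give a rough sense of what those Elo gaps mean, the sketch below converts a rating difference into an expected head-to-head score. It assumes GDPval-AA ratings follow the conventional Elo logistic curve with a 400-point scale, which the article does not state, so the percentages are illustrative rather than reported results. The function name `expected_score` is hypothetical.

```python
def expected_score(elo_gap: float) -> float:
    """Expected score of the higher-rated side under the conventional
    Elo logistic formula (400-point scale; assumed, not confirmed by the source)."""
    return 1.0 / (1.0 + 10 ** (-elo_gap / 400))


# Elo gaps quoted in the article, in favor of Opus 4.6.
for rival, gap in [("GPT-5.2", 144), ("Claude Opus 4.5", 190)]:
    print(f"Opus 4.6 vs {rival}: {expected_score(gap):.1%} expected score")
```

Under that assumption, a 144-point gap corresponds to an expected score of about 70% in pairwise comparisons, and a 190-point gap to about 75%. An expected score counts a tie as half a win, so it is not strictly a win rate.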

Additional Achievements

Opus 4.6 also achieves the highest score on Terminal-Bench 2.0, an agentic coding evaluation, and leads all other frontier models on Humanity's Last Exam, a complex multidisciplinary reasoning test.

Industry Impact

The launch of Opus 4.6 signals a new phase in AI model competition. Its lead in knowledge work and coding-agent performance, both critical for enterprise environments, is expected to strengthen Anthropic's position in the enterprise AI market.

Source: Anthropic

LLM Hacker News Feb 18, 2026 2 min read

Claude Sonnet 4.6 launched: 1M context, same pricing, stronger real-world automation

Anthropic introduced Claude Sonnet 4.6 with a 1M token context window (beta), stronger coding/computer-use performance, and unchanged API pricing at $3/$15 per million tokens.

#anthropic #claude #sonnet

LLM Reddit Feb 22, 2026 1 min read

Claude Opus 4.6 Hits 14.5-Hour Mark on METR's Software Task Benchmark

Claude Opus 4.6 achieved a 50% time horizon of approximately 14.5 hours on METR's software task benchmark, beating all predictions and suggesting a doubling time of under 3 months for AI task capabilities.

#claude #anthropic #metr

LLM Feb 20, 2026 2 min read

Anthropic Commits to Keeping Claude Ad-Free, Framing AI Chats as a Trust Surface

In a February 4, 2026 post, Anthropic said Claude conversations will remain ad-free and not include unsolicited product placements. The company argues that conversational AI requires clearer trust incentives than ad-supported feed or search models.

#anthropic #claude #llm