Anthropic Launches Claude Opus 4.6, Outperforms GPT-5.2

Read in other languages: 한국어日本語
LLM Feb 13, 2026 By Insights AI 1 min read 9 views Source

Key Features

Anthropic released Claude Opus 4.6 on February 5, introducing adaptive thinking, a 1M token context window in beta, and 128K max output tokens. This release marks the highest agentic coding scores Anthropic has achieved to date.

Benchmark Results

Coding Performance: On Terminal Bench, Opus 4.6 scores 65.4%, up from 59.8% for Opus 4.5, and on the OSWorld agentic computer use benchmark, its score rose from 66.3% to 72.7%.

Long-Context Retrieval: Claude Opus 4.6 scored 76% on a long-context retrieval benchmark where its predecessor managed just 18.5% — a more than 4x improvement.

Knowledge Work: On GDPval-AA, evaluating performance on economically valuable knowledge work tasks in finance, legal, and other domains, Opus 4.6 outperforms OpenAI's GPT-5.2 by around 144 Elo points and its own predecessor Claude Opus 4.5 by 190 points.

Additional Achievements

It achieves the highest score on the agentic coding evaluation Terminal-Bench 2.0 and leads all other frontier models on Humanity's Last Exam, a complex multidisciplinary reasoning test.

Industry Impact

The launch of Opus 4.6 signals a new phase in AI model competition. Its superiority in knowledge work and coding agent performance — critical for enterprise environments — is expected to strengthen Anthropic's position in the enterprise AI market.

Source: Anthropic

Share:

Related Articles

Comments (0)

No comments yet. Be the first to comment!

Leave a Comment

© 2026 Insights. All rights reserved.