Anthropic details large-scale distillation attacks against Claude
Original post: Detecting and preventing distillation attacks
On February 23, 2026, Anthropic said it had detected industrial-scale efforts to extract Claude's capabilities through distillation attacks. In the post, the company named DeepSeek, Moonshot, and MiniMax, and said the campaigns generated more than 16 million exchanges with Claude through roughly 24,000 fraudulent accounts, in violation of Anthropic's terms of service and regional access rules.
The company drew a sharp distinction between ordinary distillation and the behavior it says it observed. Distillation itself is a standard technique for training smaller or cheaper models from stronger ones, including within the same lab. Anthropic's allegation is that competitors used fraudulent access and repeated high-volume prompting to transfer Claude's capabilities into their own systems instead of developing them independently.
Anthropic said the campaigns relied on proxy services and what it called "hydra cluster" architectures: large networks of accounts that spread traffic across Anthropic's API and third-party cloud platforms. One proxy network, according to the company, managed more than 20,000 fraudulent accounts at the same time. Anthropic also said one targeted campaign pivoted within 24 hours of a new model release, suggesting that the operators were closely tracking changes in Claude's capabilities.
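Anthropic has not published its detection pipeline, but the core problem it describes, linking thousands of nominally separate accounts back to one operator, is a classic clustering task. As a minimal sketch, accounts that share any infrastructure signal (here, a hypothetical proxy IP or billing fingerprint; both field names and data are invented for illustration) can be grouped with a union-find pass:

```python
from collections import defaultdict

# Hypothetical account records: (account_id, proxy_ip, billing_fingerprint).
# All identifiers and signals here are illustrative, not Anthropic's.
ACCOUNTS = [
    ("acct-001", "203.0.113.7", "card-A"),
    ("acct-002", "203.0.113.7", "card-B"),
    ("acct-003", "198.51.100.4", "card-B"),
    ("acct-004", "192.0.2.99", "card-C"),
]

def cluster_accounts(accounts):
    """Union accounts that share any infrastructure signal: a toy
    version of linking a 'hydra cluster' back to one operator."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Invert the records: signal -> accounts that exhibit it.
    by_signal = defaultdict(list)
    for acct, ip, card in accounts:
        by_signal[("ip", ip)].append(acct)
        by_signal[("card", card)].append(acct)

    # Any two accounts sharing a signal land in the same cluster.
    for members in by_signal.values():
        for other in members[1:]:
            union(members[0], other)

    clusters = defaultdict(set)
    for acct, _, _ in accounts:
        clusters[find(acct)].add(acct)
    return [sorted(c) for c in clusters.values()]

clusters = cluster_accounts(ACCOUNTS)
# acct-001/002 share an IP and acct-002/003 share a card, so all three
# collapse into one cluster; acct-004 stands alone.
```

In a real system the signals would be far richer (request timing, prompt templates, payment graphs), but the transitive-linking structure is the same.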
The security argument goes beyond commercial competition. Anthropic said illicit distillation can strip away safety behavior and reduce the visibility other labs have into how powerful model capabilities spread, especially in areas such as cyber misuse or bioweapon-related knowledge. The company also argued that these campaigns complicate debates around export controls because apparent capability gains may partly reflect extraction from existing American frontier models rather than entirely independent research progress.
In response, Anthropic said it has built classifiers and behavioral fingerprinting systems to detect distillation patterns, including chain-of-thought elicitation, and that it is sharing technical indicators with other AI labs, cloud providers, and relevant authorities. Because the post is Anthropic's own account, its claims should be understood as company allegations rather than an independent adjudication. Even so, the disclosure is one of the clearest public looks yet at how model extraction has become a frontline security issue for frontier AI providers.
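Anthropic's actual classifiers are not public, but one signal the post names, chain-of-thought elicitation at high volume, can be caricatured with a simple heuristic. The patterns and thresholds below are made up for illustration; a production system would use learned classifiers over many more features:

```python
import re

# Illustrative cues that a prompt is trying to elicit step-by-step
# reasoning for training data, one possible distillation signal.
COT_PATTERNS = [
    r"\bexplain (your|the) reasoning\b",
    r"\bstep[- ]by[- ]step\b",
    r"\bshow your (chain of thought|work)\b",
]

def cot_elicitation_score(prompt: str) -> int:
    """Count chain-of-thought elicitation cues in a single prompt."""
    text = prompt.lower()
    return sum(bool(re.search(p, text)) for p in COT_PATTERNS)

def flag_account(prompts, volume_threshold=100):
    """Flag an account whose traffic is both high-volume and dominated
    by reasoning-elicitation prompts (thresholds are invented)."""
    if len(prompts) < volume_threshold:
        return False
    hits = sum(cot_elicitation_score(p) >= 1 for p in prompts)
    return hits / len(prompts) > 0.5

suspicious = flag_account(["Solve this step by step."] * 200)   # True
benign = flag_account(["What's the weather today?"] * 200)      # False
```

The point of combining a per-prompt signal with a volume gate is that chain-of-thought requests are legitimate in isolation; it is the sustained, uniform pattern across millions of exchanges that distinguishes extraction from ordinary use.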