Anthropic Accuses DeepSeek and Chinese AI Firms of Stealing 16M Claude Training Exchanges
Anthropic accuses DeepSeek, Moonshot AI (Kimi), and MiniMax of setting up more than 24,000 fraudulent Claude accounts and distilling training data from 16 million conversation exchanges.
Industrial-Scale AI Data Theft Allegations
Anthropic has leveled explosive accusations against three Chinese AI companies: DeepSeek, Moonshot AI (Kimi), and MiniMax. According to a Wall Street Journal report, the companies allegedly set up more than 24,000 fraudulent Claude accounts and used them to harvest training data from approximately 16 million conversation exchanges.
What Is Distillation and Why It Matters
Distillation in AI refers to the process of using outputs from a large language model (LLM) to train smaller or competing models. Anthropic's terms of service explicitly prohibit using Claude's outputs to train competing AI systems. When done at this scale without authorization, it constitutes both a terms-of-service violation and a potential intellectual property infringement.
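The mechanics are simple in principle: rather than learning from hard labels, a student model is trained to imitate the teacher's output distributions. A minimal, purely illustrative Python sketch of the classic distillation loss (all names and numbers hypothetical, unrelated to any party's actual systems):

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution at a given temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's softened distribution and the
    student's: gradient descent on this pushes the student to mimic the teacher."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

# A student that matches the teacher's outputs incurs a lower loss
# than one that contradicts them.
teacher = [4.0, 1.0, 0.5]
matched = distillation_loss([4.0, 1.0, 0.5], teacher)
mismatched = distillation_loss([0.5, 1.0, 4.0], teacher)
print(matched < mismatched)  # True
```

At the scale alleged here, the "teacher outputs" would be millions of harvested Claude responses, used as training targets for a competing model.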
The Scope of the Alleged Operation
What sets this case apart is its scale. Creating over 24,000 fake accounts and systematically harvesting 16 million conversations suggests a coordinated, industrial-scale operation rather than casual experimentation. Anthropic described this as an "industrial-scale distillation attack" in a company blog post, noting they had identified and blocked the operation.
Context: China's AI Surge
DeepSeek made waves in late 2024 and early 2025 by claiming GPT-4-level performance at a fraction of the cost. These allegations raise questions about whether some of that performance was achieved through unauthorized data acquisition. Moonshot AI (Kimi) and MiniMax are likewise fast-growing Chinese AI startups that have narrowed the gap with Western models.
Implications for the AI Industry
This case could set important precedents around AI output ownership, the legality of distillation from commercial services, and competitive practices in the global AI race. Anthropic has published evidence of the attack and is calling for industry-wide attention to model output protection. The outcome may shape future policies on AI data rights and cross-border IP enforcement.