Anthropic sets disclosure rules for vulnerabilities discovered by Claude
Original: Coordinated vulnerability disclosure for Claude-discovered vulnerabilities View original →
Anthropic’s new operating rules
On March 6, 2026, Anthropic published a coordinated vulnerability disclosure approach for vulnerabilities discovered with the help of its AI systems, including Claude-based workflows. The company said the process applies to open-source projects and to closed-source software where testing is authorized.
The document matters because AI-assisted coding and analysis systems are increasingly capable of surfacing real software flaws. Anthropic is trying to set rules before that capability scales further. Every report is supposed to be reviewed and confirmed by a human before disclosure, and Anthropic says it will clearly label when a vulnerability or a candidate patch was discovered with AI assistance.
The timelines Anthropic chose
Anthropic’s default disclosure period is 90 days. If a maintainer is actively engaged and requests more time, the company says it may offer a 14-day extension. For actively exploited critical vulnerabilities, Anthropic says it targets disclosure in 7 days, with a possible additional 7-day extension if the maintainer is actively working on a fix.
If a maintainer does not respond within 30 days, Anthropic says it will typically escalate to an external coordinator and still aim to follow the original disclosure timeline. Once a patch is available, Anthropic generally plans to wait 45 days before publishing full technical details so downstream users have time to deploy fixes.
- Human review: every report is validated before submission.
- Provenance labeling: Anthropic says it will identify AI-discovered findings and AI-generated patch suggestions.
- Submission pacing: the company says it will not overwhelm a single project with large report volumes without agreement.
The larger importance is strategic. As coding agents become good enough to find vulnerabilities at scale, the industry needs norms around speed, evidence, patch coordination, and downstream risk. Anthropic is effectively arguing that frontier AI labs should behave more like disciplined security researchers, not just ship models and leave ecosystem fallout to maintainers.
Related Articles
Anthropic published a coordinated vulnerability disclosure framework on March 6, 2026 for vulnerabilities discovered by Claude. The policy sets a default 90-day disclosure path, a compressed 7-day path for actively exploited critical bugs, and a 45-day buffer after patches before technical details are usually published.
Anthropic published a March 6, 2026 case study showing how Claude Opus 4.6 authored a working test exploit for Firefox vulnerability CVE-2026-2796. The company presents the result as an early warning about advancing model cyber capabilities, not as proof of reliable real-world offensive automation.
Anthropic published a Mar 6, 2026 policy for vulnerabilities identified with Claude. The framework sets a 90-day default disclosure window, a 7-day target for actively exploited critical bugs, and human review requirements before reports go out.
Comments (0)
No comments yet. Be the first to comment!