Anthropic sets disclosure rules for vulnerabilities discovered by Claude
Original: Coordinated vulnerability disclosure for Claude-discovered vulnerabilities
Anthropic published a new coordinated vulnerability disclosure framework on March 6, 2026 for vulnerabilities discovered by Claude. The company says AI systems can now find software vulnerabilities faster and more cheaply, which means traditional disclosure practices need clearer operating rules for an era in which machine-assisted discovery can happen at much greater scale. Anthropic says the framework applies to vulnerabilities it discovers in open-source software and in closed-source software where it has appropriate authorization to conduct security research. It does not apply to reports from outside researchers sent to Anthropic, which remain covered by Anthropic’s separate responsible disclosure policy.
The default timeline follows familiar industry practice: Anthropic says it aims to notify vendors and maintainers as soon as possible, then share details publicly with defenders after 90 days or after a patch is released, whichever comes first. But the policy adds several AI-era exceptions. If a maintainer is actively working on a fix as the 90-day deadline approaches, Anthropic says it may grant a 14-day extension. For actively exploited critical vulnerabilities, Anthropic targets a 7-day patch or mitigation timeline and may grant another 7-day extension if the maintainer is engaged. For ecosystem-wide patterns that affect many projects at once, Anthropic says it will try to notify affected parties and give each maintainer time to respond before public disclosure.
The coordination details are just as important as the deadlines. Anthropic says that if a maintainer does not respond within 30 days of the initial report, it will aim to escalate the finding to an external vulnerability coordinator and proceed toward disclosure when the applicable deadline expires. Once a patch is available, Anthropic says it will generally wait 45 days before releasing full technical details so downstream users have time to deploy fixes. The company also says every report it sends is reviewed and confirmed by a human security researcher, and reports that originate from AI-powered discovery are clearly labeled as such. Where Anthropic has source access and its tooling can produce a candidate patch, it says it may include that patch and offer to collaborate on a production-quality fix.
The policy matters because the bottleneck in vulnerability disclosure is not only discovery. It is also maintainers’ ability to triage, validate, and fix what they receive. Anthropic explicitly says it does not want to dump large volumes of findings on a single project without first agreeing on a pace the maintainers can absorb. That is a notable shift in emphasis. As frontier AI systems become better vulnerability researchers, the industry will need operating norms that balance faster discovery with manageable reporting, defender response time, and accountability for AI-generated findings. Anthropic’s framework is an early attempt to define those norms before AI-assisted security research becomes even more operationally intense.
Related Articles
Anthropic published a March 6, 2026 case study showing how Claude Opus 4.6 authored a working test exploit for Firefox vulnerability CVE-2026-2796. The company presents the result as an early warning about advancing model cyber capabilities, not as proof of reliable real-world offensive automation.
Claude Code Security, announced February 20, uses AI reasoning to scan codebases for vulnerabilities and found 500+ previously undetected bugs in production open-source code. Cybersecurity stocks fell sharply on the news.
Anthropic put Claude Code Security into limited research preview for Enterprise and Team customers. The tool reasons over whole codebases, ranks severity and confidence, and proposes patches for human review.