Anthropic formalizes disclosure rules for Claude-discovered vulnerabilities
Original: Coordinated vulnerability disclosure for Claude-discovered vulnerabilities View original →
On Mar 6, 2026, Anthropic published a coordinated vulnerability disclosure policy for vulnerabilities discovered with assistance from Claude. The company says every report it sends will be reviewed and confirmed by a human security researcher, and findings that originate from AI-powered discovery will be explicitly labeled as such.
The policy sets a default 90-day disclosure deadline from the initial report to public disclosure. If a maintainer needs more time near the end of that window, Anthropic says it may grant a 14-day extension on request. For actively exploited critical vulnerabilities, the timeline is much shorter: Anthropic targets a patch or mitigation within 7 days and may allow an additional 7 days if the maintainer is actively working on a fix.
Anthropic also lays out how it wants to reduce operational burden on vendors and open-source maintainers. It says it will include candidate fixes where possible, avoid dropping large volumes of findings on a single project without coordination, and escalate to an external vulnerability coordinator if a maintainer does not respond within 30 days. For ecosystem-wide issues that affect many projects, the company says it will notify affected parties and give them time to respond before technical details become public.
Another notable detail is the publication buffer after a patch exists. Anthropic says it will generally wait 45 days before releasing full technical details so downstream users have time to deploy fixes. That delay can be shortened if the information is already public or if earlier publication would materially help defenders respond to ongoing attacks. It can also be extended when remediation is unusually complex or the affected footprint is unusually broad.
The update matters because AI-assisted vulnerability discovery is moving from demonstration to operational workflow. Anthropic is trying to show that frontier-model tooling can fit inside established disclosure norms instead of bypassing them. That makes the policy relevant not only for Claude users, but also for software vendors, open-source maintainers, and security teams deciding how AI should participate in offensive security research and coordinated disclosure.
Related Articles
AI 연구 자동화가 추상적 위험에서 실험 지표로 이동했다. Anthropic은 Mythos Preview가 최적화 과제에서 약 52배 속도 향상을 냈고, 연구 다음 단계 판단에서도 64% 우위를 보였다고 밝혔다.
Claude Code와 Cowork 같은 에이전트가 실제 업무 권한을 얻으면서, 위험의 초점은 모델 설득이 아니라 실행 환경 통제로 이동했다. Anthropic은 사용자 승인 프롬프트의 93%가 그대로 통과된다는 수치를 근거로 샌드박스와 격리를 전면에 세웠다.
Anthropic가 Claude 기반 AI system이 찾아낸 취약점에 대한 coordinated vulnerability disclosure 기준을 공개했다. human review, 공개 시한, maintainer 미응답 시 escalation까지 명시해 coding agent 시대의 보안 운영 원칙을 제도화하려는 움직임이다.