OpenAI launches public Safety Bug Bounty covering AI abuse and agentic risk

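On March 25, 2026, OpenAI announced that it is launching a public Safety Bug Bounty program on Bugcrowd. Whereas the existing Security Bug Bounty focuses on conventional security vulnerabilities, the new program is positioned as a channel for reporting issues such as AI abuse and safety risks that can lead to real harm even when they are not typical software flaws.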

What OpenAI wants reported

The program overview centers on AI-specific scenarios. Most prominent are the items on agentic risk and MCP. For agentic products such as Browser and ChatGPT Agent, OpenAI lists as valid report targets prompt injection and data exfiltration in which attacker-supplied text hijacks a victim's agent to carry out harmful actions or leak sensitive information. To qualify, the behavior must be reproducible at least 50% of the time.

Also in scope are cases where an agentic product performs disallowed actions at scale on OpenAI's website, and other kinds of harmful actions that involve plausible and material harm. The program further covers exposure of proprietary information related to reasoning, leaks of other OpenAI proprietary information, and account and platform integrity issues such as bypassing anti-automation controls, manipulating trust signals, and evading suspensions or bans.

What is out of scope

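OpenAI states explicitly, however, that general jailbreak reports are outside the scope of this public program. Content-policy bypasses that do not demonstrate a clear safety impact or abuse path are excluded, and some issues, such as certain biorisk content issues, may be handled in private campaigns. Testing that involves MCP must also comply with third parties' terms of service.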

Practical implications

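The significance of this change is that AI safety failures, which fell between policy violations and conventional security issues, now have a formal reporting channel. OpenAI says the Safety Bug Bounty team and the Security Bug Bounty team will triage reports and reroute them between programs depending on scope. As AI systems act on users' behalf in more contexts, the boundaries between model behavior, agent tooling, and platform controls grow thinner, which makes public reporting programs of this kind increasingly important operationally.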
