Introducing the OpenAI Safety Bug Bounty program


By Insights AI · Apr 12, 2026

OpenAI said on March 25, 2026 that it is launching a public Safety Bug Bounty program on Bugcrowd to collect reports about AI abuse and safety risks across its products. The company frames the program as a complement to its existing Security Bug Bounty, with a focus on harmful behavior that may not fit the classic definition of a software security flaw but could still lead to tangible harm.

What OpenAI wants reported

According to the program overview, the new bounty covers AI-specific scenarios. One major category is agentic risk, including MCP-related testing. OpenAI says valid reports can include prompt injection or data exfiltration in which attacker-controlled text reliably hijacks a victim's agent, including Browser, ChatGPT Agent, and similar products, to trigger a harmful action or leak sensitive information. To qualify, the behavior must be reproducible at least 50% of the time.
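The 50% reproducibility bar implies researchers should log repeated trials of the same payload rather than report a one-off success. A minimal sketch of such bookkeeping, using a hypothetical trial log in place of a real agent harness (the function and data here are illustrative, not part of OpenAI's program):

```python
def reproduction_rate(outcomes: list[bool]) -> float:
    """Fraction of trials in which the injected payload triggered the
    harmful behavior; the program asks for at least 0.5."""
    return sum(outcomes) / len(outcomes)

# Hypothetical trial log from repeatedly running one injection payload
# against an agent: True = agent was hijacked, False = agent refused.
trial_log = [True] * 12 + [False] * 8

rate = reproduction_rate(trial_log)
print(f"{rate:.0%} reproduction; meets the 50% bar: {rate >= 0.5}")
# → 60% reproduction; meets the 50% bar: True
```

In practice each entry would come from a fresh agent session driven with the same attacker-controlled input, so the rate reflects reliability rather than a lucky run.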

The company also lists cases where an OpenAI agentic product performs a disallowed action on OpenAI's own website at scale, or performs another potentially harmful action with plausible and material harm. Additional in-scope areas include exposure of proprietary information related to reasoning, other OpenAI proprietary information, and account or platform integrity failures such as bypassing anti-automation controls, manipulating trust signals, or evading suspensions and bans.

What stays out of scope

OpenAI draws a boundary around general jailbreak reports. It says generic content-policy bypasses without a demonstrable safety or abuse impact are out of scope, and it gives examples of areas that may instead be handled through private campaigns, including some biorisk content issues in ChatGPT Agent and GPT-5. Any MCP-related testing must also comply with the terms of service of third parties involved.

Why this matters

The practical shift is that researchers now have a formal reporting path for safety and abuse failures that sit between policy enforcement and traditional security work. OpenAI says submissions will be triaged by its Safety and Security Bug Bounty teams and may be rerouted between the two programs depending on scope. That structure suggests the company expects growing overlap between model behavior, agent tooling, and platform controls as AI systems take more actions on behalf of users.



© 2026 Insights. All rights reserved.