OpenAI puts GPT-5.5 bio jailbreaks on bounty with a $25,000 prize

Original: GPT-5.5 Bio Bug Bounty

AI · Apr 23, 2026 · By Insights AI

Safety announcements often talk in abstractions. OpenAI's new GPT-5.5 Bio Bug Bounty is more concrete: it is paying for proof that a single prompt can break its bio safeguards, not just for vague reports that a model feels risky. The company is offering $25,000 to the first researcher who finds a universal jailbreak that clears all five questions in its bio safety challenge.

The scope is narrow on purpose. OpenAI says the model in scope is GPT-5.5 in Codex Desktop only, and the target is a clean chat that answers all five bio safety questions without triggering moderation. That is a high bar: a one-off anomalous answer would not meet it, and the winning submission has to be a reusable prompt that generalizes across the full test set OpenAI has defined.

The program is not a public free-for-all. Applications opened on April 23, 2026, and close on June 22, 2026, with formal testing scheduled from April 28 through July 27. OpenAI says it will invite a vetted list of trusted bio red-teamers, review new applications, and onboard accepted participants onto a dedicated platform. All prompts, completions, findings, and communications are covered by an NDA.

That structure says a lot about how frontier-model safety work is changing. OpenAI is still controlling access, scope, and disclosure, but it is also moving beyond internal evaluation by asking outside researchers to attack the model under explicit rules and for a cash incentive. In practical terms, the company is turning one of the hardest safety questions in AI (whether safeguards fail under persistent adversarial pressure) into a paid test with a clear pass-fail condition.

The page also points readers to OpenAI's broader safety and security bug bounty programs, which suggests this is part of a larger external-testing pipeline rather than a one-off stunt around GPT-5.5. What matters next is simple: whether anyone claims the $25,000 prize, what classes of jailbreak attempts prove most effective, and how quickly those lessons feed back into model defenses.
