OpenAI is attaching cash to the hardest kind of safety failure: a single prompt that breaks all five of its bio safeguards. The new GPT-5.5 Bio Bug Bounty pays $25,000 for a universal jailbreak, limits testing to GPT-5.5 in Codex Desktop, and starts formal testing on April 28.
#red-teaming
RSS FeedA 520-point Hacker News thread amplified Berkeley's claim that eight major AI agent benchmarks can be pushed toward near-perfect scores through harness exploits instead of genuine task completion.
Anthropic's April 7, 2026 security write-up for Claude Mythos Preview argues that frontier LLM gains are now translating into real exploit-development capability. Hacker News is treating the post as a sign that defensive tooling and offensive risk are accelerating together.
OpenAI announced plans to acquire Promptfoo on March 9, 2026. The company says Promptfoo’s security testing and evaluation technology will be integrated into OpenAI Frontier so enterprises can test and document risks such as prompt injection, jailbreaks, data leaks, and tool misuse earlier in the development cycle.
OpenAI says it plans to acquire Promptfoo and bring its evaluation, red-teaming, and traceability tooling into OpenAI Frontier. Promptfoo's open-source project will stay under its current license, and the deal remains subject to customary closing conditions.
The Anthropic-Mozilla collaboration that spread on Hacker News disclosed that Claude Opus 4.6 found 22 Firefox vulnerabilities, 14 of them high-severity. The durable lesson is not autonomous magic but faster defender workflows built around validation, triage, and reproducible evidence.