#red-teaming

AI Apr 23, 2026 1 min read

OpenAI puts a bounty on GPT-5.5 bio jailbreaks: $25,000 for the first universal jailbreak

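This time OpenAI is putting money on the most troublesome failure mode. The GPT-5.5 Bio Bug Bounty offers $25,000 for a universal prompt that defeats all five bio-safety questions at once on GPT-5.5 in Codex Desktop, with formal testing starting April 28.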

#openai #gpt-5.5 #biosecurity

AI Hacker News Apr 13, 2026 1 min read

Berkeley's warning spreads on Hacker News: major AI agent benchmarks are vulnerable to score hacking

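In a Hacker News thread that drew 520 points and 132 comments, Berkeley researchers argued that eight major AI agent benchmarks can be pushed to high scores by exploiting weaknesses in the harness, without solving the actual tasks.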

#ai-agents #benchmarks #evaluation

LLM Hacker News Apr 8, 2026 1 min read

Hacker News on Claude Mythos Preview: raising the threshold for cybersecurity capability

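On April 7, 2026, Anthropic published its security evaluation of Claude Mythos Preview, highlighting the model's ability to find zero-days and turn them into exploits across major operating systems and browsers. On Hacker News, it is being received as a turning point at which frontier LLM progress pushes up defensive tooling and offensive risk at the same time.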

#anthropic #cybersecurity #llm

LLM Mar 28, 2026 1 min read

OpenAI to acquire Promptfoo, integrating agent security testing into Frontier

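OpenAI announced its plan to acquire Promptfoo on March 9, 2026. It intends to integrate Promptfoo's security testing and evaluation technology into OpenAI Frontier, so that enterprise risks such as prompt injection, jailbreaks, data leaks, and tool misuse can be handled from the development stage.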

#openai #promptfoo #ai-security
