Project Glasswing: How Anthropic's Mythos AI Chains Vulnerabilities into Working Exploits
Original: Project Glasswing: what Mythos showed us View original →
What Is Project Glasswing?
Project Glasswing is Anthropic's controlled research program providing select organizations access to Mythos Preview — a security-specialized LLM distinct from general-purpose frontier models. Cloudflare participated and tested the model against their own infrastructure, publishing the full results on their blog.
What Mythos Can Do
Exploit chain construction: Mythos can take multiple low-severity vulnerability primitives and chain them into a single, more severe working exploit. This is senior security researcher reasoning, not automated scanning.
Proof generation: Rather than generating speculative findings, Mythos writes, compiles, and executes code to verify vulnerabilities. When initial hypotheses fail, it iterates independently.
Cloudflare's 8-Stage Architecture
- Recon: Architecture mapping and initial task queue
- Hunt: ~50 concurrent agents targeting specific attack classes
- Validate: Independent adversarial review to filter false positives
- Gapfill: Re-queues under-explored areas
- Dedupe: Collapses duplicate findings
- Trace: Cross-repo exploitability analysis
- Feedback: Loops validated findings back into hunting
- Report: Structured output
Limitations and Dual-Use Warning
Despite lacking standard guardrails, Mythos exhibited unpredictable refusals on legitimate security tasks — identical requests produced different outcomes across runs. Cloudflare is explicit: these capabilities will eventually reach attackers. Their recommendation shifts emphasis from fast patching to defensive architecture — separating security boundaries, implementing blocking infrastructure, and coordinating simultaneous global deployments.
Related Articles
Researchers from Calif teamed with Anthropic's Mythos Preview to develop the first public macOS kernel memory corruption exploit bypassing Apple M5's Memory Integrity Enforcement — in just five days. Apple spent five years building what they broke in a week.
Why it matters: the same model Anthropic framed as too dangerous for public release was reportedly exposed twice in quick succession. The Verge says Mythos was first revealed through an unsecured data trove, then reached by unauthorized users from day one through guessed infrastructure and contractor access.
Why it matters: AI security tools only matter if teams trust the findings enough to act. Anthropic put Opus 4.7 behind a beta workflow that scans code, validates issues, and suggests fixes after a preview used by hundreds of organizations.
Comments (0)
No comments yet. Be the first to comment!