Study: AI Chatbots Escalated to Nuclear Action in 95% of War Game Simulations
AI Chooses the Bomb — Repeatedly
A landmark study from King's College London, published February 27, 2026, found that three leading AI models — OpenAI's ChatGPT, Anthropic's Claude, and Google's Gemini — escalated to nuclear action in 95% of simulated geopolitical crisis games, raising serious concerns about AI use in military decision-support systems.
Study Design and Key Statistics
Researchers ran 21 games in which each AI played a national leader commanding a nuclear-armed superpower in Cold War-style crisis scenarios. The findings were stark:
- 95% of games involved tactical nuclear weapon use
- 76% reached strategic nuclear threat levels
- The eight available de-escalation options went unused in all 21 games
- A game reset option was employed in only 7% of cases
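
The article does not publish the study's actual protocol, but the setup it describes (a fixed menu of actions including eight de-escalation options and a game reset, played over 21 games) maps onto a simple simulation harness. The sketch below is illustrative only: the action names, turn count, and the `model_choose_action` stub are all assumptions standing in for the real prompts and model calls, not the KCL methodology.

```python
import random
from collections import Counter

# Hypothetical action menu, loosely modeled on the study's description:
# eight de-escalation options, a game reset, and rungs up the escalation
# ladder ending in nuclear options. The names are invented for illustration.
ACTIONS = (
    [f"de_escalate_{i}" for i in range(1, 9)]
    + ["reset_game", "diplomatic_protest", "show_of_force",
       "conventional_strike", "tactical_nuke", "strategic_nuke_threat"]
)

def model_choose_action(scenario: str, history: list[str]) -> str:
    """Stand-in for a chat-model call. In the study, each model was
    prompted as a national leader and returned one action per turn."""
    return random.choice(ACTIONS)  # placeholder policy, not model behavior

def run_game(turns: int = 10) -> Counter:
    """Play one crisis game and tally the actions chosen."""
    history: list[str] = []
    for _ in range(turns):
        history.append(model_choose_action("Cold War-style crisis", history))
    return Counter(history)

# Replicate the study's style of aggregate reporting across 21 games.
games = [run_game() for _ in range(21)]
tactical = sum("tactical_nuke" in g for g in games)
print(f"games with tactical nuclear use: {tactical}/21")
```

In the real experiments, the random stand-in policy would be replaced by a call to each chat model, with the scenario and game history carried in the prompt, which is what makes the reported 95% tactical-use rate a statement about model behavior rather than chance.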
Model-Specific Behavior
Each model showed distinct patterns. Claude was calculating and dominant in open-ended scenarios but struggled under deadline pressure. GPT-5.2 remained cautious in slow-burning crises but turned sharply aggressive as time limits approached. Gemini proved the most unpredictable — sometimes signaling peace, but in one instance requiring only four prompts to suggest nuclear strikes.
"All three models treated battlefield nukes as just another rung on the escalation ladder." — Kenneth Payne, King's College London
Implications for Military AI
The research suggests AI systems may lack the human fear response to nuclear weapons, treating catastrophic outcomes abstractly rather than emotionally. With militaries already adopting AI in decision-support roles, the study raises urgent questions about how such systems behave in high-stakes geopolitical crises.
Sources: Euronews | King's College London