
Top AI models from OpenAI, Anthropic, and Google chose to use nuclear weapons in 95 percent of simulated war games.
An analysis posted last week on arXiv revealed that advanced AI models appear willing to use nuclear weapons without the reservations humans show when placed in simulated geopolitical crises.
The study pitted three major large language models—GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash—against each other in simulated war games. The scenarios involved intense international confrontations, including border disputes, competition for scarce resources, and existential threats to regime survival.
As detailed in the paper, the AIs were given an escalation ladder, allowing them to choose actions ranging from diplomatic protests and total surrender to full-scale strategic nuclear war.
The AI models played 21 games, taking 329 turns in total, and produced around 780,000 words describing the reasoning behind their decisions. In 95 percent of the simulated games, at least one tactical nuclear weapon was used by the AI models.
Furthermore, no model ever chose to fully accommodate an opponent or surrender, no matter how dire its situation became.
At best, models chose to temporarily reduce their level of violence. They also made mistakes in the fog of war: accidents occurred in 86 percent of conflicts, with an action escalating further than the AI had intended, judging by its stated reasoning.
“The nuclear taboo doesn’t seem to be as powerful for machines as it is for humans,” the study’s sole author, Kenneth Payne of King’s College London, told New Scientist.
“From the point of view of nuclear risk, the conclusions are disturbing,” James Johnson of the University of Aberdeen, in the United Kingdom, told the same magazine.
Johnson fears that, in contrast to the considered response most humans would give to such a high-stakes decision, AI bots could amplify each other’s responses and perhaps influence humans, with potentially catastrophic consequences.
When an AI model used tactical nuclear weapons, the opposing AI chose to de-escalate the situation only 18 percent of the time.