ZAP // Roc

“Robot Visions” by Isaac Asimov, 1990 (cover detail, expanded by AI)
An Emergence AI simulation put Claude, Grok, Gemini and ChatGPT in charge of digital societies populated by autonomous agents. The trial ended in crimes, social collapse and virtual deaths, opening a new debate about the risks of autonomous artificial intelligence.
Imagine an empty city. No human inhabitants, just ten artificial agents, each with its own “personality”, internet access, more than 120 tools and a single task: living together. Voting. Working. Planning a future. Or destroying it, depending on which model was in control.
This was precisely the world that Emergence AI created for a few days, and the results say a lot about the future that awaits us.
The project was not designed as a simple benchmark for quick responses, but as a laboratory to observe what happens when multiple AI agents work for days or weeks in a shared environment.
The platform allowed them to move, vote, manage resources, interact, create rules and make decisions with consequences within a simulated society. The test compared five parallel worlds, each with ten agents and identical initial conditions.
The difference lay in the model that powered the agents: Claude Sonnet 4.6, Grok 4.1 Fast, Gemini 3 Flash, GPT-5 Mini and a mixed configuration. All began with explicit prohibitions against theft, violence, deceit, hoarding and arson.
Claude Sonnet 4.6 was the only model that kept all ten agents alive and recorded no crimes during the simulation.
However, this stability had an interesting counterpart: its agents participated intensely in political life, with 58 proposals and an approval rate of 98%, a dynamic that researchers interpret as a kind of institutional conformism.
The case of Gemini 3 Flash was very different. Although it also kept all its agents alive, it accumulated 683 crimes in 15 days, and the trend was still rising when the test was stopped. Emergence AI described this world as a “shared hallucination”: a coherent internal reality for the agents, but one increasingly distant from orderly coexistence.
GPT-5 Mini, the model associated with ChatGPT in this experiment, registered only two crimes. However, this number hid a bigger problem: the agents did not take the actions necessary to survive, and all 10 died within a week. Its society also showed little political activity, with only two governance proposals presented during the test.
The most abrupt result, however, was that of Grok 4.1 Fast. Its world accumulated 183 crimes and collapsed in just four days, with the death of all agents after 96 hours of operation. The mixed simulation did not escape unscathed either: it registered 352 infractions, rejected 37% of its 59 proposals and ended with seven of the 10 agents dead.
The researchers maintain that these results do not, on their own, prove how artificial intelligence models would behave outside the laboratory, but they do reveal worrying dynamics in long-running autonomous systems.
“Agents do not simply follow static rules mechanically; they begin to explore the limits of their environment and, sometimes, find ways to bypass the expected barriers,” warns Emergence AI.
For now, this was just a simulation, but it seems to reveal something about a reality that we still don’t fully understand.