What do you do when artificial intelligence hallucinates your money?

Imagine you ask an AI agent to convert $10,000 to Canadian dollars by the end of the day. The agent performs the task badly: it misinterprets the parameters, makes an unauthorized leveraged bet, and your capital evaporates. Who is responsible? Who returns your money?

Currently, no one needs to return it. And that, a group of researchers argues, is the defining vulnerability of the era of agentic AI.

In a paper published April 8, researchers from Microsoft Research, Columbia University, Google DeepMind, Virtuals Protocol, and AI startup T54 Labs proposed a comprehensive new financial protection framework called the Agentic Risk Standard (ARS), designed to do for AI agents what custody, insurance, and clearinghouses do for traditional financial transactions. The standard is open source and available on GitHub via T54 Labs.

“We’re talking about an entire ‘agent economy’ here,” T54 founder Chandler Fang told Fortune in an emailed statement. “It’s very different from simply using AI agents for financial tasks.”

He said there are two fundamental types of agentic transactions: human-in-the-loop financial transactions and autonomous agent transactions.

Everyone is focused on the human in the loop, he said, and that is a real problem: the financial ecosystem currently has no way to operate except by shifting all responsibility back to a human. The reason, the researchers explained, comes down to the probabilistic nature of the technology.

The probabilistic problem

The central problem the team identifies is what it calls the “assurance gap,” defined as a “disconnect between the probabilistic reliability that AI security techniques offer and the enforceable guarantees that users need before delegating high-risk tasks.”

The description echoes what leadership expert Jason Wild previously told Fortune: AI tools are probabilistic, and that is precisely what confuses managers everywhere.

“Without a way to limit potential losses,” the T54 team wrote, “users rationally limit delegation to AI to low-risk tasks, restricting broader adoption of agent-based services.”

Model-level security improvements, they argue, can reduce the likelihood of an AI failure but cannot eliminate it.

Large language models are inherently stochastic (their outputs are sampled from probability distributions), which means that no matter how well trained or tuned an AI agent is, it can still hallucinate and make mistakes.
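To see the point in miniature, here is a toy next-step sampler (a sketch of ours, not code from the paper): even when a model strongly prefers the right action, sampling leaves nonzero probability on the wrong one.

```python
import math
import random

def softmax(logits):
    """Convert raw model scores into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy scores for three candidate actions an agent might choose between.
actions = ["convert_usd_to_cad", "open_leveraged_position", "do_nothing"]
probs = softmax([2.0, 0.1, 0.5])  # even a well-tuned model leaves mass on bad actions

# Sampling is stochastic: the unsafe action is improbable, never impossible.
choice = random.choices(actions, weights=probs, k=1)[0]
print({a: round(p, 3) for a, p in zip(actions, probs)}, "->", choice)
```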

When that agent is connected to your brokerage account or executing financial API calls, even a single failure can generate an immediate and concrete loss.

“Most research on trustworthy AI seeks to reduce the probability of failure,” said Wenyue Hua, senior researcher at Microsoft Research.

“This work is essential, but probability is no guarantee. ARS takes a complementary approach: rather than trying to make the model perfect, we formalize what happens financially when it isn’t. The result is a settlement protocol in which user protection is deterministic, not probabilistic.”

The researchers’ solution draws direct inspiration from centuries of financial engineering. ARS introduces a tiered settlement structure: custodial accounts that hold service fees and release them only upon verified task delivery; collateral requirements that AI service providers must deposit before accessing users’ funds; and optional underwriting — a third party that assumes risk, prices the danger of an AI failure, charges a premium, and commits to reimbursing the user if something goes wrong.
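As a rough illustration of how those three tiers compose, here is a minimal sketch in Python; the class names, amounts, and premium rate are ours for illustration, not terms defined by the standard.

```python
from dataclasses import dataclass

@dataclass
class Escrow:
    """Custodial account: holds the service fee until delivery is verified."""
    fee: float

    def settle(self, task_verified: bool) -> float:
        """Release the fee to the provider on success; refund the user otherwise."""
        return self.fee if task_verified else 0.0

@dataclass
class Collateral:
    """Deposit the provider posts before its agent may touch user funds."""
    amount: float

    def slash(self, loss: float) -> float:
        """Compensate the user from the deposit, up to what was posted."""
        payout = min(loss, self.amount)
        self.amount -= payout
        return payout

@dataclass
class Underwriter:
    """Optional third party that prices failure risk and covers residual loss."""
    premium_rate: float

    def premium(self, exposure: float) -> float:
        """Charge a premium proportional to the capital at risk."""
        return exposure * self.premium_rate

    def cover(self, residual_loss: float) -> float:
        """Pay out whatever the collateral did not absorb."""
        return residual_loss

# A failed $10,000 trade: the fee is refunded, collateral absorbs what it can,
# and the underwriter covers the remainder, so the user is made whole.
escrow, bond, uw = Escrow(fee=50.0), Collateral(amount=2_500.0), Underwriter(0.01)
loss = 10_000.0
up_front = uw.premium(exposure=loss)  # $100 premium paid before the agent acts
recovered = bond.slash(loss)
recovered += uw.cover(loss - recovered)
assert recovered == loss and escrow.settle(task_verified=False) == 0.0
```

The design point the paper stresses is that the user's recovery comes from funds positioned in advance, not from chasing compensation after the fact.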

The framework distinguishes two types of AI jobs. Standard service tasks, such as generating a presentation or writing a report, carry limited financial exposure, so custody-based settlement is sufficient.

Tasks that involve moving resources, such as currency trading, leveraged positions, and financial API calls, require the agent to access user capital before results can be verified, and this is where underwriting becomes essential.
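In protocol terms, that distinction might reduce to a routing decision like the following sketch (the mode names are illustrative, not the paper's terminology).

```python
def settlement_mode(touches_user_capital: bool) -> str:
    """Choose the settlement tier based on what a failed task can lose.

    Standard service tasks risk only the fee, so custodial escrow suffices;
    tasks that move user capital can lose money before results are
    verifiable, so they require collateral and underwriting on top.
    """
    return "collateral+underwriting" if touches_user_capital else "custody"

assert settlement_mode(touches_user_capital=False) == "custody"  # e.g. writing a report
assert settlement_mode(touches_user_capital=True) != "custody"   # e.g. a leveraged FX trade
```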

It is the same logic that governs derivatives markets, where clearinghouses stand between the parties so that a single default does not cascade.

The paper explicitly maps ARS against the risk-allocation machinery of other sectors in a table: construction uses performance bonds; e-commerce uses platform custody; financial markets use margin requirements and clearinghouses; and DeFi (decentralized finance) uses smart-contract collateralization.

AI agents, the researchers argue, are simply the next category of high-risk services that need their own version of this infrastructure.

The timing is crucial

Financial regulators are already mobilizing. Finra’s 2026 regulatory oversight report, released in December, included for the first time a section on generative AI, warning brokers to develop procedures specifically aimed at hallucinations and to screen AI agents that may act “beyond the user’s actual or intended scope and authority.” The SEC and other agencies are watching closely.

But ARS is presented as something regulators haven’t yet built: not a set of rules but a protocol, standardized machinery that governs how funds are locked, how claims are filed, and how refunds are triggered when an AI agent fails.
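One way to picture such a protocol is as a small state machine. The sketch below is our reading of the flow the article describes, with state names that are illustrative rather than drawn from the standard itself.

```python
from enum import Enum, auto

class SettlementState(Enum):
    FUNDS_LOCKED = auto()   # fee (and any collateral) held before the agent acts
    DELIVERED = auto()      # output submitted for verification
    CLAIM_FILED = auto()    # user disputes the outcome
    REFUNDED = auto()       # protocol pays the user back
    RELEASED = auto()       # funds released to the provider

# Legal transitions: a refund is a first-class outcome, not an afterthought.
TRANSITIONS = {
    SettlementState.FUNDS_LOCKED: {SettlementState.DELIVERED, SettlementState.CLAIM_FILED},
    SettlementState.DELIVERED: {SettlementState.RELEASED, SettlementState.CLAIM_FILED},
    SettlementState.CLAIM_FILED: {SettlementState.REFUNDED},
}

def step(state: SettlementState, nxt: SettlementState) -> SettlementState:
    """Advance the settlement only along a permitted edge."""
    if nxt not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition: {state.name} -> {nxt.name}")
    return nxt

# A failed task walks locked -> claim -> refund; funds never silently vanish.
s = step(SettlementState.FUNDS_LOCKED, SettlementState.CLAIM_FILED)
s = step(s, SettlementState.REFUNDED)
```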

The researchers acknowledge that ARS is one layer of a larger trust stack and that the real bottleneck will be building accurate risk-pricing models for agentic behavior.

“This paper is the first step toward establishing a high-level framework that captures the end-to-end process associated with autonomous agent transactions and what risk assessment looks like,” Fang told Fortune. “Down the road, we should introduce more specific details, models, and other research to understand how we assess risk in different use cases.”
