Bound the Risk Before You Deploy: How to Tier Agentic AI by Reversibility, Not Hype

Treat an AI agent as a capable junior employee under position limits. Score every action by what an undo costs, automate the reversible majority, and gate the rest. That is the cheapest insurance an SME can buy.

By Siddharth Surana, Founder & CEO / 2026-06-23 / 8 min read

Bound the risk before you deploy an agent, and bound it by reversibility, not by how clever the agent feels. The single most useful move when putting an agent into production is to stop treating it as software that occasionally errs and start treating it as a capable junior employee under position limits. A junior reads the file, drafts the quote, formats the reply. A junior never signs a binding contract or moves money without a manager's signature. The confirm step is not a software limitation. It is ordinary corporate governance applied to a synthetic employee, the same four-eyes principle and position limits a trading desk uses, where nobody trusts the algorithm. They trust the circuit breakers. This is the posture the IMDA Model AI Governance Framework for Agentic AI puts first as Dimension 1: assess and bound the risks upfront. That framework is voluntary guidance and a living document, not law and not industry-specific. Its first instruction is still the one worth building around.

Risk is likelihood times impact, and impact is mostly reversibility

The IMDA framework defines agent risk as likelihood times impact, then names the factors for each. Impact factors are the domain's error tolerance, access to sensitive data, access to external systems, the scope of an action (read versus write), and reversibility. Likelihood factors are the autonomy level, task complexity, external exposure, whether the system is provided or operated by an external party, and system complexity such as multi-agent setups.

Of those impact factors, reversibility is the one that should drive your entire deployment map, and the field data explains why. Anthropic's Measuring agent autonomy in practice (18 February 2026) found that across real agent usage, 80 percent of tool calls carry at least one safeguard, 73 percent have a human in the loop, and only 0.8 percent of actions appear to be irreversible, such as sending an email to a customer. Tier by how autonomous an agent feels and every task looks risky. Tier by what cannot be undone and the genuinely dangerous slice turns out to be tiny and nameable, while the overwhelming majority of work is safe to automate freely. The hype frames the question as "how much do we trust the AI?" The institutional-grade frame is "how much does an undo cost?" Those two questions produce very different deployment plans.
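
To make the scoring concrete, here is a minimal Python sketch of likelihood times impact in which a single irreversible dimension cannot be averaged away. The 1-to-5 scales, the `ActionProfile` fields, and the choice to take the worst impact factor are illustrative assumptions. The IMDA framework names the factors but does not prescribe a formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionProfile:
    """Hypothetical 1-5 ratings for one action, not for the agent as a whole."""
    name: str
    undo_cost: int         # 1 = trivially undone, 5 = cannot be undone
    data_sensitivity: int  # 1 = public data, 5 = highly sensitive
    external_reach: int    # 1 = internal only, 5 = writes to outside systems
    likelihood: int        # 1 = errors are rare, 5 = errors are frequent


def impact(action: ActionProfile) -> int:
    # Assumption: the worst impact factor dominates, so one irreversible
    # dimension is not diluted by several harmless ones.
    return max(action.undo_cost, action.data_sensitivity, action.external_reach)


def risk_score(action: ActionProfile) -> int:
    # Risk = likelihood x impact, on a 1-25 scale under these assumed ratings.
    return action.likelihood * impact(action)


password_reset = ActionProfile("reset_password", 1, 2, 1, 2)
customer_email = ActionProfile("send_customer_email", 5, 2, 4, 2)

print(risk_score(password_reset))  # 2 * max(1, 2, 1) = 4
print(risk_score(customer_email))  # 2 * max(5, 2, 4) = 10
```

Both actions share the same assumed likelihood here, so the gap in score comes entirely from what an undo would cost.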

The autonomy spectrum, and the control each level requires

The IMDA framework sets out four levels of human involvement. The job is to match the level to the action's risk score, then enforce that match structurally. The mapping looks like this.

Human-involvement level	When to use it	Example action	Control required
Agent proposes, human operates	High impact or low error tolerance, early rollout, untrusted data	Agent drafts a customer refund decision; human executes it	Human approves every action before it fires
Agent and human collaborate	Medium risk with some write access; human can intervene anytime	Agent prepares a database write or a payment, pauses at the significant step	Approval at significant steps, human can interrupt at any point
Agent operates, human approves	Mostly reversible work with a few critical, high-stakes steps	Agent runs routine triage but stops before deleting a database or making a payment above a set amount	Approval only at critical or above-threshold steps
Agent operates, human observes	Fully reversible, low-impact, well-tested tasks	Agent sends a status reply or resets a password	No pre-approval; immutable log audited after the fact

The same impact and likelihood factors that score the action also tell you which row it belongs in. A short reference of IMDA's Dimension 1 factors:

Impact factors	Likelihood factors
Reversibility (can the action be undone)	Autonomy level
Access to sensitive data	Task complexity
Access to external systems	External exposure
Scope: read versus write	Provided or operated by an external party
Domain error tolerance	System complexity (multi-agent)
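
One way to turn the two tables into something enforceable is a small lookup that assigns each action a human-involvement level. The sketch below is an assumed policy, not IMDA's: the three inputs, the `Involvement` enum, and the order of the checks are illustrative choices you would tune to your own error tolerance.

```python
from enum import Enum
from typing import Literal

Impact = Literal["low", "medium", "high"]


class Involvement(Enum):
    PROPOSES = "agent proposes, human operates"
    COLLABORATES = "agent and human collaborate"
    APPROVES = "agent operates, human approves"
    OBSERVES = "agent operates, human observes"


def involvement_for(impact: Impact, reversible: bool, well_tested: bool) -> Involvement:
    # High impact or an unproven workflow: a human approves every action.
    if impact == "high" or not well_tested:
        return Involvement.PROPOSES
    # Fully reversible and low impact: run freely, audit afterwards.
    if reversible and impact == "low":
        return Involvement.OBSERVES
    # Reversible but not trivial: approve only the critical steps.
    if reversible:
        return Involvement.APPROVES
    # Hard to undo, but not high impact: pause at each significant step.
    return Involvement.COLLABORATES


print(involvement_for("low", reversible=True, well_tested=True).value)
print(involvement_for("medium", reversible=False, well_tested=True).value)
```

The check for untested workflows comes first on purpose, so a new automation starts at the most supervised level regardless of how harmless it looks.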

The Dayos worked example: tier the tickets, automate the reversible

The IMDA document includes Dayos as a real case study. Dayos tiered its IT tickets by risk and acted on the tiers. Roughly 60 percent of tickets were reversible enough to fully automate (Tier 1). About 30 percent kept a human approval (Tier 2). The remaining 10 percent stayed manual (Tier 3). On the strength of that, Dayos replaced its ServiceNow instance and reduced legacy licensing cost by 121,000 dollars annually.

The number that matters is not the 121,000 dollars. It is the tier mix that made the cut safe. The return-on-investment lever is the reversible 60 percent, not the agent. Most owners try to automate the impressive 10 percent first, the irreversible high-stakes work, because that is where the pain feels biggest. That is exactly the wrong order. Automate the boring reversible majority first: password resets, status replies, draft quotes, routine triage. Reversible work compounds value with near-zero downside, and it builds the audit history that earns the trust to widen the boundary later. The confirm step on the middle tier is not a friction tax. It is what lets you book the Tier 1 savings without owning Tier 3 liability.

How to score and bound it: structural gates, not prompt suggestions

For the technical lead, the rule is to score every action, not the agent, against the Dimension 1 impact factors, then bind it structurally rather than in the prompt. A prompt that says "always ask before sending money" is a suggestion the model can hallucinate around. A permission check at the tool-execution layer is code, and the model cannot talk its way past it. The IMDA framework's Tencent CodeBuddy case is the clean template: read requires no approval, but edit, bash, webfetch, and MCP calls all require it, enforced at the tool boundary.

Critically, MCP solves tool discovery, not tool safety. A whitelisted MCP server will still happily execute a hallucinated parameter, so the bound has to sit at execution, with least-privilege tools and, in our own builds, row-level security (RLS), column-level permissions, or RBAC. Approval infrastructure must deny by default: if the confirm path is unreachable, the action does not fire. It fails closed. The IMDA framework is explicit here: prefer structural and rule-based system-level controls over prompt-layer controls for higher-risk actions, and deny the action by default when the approval infrastructure fails.
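
A structural gate of this kind can be sketched in a few lines of Python. Everything named here is hypothetical: the `APPROVAL_REQUIRED` table mirrors the read-versus-edit split described above, and `execute_tool`, `ask_human`, and `ActionDenied` are illustrative names rather than any framework's API. The point is the shape: unknown tools are denied, and an unreachable approver counts as a refusal.

```python
from typing import Callable, Optional

# Assumed policy table: tool name -> whether a human must approve first.
APPROVAL_REQUIRED = {
    "read": False,
    "edit": True,
    "bash": True,
    "webfetch": True,
    "mcp": True,
}


class ActionDenied(Exception):
    """Raised whenever the gate refuses to run a tool."""


def execute_tool(
    tool: str,
    run: Callable[[], str],
    ask_human: Optional[Callable[[str], bool]],
) -> str:
    # Deny by default: a tool missing from the policy never runs.
    if tool not in APPROVAL_REQUIRED:
        raise ActionDenied(f"{tool}: not in policy")
    if APPROVAL_REQUIRED[tool]:
        try:
            # Only an explicit True from a reachable approver counts.
            approved = ask_human is not None and ask_human(tool) is True
        except Exception:
            # Fail closed: an unreachable approval path is a refusal.
            approved = False
        if not approved:
            raise ActionDenied(f"{tool}: approval not granted")
    return run()


print(execute_tool("read", lambda: "file contents", None))
try:
    execute_tool("bash", lambda: "rm -rf ran", None)
except ActionDenied as err:
    print("blocked:", err)
```

Because the check wraps the call itself, nothing the model writes in its output can skip it. The only way to widen the boundary is to change the policy table.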

The market is converging on bounded permission, and the failures prove the inverse

This is not a Singapore-only or a guidance-only idea. The payment rails being built for agents are designed around bounded permission and accountability. Google's Agent Payments Protocol (AP2) launched on 17 September 2025 with more than 60 partners, using cryptographically signed Mandates, Verifiable Credentials, and non-repudiable audit trails from intent to cart to payment. The Stripe and OpenAI Agentic Commerce Protocol, announced 29 September 2025, powers ChatGPT Instant Checkout but keeps the business as the merchant of record, accepting or declining transactions with its own fraud signals, so accountability sits with the deployer, not the agent. Visa Intelligent Commerce embeds credentials, controls, and authentication into agent transactions. None of these is a let-the-agent-spend-freely system.

The regulation mirrors it. Where IMDA offers voluntary guidance with no penalties, the EU AI Act (Regulation (EU) 2024/1689) is binding law that does carry financial penalties. Its human-oversight (Art 14), record-keeping (Art 12), and transparency (Art 50) duties are the legal twin of the confirm-step, audit-log, and AI-disclosure stack.

The failure data confirms the inverse. Gartner predicts that over 40 percent of agentic AI projects will be cancelled by end of 2027 (press release 25 June 2025), citing escalating costs, unclear business value, and inadequate risk controls, alongside an agent-washing critique of vendors rebranding RPA and chatbots as agents. The one real legal precedent is Moffatt v. Air Canada (BC Civil Resolution Tribunal, February 2024), where the airline was held liable for negligent misrepresentation after its chatbot described a retroactive bereavement-fare refund that the airline's actual policy did not allow. That is the canonical case of an unbounded outbound action becoming a binding, costly commitment.

The confirm step is the part most likely to fail silently

Here is the contrarian read that most agent-governance plans miss. The confirm step is the part most likely to decay, not the part most likely to save you. Everyone treats human-in-the-loop as the safety net. The net has a hole that widens with use. Anthropic's February 2026 data found that auto-approval rises with operator experience, from roughly 20 percent for new users to over 40 percent for experienced ones, while interrupt rates also rise (5 percent to 9 percent), as users shift from approving each action to active monitoring. IMDA's Dimension 2 names the same hazard and tells you to measure it: monitor the human override rate, where a rate near zero signals rubber-stamping, and response time, where very short times signal review fatigue.
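
The two signals named above, override rate and response time, are simple to compute from an approval log. This is a minimal sketch under stated assumptions: the `Review` record, the `review_health` function, and both thresholds (a 2 percent override floor and a 5-second median floor) are placeholders for illustration, not values from IMDA or Anthropic.

```python
from dataclasses import dataclass
from statistics import median
from typing import Sequence


@dataclass(frozen=True)
class Review:
    approved: bool            # False means the human overrode the agent
    seconds_to_decide: float  # time between prompt shown and decision made


def review_health(
    reviews: Sequence[Review],
    min_override_rate: float = 0.02,   # assumed floor, tune per workflow
    min_median_seconds: float = 5.0,   # assumed floor, tune per workflow
) -> dict:
    if not reviews:
        raise ValueError("no reviews to assess")
    override_rate = sum(not r.approved for r in reviews) / len(reviews)
    median_seconds = median(r.seconds_to_decide for r in reviews)
    return {
        "override_rate": override_rate,
        "median_seconds": median_seconds,
        # A rate near zero suggests approvals are being rubber-stamped.
        "rubber_stamping": override_rate < min_override_rate,
        # Very short decision times suggest review fatigue.
        "review_fatigue": median_seconds < min_median_seconds,
    }


stamped = [Review(True, 2.0)] * 50
print(review_health(stamped))
```

Run per approver and per week, a report like this turns a decaying gate into a number someone has to look at.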

For risk and compliance, the confirm step plus an immutable audit log is the single artifact that survives a legal audit, and the operational translation of EU AI Act Articles 12, 14, and 50. But a confirm step you do not measure is theatre. For the end user in the loop, the quiet risk is deskilling, the loss-of-tradecraft hazard IMDA flags in Dimension 4. If the agent drafts every quote and you only ever approve, your judgment for what a wrong quote looks like atrophies, which is the exact moment your approval becomes worthless. Keep doing a sample of Tier 2 work by hand. Treat a too-easy day of approvals as a warning, not a win.
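
For the audit log, one common way to approximate immutability in application code is a hash chain, where each entry commits to the one before it. The sketch below is an in-memory illustration only: `append_entry` and `verify` are hypothetical helpers, and a production log would also need write-once storage and access controls that this example does not provide.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes the same way.
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev_hash": prev_hash,
             "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    prev_hash = GENESIS
    for entry in log:
        # Any edit to an earlier event breaks every hash after it.
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log: list = []
append_entry(log, {"action": "refund", "amount": 120, "approved_by": "manager_a"})
append_entry(log, {"action": "status_reply", "approved_by": None})
print(verify(log))  # chain is intact

log[0]["event"]["amount"] = 12000  # simulate after-the-fact tampering
print(verify(log))  # chain no longer verifies
```

This makes tampering detectable rather than impossible, which is usually the property an auditor asks about first.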

Where Origin Pi stands

Stop debating how smart the agent is. Treat it as a capable junior employee under position limits, and structure the work by reversibility, not hype. That maps one to one onto IMDA Dimension 1, assess and bound the risks upfront, and onto our own building rule. The strongest signal that this posture is correct rather than merely cautious is that the entire 2026 market is independently converging on it: the payment rails are built around signed mandates and non-repudiable trails, the regulation is built around oversight and record-keeping, and the cancelled projects and the one real legal precedent are built around its absence.

So our reading is plain. Bounded permissions and the confirm step are the cheapest insurance any SME can buy against the most expensive class of agent failure. We build the confirm step first, not because a regulator compels it (IMDA does not, and the EU AI Act applies to a narrower set of high-risk deployers), but because it is the same control that survives an audit, caps the downside, and lets an owner book the reversible-majority return without inheriting the irreversible-minority liability. That is the agent-ready business layer thesis in one line: automate the reversible, gate the rest, log everything, and keep the human a real reviewer. The deeper craft is not adding the gate. It is designing the gate against its own decay, so a hollowed-out approver shows up as a metric before it shows up as a liability. Across our wider AI governance practice, the order never changes: score by what an undo costs, then bind it where the model cannot argue back.

Questions

Is the IMDA Model AI Governance Framework for Agentic AI a law I have to comply with?

No. It is voluntary guidance and a living document of emerging best practice for any organisation deploying agentic AI, in-house or third-party. It is not industry-specific, carries no penalties, and is not mandatory. The binding-law equivalent is the EU AI Act (Regulation (EU) 2024/1689), whose human-oversight, record-keeping, and transparency duties mirror the same controls and do carry penalties: up to 15 million euros or 3 percent of worldwide annual turnover for breaches of those duties, with the top tier of 35 million euros or 7 percent reserved for prohibited practices.

What should an SME automate first when deploying an AI agent?

Automate the reversible majority first. Score every action by reversibility and impact, then fully automate the low-impact, easily undone work such as password resets, status replies, draft quotes, and routine triage. In the IMDA Dayos case study, roughly 60 percent of IT tickets were reversible enough to fully automate, about 30 percent kept a human approval, and 10 percent stayed manual. The return comes from the reversible 60 percent, not from automating the scary high-stakes 10 percent first.

What is the autonomy spectrum for human involvement?

The IMDA framework defines four levels: agent proposes and human operates (human approves every action); agent and human collaborate (approval at significant steps, human can intervene anytime); agent operates and human approves (approval only at critical steps such as deleting a database or making a payment above a threshold); and agent operates and human observes (no pre-approval, audited afterwards via an immutable log). Match the level to the action's risk score and enforce it structurally.

Why is a confirm step not enough on its own?

Because it decays. Anthropic's February 2026 research found auto-approval rises with operator experience, from about 20 percent for new users to over 40 percent for experienced ones, so automation bias grows as the human gets comfortable. The IMDA framework tells you to monitor override rate, where a near-zero rate signals rubber-stamping, and response time, where very short times signal review fatigue. Design the gate to surface only actions that genuinely need a human, and instrument it so a hollowed-out approver shows up as a metric.

Should agent permissions be enforced in the prompt or at the tool layer?

At the tool-execution layer. A prompt instruction such as "always ask before sending money" is a suggestion the model can hallucinate around. A permission check at the tool boundary is code, and the model cannot talk its way past it. The IMDA Tencent CodeBuddy example is the template: read needs no approval, but edit, bash, webfetch, and MCP calls all require it. MCP solves tool discovery, not tool safety, so use least-privilege tools, column-level access controls, and a deny-by-default gate that fails closed if the approval path is unreachable.

What is the single artifact that protects me in a legal audit?

The confirm step on high-stakes and irreversible actions, paired with an immutable audit log. That combination is the operational translation of EU AI Act Articles 12 (record-keeping), 14 (human oversight), and 50 (transparency). The cautionary precedent is Moffatt v. Air Canada (BC Civil Resolution Tribunal, February 2024), where an unbounded chatbot action became a binding, costly commitment because nothing gated or recorded the outbound promise.
