Bound the Risk Before You Deploy: How to Tier Agentic AI by Reversibility, Not Hype
Treat an AI agent as a capable junior employee under position limits. Score every action by what an undo costs, automate the reversible majority, and gate the rest. That is the cheapest insurance an SME can buy.
By Siddharth Surana, Founder & CEO / 8 min read
Bound the risk before you deploy an agent, and bound it by reversibility, not by how clever the agent feels. The single most useful move when putting an agent into production is to stop treating it as software that occasionally errs and start treating it as a capable junior employee under position limits. A junior reads the file, drafts the quote, formats the reply. A junior never signs a binding contract or moves money without a manager's signature. The confirm step is not a software limitation. It is ordinary corporate governance applied to a synthetic employee, the same four-eyes principle and position limits a trading desk uses, where nobody trusts the algorithm. They trust the circuit breakers. This is the posture the IMDA Model AI Governance Framework for Agentic AI puts first as Dimension 1: assess and bound the risks upfront. That framework is voluntary guidance and a living document, not law and not industry-specific. Its first instruction is still the one worth building around.
Risk is likelihood times impact, and impact is mostly reversibility
The IMDA framework defines agent risk as likelihood times impact, then names the factors for each. Impact factors are the domain's error tolerance, access to sensitive data, access to external systems, the scope of an action (read versus write), and reversibility. Likelihood factors are the autonomy level, task complexity, external exposure, whether the system is provided or operated by an external party, and system complexity such as multi-agent setups.
Of those impact factors, reversibility is the one that should drive your entire deployment map, and the field data explains why. Anthropic's Measuring agent autonomy in practice (18 February 2026) found that across real agent usage, 80 percent of tool calls carry at least one safeguard, 73 percent have a human in the loop, and only 0.8 percent of actions are actually irreversible, such as sending an email to a customer. Tier by how autonomous an agent feels and every task looks risky. Tier by what cannot be undone and the genuinely dangerous slice turns out to be tiny and nameable, while the overwhelming majority of work is safe to automate freely. The hype frames the question as how much do we trust the AI. The institutional-grade frame is how much does an undo cost. Those two questions produce very different deployment plans.
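One way to make "how much does an undo cost" concrete is to score each action rather than the agent. The sketch below is illustrative only: the IMDA framework names the factors but does not prescribe numeric scales, so the 1 to 5 scales, the `ActionProfile` fields, and the rule that impact is never lower than the undo cost are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical 1-5 scales. The framework names the factors but gives no
# numbers, so every scale and rule below is an assumption.

@dataclass(frozen=True)
class ActionProfile:
    name: str
    likelihood: int    # 1 (rare) .. 5 (frequent): autonomy, complexity, exposure
    undo_cost: int     # 1 (trivial to undo) .. 5 (cannot be undone)
    other_impact: int  # 1 .. 5: data sensitivity, external systems, write scope


def impact(action: ActionProfile) -> int:
    # Assumed rule: reversibility dominates, so impact is never scored
    # lower than the cost of undoing the action.
    return max(action.undo_cost, action.other_impact)


def risk(action: ActionProfile) -> int:
    # Risk = likelihood x impact, on a 1..25 scale.
    return action.likelihood * impact(action)


actions = [
    ActionProfile("draft_quote", likelihood=3, undo_cost=1, other_impact=2),
    ActionProfile("reset_password", likelihood=3, undo_cost=1, other_impact=2),
    ActionProfile("send_customer_email", likelihood=3, undo_cost=5, other_impact=2),
]

# List the riskiest actions first.
for a in sorted(actions, key=risk, reverse=True):
    print(f"{a.name}: risk={risk(a)}")
```

With identical likelihood, the only thing separating the customer email from the draft quote is what it costs to take it back, which is the point of tiering by reversibility.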
The autonomy spectrum, and the control each level requires
The IMDA framework sets out four levels of human involvement. The job is to match the level to the action's risk score, then enforce that match structurally. The table below maps each level to when it fits and the control it requires.
| Human-involvement level | When to use it | Example action | Control required |
|---|---|---|---|
| Agent proposes, human operates | High impact or low error tolerance, early rollout, untrusted data | Agent drafts a customer refund decision; human executes it | Human approves every action before it fires |
| Agent and human collaborate | Medium risk with some write access; human can intervene anytime | Agent prepares a database write or a payment, pauses at the significant step | Approval at significant steps, human can interrupt at any point |
| Agent operates, human approves | Mostly reversible work with a few critical, high-stakes steps | Agent runs routine triage but stops before deleting a database or a payment above a set amount | Approval only at critical or above-threshold steps |
| Agent operates, human observes | Fully reversible, low-impact, well-tested tasks | Agent sends a status reply or resets a password | No pre-approval; immutable log audited after the fact |
The same impact and likelihood factors that score the action also tell you which row it belongs in. A short reference of IMDA's Dimension 1 factors:
| Impact factors | Likelihood factors |
|---|---|
| Reversibility (can the action be undone) | Autonomy level |
| Access to sensitive data | Task complexity |
| Access to external systems | External exposure |
| Scope: read versus write | Provided or operated by an external party |
| Domain error tolerance | System complexity (multi-agent) |
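
As a rough illustration of placing an action in a row, the sketch below maps a risk score to one of the four human-involvement levels. The level names come from the framework as described above; the 1 to 25 score range, the thresholds, and the rule that any irreversible action goes to the strictest level are assumptions chosen for the example, not values from IMDA.

```python
from enum import Enum


class Level(Enum):
    HUMAN_OPERATES = "agent proposes, human operates"
    COLLABORATE = "agent and human collaborate"
    HUMAN_APPROVES = "agent operates, human approves"
    HUMAN_OBSERVES = "agent operates, human observes"


def assign_level(risk_score: int, reversible: bool) -> Level:
    """Map a 1..25 risk score to a human-involvement level.

    The thresholds are illustrative assumptions; tune them to your own
    domain's error tolerance.
    """
    # Assumed conservative rule: anything that cannot be undone gets the
    # strictest control regardless of its score.
    if not reversible or risk_score >= 15:
        return Level.HUMAN_OPERATES
    if risk_score >= 10:
        return Level.COLLABORATE
    if risk_score >= 5:
        return Level.HUMAN_APPROVES
    return Level.HUMAN_OBSERVES


print(assign_level(3, reversible=True).value)
print(assign_level(3, reversible=False).value)
```

The same low score lands in opposite rows depending on whether the action can be undone, which keeps reversibility as the deciding factor.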
The Dayos worked example: tier the tickets, automate the reversible
The IMDA document includes Dayos as a real case study. Dayos tiered its IT tickets by risk and acted on the tiers. Roughly 60 percent of tickets were reversible enough to fully automate (Tier 1). About 30 percent kept a human approval (Tier 2). The remaining 10 percent stayed manual (Tier 3). On the strength of that, Dayos replaced its ServiceNow instance and reduced legacy licensing cost by 121,000 dollars annually.
The number that matters is not the 121,000 dollars. It is the tier mix that made the cut safe. The return-on-investment lever is the reversible 60 percent, not the agent. Most owners try to automate the impressive 10 percent first, the irreversible high-stakes work, because that is where the pain feels biggest. That is exactly the wrong order. Automate the boring reversible majority first: password resets, status replies, draft quotes, routine triage. Reversible work compounds value with near-zero downside, and it builds the audit history that earns the trust to widen the boundary later. The confirm step on the middle tier is not a friction tax. It is what lets you book the Tier 1 savings without owning Tier 3 liability.
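To see your own tier mix before automating anything, a simple count over historical tickets is enough. The sketch below is a hypothetical illustration: the category names and tier assignments are invented, and the ten-ticket sample is made up to mirror the 60/30/10 split described above. It is not Dayos's data or method.

```python
from collections import Counter

# Hypothetical categories and tier assignments, for illustration only.
TIER_BY_CATEGORY = {
    "password_reset": 1,    # Tier 1: reversible, fully automated
    "status_reply": 1,
    "access_request": 2,    # Tier 2: agent prepares, human approves
    "software_install": 2,
    "database_change": 3,   # Tier 3: stays manual
}


def tier_for(category: str) -> int:
    # Unknown categories default to the most restrictive tier.
    return TIER_BY_CATEGORY.get(category, 3)


def tier_mix(categories: list[str]) -> dict[int, float]:
    """Return the share of tickets that falls in each tier."""
    total = len(categories)
    if total == 0:
        return {1: 0.0, 2: 0.0, 3: 0.0}
    counts = Counter(tier_for(c) for c in categories)
    return {tier: counts.get(tier, 0) / total for tier in (1, 2, 3)}


# Made-up sample of ten tickets, chosen to mirror a 60/30/10 split.
sample = (
    ["password_reset"] * 4
    + ["status_reply"] * 2
    + ["access_request"] * 2
    + ["software_install"]
    + ["unrecognised_request"]
)
print(tier_mix(sample))
```

Defaulting unknown categories to Tier 3 means a new ticket type has to earn its way into automation rather than slipping in unreviewed.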
How to score and bound it: structural gates, not prompt suggestions
For the technical lead, the rule is to score every action, not the agent, against the Dimension 1 impact factors, then bind each action structurally rather than in the prompt. A prompt that says "always ask before sending money" is a suggestion the model can hallucinate around. A permission check at the tool-execution layer cannot be argued with. The IMDA framework's Tencent CodeBuddy case is the clean template: read requires no approval, but edit, bash, webfetch, and MCP calls all require it, enforced at the tool boundary.
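A minimal sketch of a gate at the tool boundary might look like the following. The tool names follow the CodeBuddy example in the text, but the code is a hypothetical illustration and not CodeBuddy's implementation: `execute_tool`, `PermissionDenied`, and the `run` and `approve` callbacks are names invented for this example.

```python
from typing import Callable

# Tool names follow the example in the text; the gate itself is a
# hypothetical sketch.
NO_APPROVAL = {"read"}
NEEDS_APPROVAL = {"edit", "bash", "webfetch", "mcp"}


class PermissionDenied(Exception):
    """Raised when a tool call is blocked at the execution layer."""


def execute_tool(
    tool: str,
    args: dict,
    run: Callable[[dict], str],
    approve: Callable[[str, dict], bool],
) -> str:
    # Read-only tools run without a confirm step.
    if tool in NO_APPROVAL:
        return run(args)
    # Write-capable tools run only after an explicit approval.
    if tool in NEEDS_APPROVAL and approve(tool, args):
        return run(args)
    # Unapproved calls and tools on neither list are blocked.
    raise PermissionDenied(f"{tool} blocked at the tool boundary")


print(execute_tool("read", {"path": "quote.txt"}, lambda a: "file contents", lambda t, a: False))
```

Because the check runs in ordinary code after the model has produced its tool call, nothing the model writes in its output can change which branch executes.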
Critically, MCP solves tool discovery, not tool safety. A whitelisted MCP server still happily executes a hallucinated parameter, so the bound has to sit at execution, with least-privilege tools and, in our own builds, column-level RLS or RBAC. Approval infrastructure must deny by default. If the confirm path is unreachable, the action does not fire. It fails closed. The IMDA framework is explicit here: prefer structural and rule-based system-level controls over prompt-layer controls for higher-risk actions, and deny action by default when the approval infrastructure fails.
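The fail-closed behaviour can be sketched in a few lines. Everything here is an assumption made for illustration: the payee allow-list, the amount limit, and the `ask_human` callback stand in for whatever validation rules and approval channel a real deployment uses.

```python
from typing import Callable

KNOWN_PAYEES = {"acme-supplies", "city-utilities"}  # hypothetical allow-list
MAX_AMOUNT = 500.00                                  # hypothetical position limit


def validate_payment(payee: str, amount: float) -> bool:
    # Reject hallucinated parameters before anyone is asked to approve them.
    return payee in KNOWN_PAYEES and 0 < amount <= MAX_AMOUNT


def approved(ask_human: Callable[[str], bool], summary: str) -> bool:
    # Fail closed: any error on the confirm path counts as a denial.
    try:
        return ask_human(summary) is True
    except Exception:
        return False


def pay(payee: str, amount: float, ask_human: Callable[[str], bool]) -> str:
    if not validate_payment(payee, amount):
        return "denied: invalid parameters"
    if not approved(ask_human, f"Pay {amount:.2f} to {payee}?"):
        return "denied: no approval"
    return "executed"


def approval_service_down(summary: str) -> bool:
    # Simulates an unreachable approval channel.
    raise TimeoutError("approval service unreachable")


print(pay("acme-supplies", 120.0, approval_service_down))
```

The `is True` comparison is deliberate: a missing, malformed, or ambiguous response from the approval channel is treated the same as a refusal.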
The market is converging on bounded permission, and the failures prove the inverse
This is not a Singapore-only or a guidance-only idea. The payment rails being built for agents are designed around bounded permission and accountability. Google's Agent Payments Protocol (AP2) launched on 17 September 2025 with more than 60 partners, using cryptographically signed Mandates, Verifiable Credentials, and non-repudiable audit trails from intent to cart to payment. The Stripe and OpenAI Agentic Commerce Protocol, announced 29 September 2025, powers ChatGPT Instant Checkout but keeps the business as the merchant of record, accepting or declining transactions with its own fraud signals, so accountability sits with the deployer, not the agent. Visa Intelligent Commerce embeds credentials, controls, and authentication into agent transactions. None of these is a let-the-agent-spend-freely system.
The regulation mirrors it. Where IMDA offers voluntary guidance with no penalties, the EU AI Act (Regulation (EU) 2024/1689) is binding law that carries financial penalties. Its human-oversight (Art. 14), record-keeping (Art. 12), and transparency (Art. 50) duties are the legal twin of the confirm-step, audit-log, and AI-disclosure stack.
The failure data confirms the inverse. Gartner predicts that over 40 percent of agentic AI projects will be cancelled by end of 2027 (press release 25 June 2025), citing escalating costs, unclear business value, and inadequate risk controls, alongside an agent-washing critique of vendors rebranding RPA and chatbots as agents. The one real legal precedent is Moffatt v. Air Canada (BC Civil Resolution Tribunal, February 2024), where the airline was held liable for a refund policy its chatbot hallucinated. That is the canonical case of an unbounded outbound action becoming a binding, costly commitment.
The confirm step is the part most likely to fail silently
Here is the contrarian read that most agent-governance plans miss. The confirm step is the part most likely to decay, not the part most likely to save you. Everyone treats human-in-the-loop as the safety net. The net has a hole that widens with use. Anthropic's February 2026 data found that auto-approval rises with operator experience, from roughly 20 percent for new users to over 40 percent for experienced ones, while interrupt rates also rise (5 percent to 9 percent), as users shift from approving each action to active monitoring. IMDA's Dimension 2 names the same hazard and tells you to measure it: monitor the human override rate, where a rate near zero signals rubber-stamping, and response time, where very short times signal review fatigue.
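Both signals are cheap to compute from the approval log. The sketch below assumes each review is recorded with whether the human overrode the agent and how long the decision took. The `Review` record and both thresholds are hypothetical; the framework names the metrics but the cut-offs here are placeholders to be calibrated against your own baseline.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Review:
    overridden: bool          # True if the reviewer rejected or edited the proposal
    seconds_to_decide: float  # time between the prompt and the decision


def review_health(reviews, min_override_rate=0.02, min_median_seconds=5.0):
    """Flag signs that the confirm step is decaying.

    Both thresholds are illustrative assumptions, not published values.
    """
    if not reviews:
        return {"override_rate": None, "median_seconds": None,
                "flags": ["no reviews recorded"]}
    override_rate = sum(r.overridden for r in reviews) / len(reviews)
    median_seconds = median(r.seconds_to_decide for r in reviews)
    flags = []
    if override_rate < min_override_rate:
        flags.append("possible rubber-stamping")
    if median_seconds < min_median_seconds:
        flags.append("possible review fatigue")
    return {"override_rate": override_rate,
            "median_seconds": median_seconds,
            "flags": flags}


# A too-easy day: a hundred approvals, none overridden, two seconds each.
easy_day = [Review(overridden=False, seconds_to_decide=2.0)] * 100
print(review_health(easy_day)["flags"])
```

Reporting these numbers per reviewer and per week, rather than as a single lifetime figure, makes a drift toward rubber-stamping visible while it is still small.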
For risk and compliance, the confirm step plus an immutable audit log is the single artifact that survives a legal audit, and the operational translation of EU AI Act Articles 12, 14, and 50. But a confirm step you do not measure is theatre. For the end user in the loop, the quiet risk is deskilling, the loss-of-tradecraft hazard IMDA flags in Dimension 4. If the agent drafts every quote and you only ever approve, your judgment for what a wrong quote looks like atrophies, which is the exact moment your approval becomes worthless. Keep doing a sample of Tier 2 work by hand. Treat a too-easy day of approvals as a warning, not a win.
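One common way to make an audit log tamper-evident is to chain each entry to the hash of the one before it. The sketch below illustrates that general technique using Python's standard `hashlib` and `json` modules. It is an assumption about how such a log could be built, not a description of any specific product, and a production log would also need write-once storage and access controls.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    # Hash the record together with the previous entry's hash.
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list, record: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash,
                "hash": _digest(prev_hash, record)})


def verify(log: list) -> bool:
    # Recompute every hash; any edited or reordered entry breaks the chain.
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"action": "send_quote", "approved_by": "reviewer-1"})
append_entry(log, {"action": "issue_refund", "approved_by": "reviewer-2"})
print(verify(log))
```

Recording who approved each action alongside the action itself is what ties the log back to the confirm step.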
Where Origin Pi stands
Stop debating how smart the agent is. Treat it as a capable junior employee under position limits, and structure the work by reversibility, not hype. That maps one to one onto IMDA Dimension 1, assess and bound the risks upfront, and onto our own building rule. The strongest signal that this posture is correct rather than merely cautious is that the entire 2026 market is independently converging on it: the payment rails are built around signed mandates and non-repudiable trails, the regulation is built around oversight and record-keeping, and the cancelled projects and the one real legal precedent are built around its absence.
So our reading is plain. Bounded permissions and the confirm step are the cheapest insurance any SME can buy against the most expensive class of agent failure. We build the confirm step first, not because a regulator compels it (IMDA does not, and the EU AI Act applies to a narrower set of high-risk deployers), but because it is the same control that survives an audit, caps the downside, and lets an owner book the reversible-majority return without inheriting the irreversible-minority liability. That is the agent-ready business layer thesis in one line: automate the reversible, gate the rest, log everything, and keep the human a real reviewer. The deeper craft is not adding the gate. It is designing the gate against its own decay, so a hollowed-out approver shows up as a metric before it shows up as a liability. Across our wider AI governance practice, the order never changes: score by what an undo costs, then bind it where the model cannot argue back.