5 min read

The least sexy checklist that will keep your agent from burning down the org

Enterprise AI is no longer a thought experiment. Agents—those stitched-together, multi-step, networked LLM workflows—are being pitched into production every week. But here’s the thing: most of the risk isn’t in the model. It’s in the plumbing, the permissions, and the way you let a chain of calls loose on corporate systems.

Problem: agents do a lot, and permissions are fuzzy

Agents are powerful because they can call tools, read documents, and act—sometimes across services and cloud boundaries. That capability is also a liability. One mis-scoped permission or a too-handy internet fetch and an agent can leak secrets, corrupt records, or trigger actions someone forgot to gate.

Why this keeps happening

People treat agents like glorified macros. They hand them tokens, point them at a repo or a calendar, and assume the AI will behave. But agents combine several failure modes: lateral API calls, credential reuse, over-broad retrievals, and opaque decision logic. Add emergent planning that reorders steps and you have a machine that can escalate a tiny read access into a write operation across services.

Mechanism: where the plumbing exposes you

  • Tool chaining: Each step often needs a credential. If the agent holds a single long-lived token, every tool it touches becomes a blast radius.
  • Implicit trust in embeddings/RAG: Retrieval systems blur context boundaries; agents may confidently act on stale or incorrectly sourced data.
  • Action ambiguity: Natural language leaves room for interpretation; "update" can mean append, overwrite, or delete.
  • Monitoring gaps: Observability is often built for humans, not for opaque, multi-hop agent traces.
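The first bullet is worth making concrete. A minimal sketch of per-step credentials (all names here are invented for illustration): mint a token tied to one run, one tool, and one action set, so a compromised step cannot reuse it laterally.

```python
import secrets
import time

def issue_scoped_token(agent_run_id, tool, actions, ttl_seconds=300):
    """Mint a short-lived token tied to one run, one tool, one action set."""
    return {
        "token": secrets.token_urlsafe(16),
        "run": agent_run_id,
        "tool": tool,
        "actions": frozenset(actions),
        "expires_at": time.time() + ttl_seconds,
    }

def token_permits(tok, tool, action):
    """A token is useless outside its tool, its action set, or its TTL."""
    return (
        tok["tool"] == tool
        and action in tok["actions"]
        and time.time() < tok["expires_at"]
    )
```

With one token per tool per run, a compromised step can reach one tool with a handful of verbs for a few minutes, not the whole estate.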

Checklist playbook: what to do tomorrow

Skip the long policy drafts for now. Do these seven concrete things inside a week and you’ll massively reduce risk.

  • 1) Least-privilege tool tokens: Issue short-lived, scoped tokens per tool per agent instance. If you can, tie them to the agent run and revoke on completion.
  • 2) Action capability model: Explicitly register every action an agent can perform (read, list, create, update, delete), and require an allowlist lookup before the agent executes a step.
  • 3) Decision provenance headers: Force agents to emit structured reasons with each external call: what it asked, why, and what it expects to do with the result.
  • 4) RAG source tagging: When using retrieval, attach strong metadata to results (source id, freshness, trust score). Treat any low-trust result as “context only; human review required.”
  • 5) Human-in-the-loop gates: For destructive verbs (delete, modify production, send email), require a human confirmation token—ideally one-time use and recorded.
  • 6) Canary runs and simulation mode: Run agents in a simulated environment with canned responses before live runs. Compare planned against observed actions and block deviations.
  • 7) Audit-first telemetry: Log every step with immutable IDs and make the full trace available for quick playback. Not just status codes—log inputs, model traces, and final decisions.
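Items 2 and 3 combine naturally in code: look the step up in a registered capability allowlist, and refuse any call that arrives without a structured reason. A hedged sketch, with the registry, agent name, and provenance fields all invented for illustration:

```python
# Registered capabilities per agent; the allowlist is consulted before every step.
CAPABILITIES = {
    "contact-cleaner": {("crm", "read"), ("crm", "list"), ("crm", "propose_update")},
}

def execute_step(agent, tool, action, provenance):
    """Allowlist lookup plus a mandatory structured reason for the external call."""
    required = {"asked", "why", "expected_use"}
    missing = required - provenance.keys()
    if missing:
        raise ValueError(f"missing provenance fields: {sorted(missing)}")
    if (tool, action) not in CAPABILITIES.get(agent, set()):
        raise PermissionError(f"{agent} may not {action} on {tool}")
    # In a real system this record would go to audit-first telemetry (item 7).
    return {"agent": agent, "tool": tool, "action": action, "provenance": provenance}
```

Note the failure mode this buys you: an agent that plans a step nobody registered gets a hard error, not a quiet success.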

Concrete examples (short)

Example A: An agent is given access to a CRM to “clean bad contacts.” With least-privilege tokens you give it read/list and a sandboxed update queue. It can propose merges, but writes require a human confirmation token. That single change in flow prevents accidental mass-deletes.
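That flow can be sketched as a sandboxed queue: safe verbs pass straight through, destructive verbs wait for a one-time human confirmation token. The class and verb names are illustrative, not a real CRM API:

```python
DESTRUCTIVE = {"delete", "merge", "overwrite"}

class UpdateQueue:
    """Agent proposes writes; destructive ones are held for a human token."""

    def __init__(self):
        self.pending = []   # destructive proposals awaiting confirmation
        self.applied = []   # writes that have actually gone through

    def propose(self, action, payload):
        entry = {"action": action, "payload": payload}
        if action in DESTRUCTIVE:
            self.pending.append(entry)   # held, never auto-applied
        else:
            self.applied.append(entry)   # safe verbs pass through
        return entry

    def confirm(self, index, confirmation_token):
        """Apply one held entry; the token should be one-time use and recorded."""
        if not confirmation_token:
            raise PermissionError("destructive write requires a human token")
        self.applied.append(self.pending.pop(index))
```

The agent keeps its speed on reads and appends; the mass-delete scenario now requires a human to say so, once per destructive write.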

Example B: An agent integrates with cloud infra. Rather than giving it an org-wide cloud role, give it a task-scoped role that can only touch a single project. Use simulation to validate that the plan won’t escalate to org-level operations.
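The plan-versus-observed comparison from the canary step can be as simple as a set difference. A sketch, assuming each step is recorded as a (resource, action) pair; the pair format is an assumption, not a standard:

```python
def check_plan_against_observed(planned, observed):
    """Block any step the agent actually takes that was not in the vetted plan."""
    allowed = set(planned)
    deviations = [step for step in observed if step not in allowed]
    if deviations:
        raise RuntimeError(f"blocked: unplanned steps {deviations}")
    return True
```

Run this against the simulated trace first, then enforce it on the live run; an agent that tries to escalate from project scope to org scope shows up as a deviation before it executes.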

Pitfalls people ignore

  • Over-reliance on single-run explainability: A single human-readable explanation from a model is not adequate provenance.
  • Assuming embedding trust: If your retrieval includes user-uploaded docs, treat them as untrusted—especially when the agent can act on them.
  • Rewarding speed not safety: KPIs that prize agent throughput will bias engineers to widen permissions instead of tightening them.
  • Fuzzy roles: When teams own different services, no one owns the agent’s permissions. The result: cross-team blame and drift.

Next action (30–90 mins)

Run a quick inventory: list every agent in dev or prod and record, for each, the tokens it holds, the actions it can perform, and the sources it queries. If that list runs longer than three lines, schedule a 90-minute remediation sprint. Start with short-lived tokens and a single human gate on destructive actions.
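The inventory itself can live as a few lines of data plus one trigger. A toy sketch (the agent records and the "admin" heuristic are invented; substitute your own definition of a broad token):

```python
# One record per agent: the three facts the inventory asks for.
agents = [
    {"name": "contact-cleaner", "tokens": ["crm-readonly"],
     "actions": ["read", "list"], "sources": ["crm"]},
    {"name": "infra-helper", "tokens": ["cloud-admin"],
     "actions": ["create", "delete"], "sources": ["cloud-api", "wiki"]},
]

def needs_remediation(inventory, max_agents=3):
    """Flag the fleet if it exceeds the three-line threshold or holds broad tokens."""
    broad = [a["name"] for a in inventory if any("admin" in t for t in a["tokens"])]
    return len(inventory) > max_agents or bool(broad)
```

Even this toy fleet trips the check: one org-wide token is enough to justify the sprint.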

Why this matters

Agents are already making the enterprise faster. They can also make enterprise mistakes cheaper—if you treat them like toys. A pragmatic checklist, implemented as code (not wordy policy), buys you time to adopt better tooling, monitoring, and evaluation practices.

Make the plumbing boring again. Safety is the feature people stop noticing when it works.