See It Work — What stops an agent from doing something it shouldn't? | Book 10 · Scaling AI Agents

The board's security question is blunt: when an agent tries something it shouldn't, what actually stops it? Not a policy document — the system. Watch it refuse five unsafe actions in a row.

Most AI governance is a PDF nobody enforces. The difference that matters at scale: does the control fail closed — refuse by default unless an action is provably allowed — and does it leave evidence?

What this means for you

At fleet scale, governance you have to remember to apply will fail. This system fails closed by default and records every refusal, so the honest answer to "what stops it?" is "the architecture," with evidence attached. Your agents can't quietly do something they shouldn't, and you're not relying on anyone staying vigilant.
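For readers who want to see the idea in code, here is a minimal sketch of a fail-closed gate. It is an illustration only, not the book's implementation: the `Gate` class, the `ALLOWED` set, and the agent and action names are all hypothetical. It shows the two properties described above: an action is refused unless it is explicitly allowed, and every refusal is appended to a log.

```python
from dataclasses import dataclass, field

# Hypothetical allowlist: the only (agent, action) pairs that are permitted.
ALLOWED = {("billing-agent", "read_invoice")}


@dataclass
class Gate:
    """Illustrative fail-closed gate: deny unless explicitly allowed."""

    allowed: set
    log: list = field(default_factory=list)

    def check(self, agent: str, action: str) -> bool:
        try:
            permitted = (agent, action) in self.allowed
        except Exception:
            # If the check itself breaks, treat it as a refusal (fail closed).
            permitted = False
        if not permitted:
            # Record every refusal so there is evidence afterwards.
            self.log.append(
                {"agent": agent, "action": action, "verdict": "denied"}
            )
        return permitted
```

Calling `Gate(allowed=ALLOWED).check("billing-agent", "delete_ledger")` returns `False` and leaves one entry in `log`. The design choice worth noticing is the default: nothing is permitted unless it appears in the allowlist, so forgetting to write a rule results in a refusal rather than an unguarded action.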

Each refusal was recorded. Here's the receipt of the five denials:

Fail-Closed Verdict Record

requests: 5
verdicts: denied · blocked · escalated · rejected · refused
fail_closed: true
all_logged: true

Every denial is attested — so "we have controls" becomes "here are the five times the controls fired, on the record."

For the technical reader — the command, and how to verify it yourself

# one line · you do not need to run this
python examples/fail_closed_demo.py

jq '.denials | length' memory/FailClosed_summary.json
# -> 5
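
If `jq` is not installed, the same check can be sketched in Python. This assumes, as the `jq` filter above implies, that the summary file is JSON with a top-level `denials` array; the helper name `count_denials` is hypothetical and not part of the demo.

```python
import json
from pathlib import Path


def count_denials(summary_path: str) -> int:
    """Return how many entries sit under the top-level "denials" key."""
    data = json.loads(Path(summary_path).read_text(encoding="utf-8"))
    denials = data["denials"]
    if not isinstance(denials, list):
        # Assumption: denials is stored as a JSON array of records.
        raise ValueError('expected "denials" to be a JSON array')
    return len(denials)


# Example call, run after the demo has written its summary file:
# count_denials("memory/FailClosed_summary.json")
```

A result that matches the `requests` figure in the verdict record means every request in the run produced a recorded denial.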

Full step-by-step is in Appendix RX: Hands-On Demonstrations in the book.
