
Designing AI agents that escalate well

The hardest part of an autonomous agent is teaching it when not to be autonomous.

March 8, 2026 · 9 min read
Engineering

Most of the AI agent failures we see in production aren't cases of the agent doing the wrong thing; they're cases of the agent acting when it should have stopped to ask.

Good escalation design starts with confidence scoring. Every agent decision should produce a confidence score, and you should know empirically which threshold separates "just do it" from "check with a human".
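As a minimal sketch, confidence-gated routing can look like this. The threshold value, the `Decision` shape, and the function names are illustrative assumptions, not a prescribed API:

```python
from dataclasses import dataclass

# Illustrative value: in practice this is calibrated empirically (see below).
CONFIDENCE_THRESHOLD = 0.85

@dataclass
class Decision:
    action: str
    confidence: float  # 0.0-1.0, produced alongside every decision

def route(decision: Decision) -> str:
    """Return 'execute' when the agent is confident enough, else 'escalate'."""
    if decision.confidence >= CONFIDENCE_THRESHOLD:
        return "execute"
    return "escalate"

print(route(Decision("refund_order", 0.93)))  # execute
print(route(Decision("refund_order", 0.61)))  # escalate
```

The key property is that every path through the agent ends in one of exactly two outcomes, so there is no silent third state where low-confidence actions slip through.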

We typically calibrate this against historical human decisions. Run the agent against last quarter's tickets in shadow mode, measure where it agreed and disagreed with humans, and pick a threshold that balances autonomy and safety.
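The calibration step can be sketched as a threshold sweep over shadow-mode results. Each record pairs the agent's confidence with whether it agreed with the human decision on the same ticket; the 95% target agreement rate and the function name are illustrative assumptions:

```python
def pick_threshold(shadow_results, target_agreement=0.95):
    """Lowest threshold at which decisions above it agreed with humans
    at least `target_agreement` of the time (maximizing autonomy)."""
    candidates = sorted({conf for conf, _ in shadow_results})
    for threshold in candidates:
        above = [agreed for conf, agreed in shadow_results if conf >= threshold]
        if above and sum(above) / len(above) >= target_agreement:
            return threshold
    return None  # no threshold is safe enough; escalate everything

# (confidence, agreed_with_human) pairs from a shadow-mode run
shadow = [(0.99, True), (0.95, True), (0.90, True), (0.85, False),
          (0.80, True), (0.70, False), (0.60, False)]
print(pick_threshold(shadow))  # 0.9
```

Picking the lowest qualifying threshold biases toward autonomy; picking a higher one biases toward safety. That trade-off is a product decision, not a modeling detail.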

Beyond confidence, design escalation paths that respect human time. Bad escalations are vague ("this needs review"); good escalations include the agent's reasoning, the alternatives it considered, and a recommended action with one click to accept.
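One way to enforce that shape is to make the escalation a structured object rather than a free-text message. A sketch, with illustrative field names:

```python
from dataclasses import dataclass

@dataclass
class Escalation:
    summary: str             # one line: what the agent wants to do and why it paused
    reasoning: str           # the evidence and steps the agent followed
    alternatives: list[str]  # options it considered and rejected
    recommended_action: str  # pre-filled so the reviewer can accept in one click
    confidence: float

    def to_review_card(self) -> dict:
        """Shape for a one-click review UI: accepting applies recommended_action."""
        return {
            "summary": self.summary,
            "why": self.reasoning,
            "considered": self.alternatives,
            "accept_applies": self.recommended_action,
            "confidence": self.confidence,
        }

card = Escalation(
    summary="Refund order #4521 exceeds my auto-approve limit",
    reasoning="Customer is within the return window; item reported defective.",
    alternatives=["offer store credit", "request photos first"],
    recommended_action="issue full refund of $340",
    confidence=0.78,
).to_review_card()
```

Because the recommended action is a required field, a vague "this needs review" escalation becomes impossible to construct, and the reviewer's default path is a single accept click.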

