Overview
The confused deputy problem, first described in 1988, occurs when a privileged program is tricked into misusing its authority on behalf of an attacker. AI agents amplify this classic vulnerability:- Agents hold delegated credentials with broad permissions
- Agents interpret natural language that can be ambiguous or deceptive
- Agents make autonomous decisions without real-time human verification
- Agents process untrusted content as part of normal operation
Attack Pattern
1
Delegation
User grants agent access to systems: database, email, cloud APIs, file system
2
Manipulation
Attacker influences agent through crafted inputs, error messages, or tool outputs
3
Misuse
Agent uses its legitimate credentials to perform attacker’s desired action
4
Impact
Action executes successfully because credentials are valid—the system sees an authorized request
Attack Scenarios
Scenario 1: Destructive “Fix”
Scenario 2: Privilege Escalation Request
Scenario 3: “Cleanup” Data Theft
Why This Is Hard
| Challenge | Description |
|---|---|
| Legitimate credentials | Action passes all authentication/authorization checks |
| Plausible requests | Attacker crafts scenarios that seem reasonable |
| Context collapse | Agent can’t distinguish legitimate instructions from injected ones |
| Autonomy expectation | Agents are designed to act without constant verification |
AARM Mitigations
Step-Up Authorization
Require human approval for high-impact actions, breaking the autonomous execution chain:Action Context Validation
Evaluate whether the action makes sense given the session context:Anomaly Detection
Flag actions that deviate from established patterns:Receipts with Provenance
Track the full chain from input to action:Defense Principles
Distrust the Agent
Treat agent-initiated actions as potentially compromised, regardless of stated intent
Verify High-Impact
Require human confirmation for destructive, privileged, or irreversible operations
Track Provenance
Record why the agent decided to act—what input triggered the action
Limit Blast Radius
Scope credentials narrowly; prefer many limited tokens over few powerful ones
References
- Hardy, N. (1988). “The Confused Deputy: (or why capabilities might have been invented)”
- Miller, M. (2006). “Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control”
- OWASP. “LLM08: Excessive Agency”