Overview
Environmental manipulation targets the state around the agent rather than the prompt itself. Attackers change the surrounding environment so the agent observes misleading facts and plans harmful actions. Targets include:
- feature flags
- configuration state
- file-system markers
- task queues
- inventory labels
- health or status endpoints
Example
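As one hypothetical illustration, consider a cleanup agent that decides which hosts to wipe based on an inventory label. The sketch below is a toy, not AARM code: `Host`, `plan_cleanup`, and the host names are all invented for this example. It shows how a single relabeled inventory record changes the agent's plan even though its instructions stay the same.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    env_label: str  # mutable metadata; anyone with inventory write access can change it


def plan_cleanup(hosts):
    """Naive planner (hypothetical): trusts the inventory label as ground truth."""
    return [f"wipe-disk {h.name}" for h in hosts if h.env_label == "staging"]


inventory = [Host("db-prod-01", "production"), Host("db-stg-01", "staging")]
before = plan_cleanup(inventory)

# The attacker never touches the prompt; they only relabel one inventory record.
inventory[0].env_label = "staging"
after = plan_cleanup(inventory)

print(before)  # ['wipe-disk db-stg-01']
print(after)   # ['wipe-disk db-prod-01', 'wipe-disk db-stg-01']
```

The planner's logic is identical in both runs. Only the observed state changed, so a defense that inspects the prompt has nothing to flag.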
Why It Matters
Environmental manipulation is dangerous because the agent may be reasoning “correctly” from false premises. Traditional prompt defenses do not help, because the malicious input arrives through observed state rather than through the prompt.
AARM Mitigations
Provenance for environmental facts
Track where critical context came from and how fresh it is.
Cross-check high-impact state
Require multiple sources or higher-confidence verification before destructive actions.
Defer on conflicting environment signals
If system state is inconsistent, suspend the action until the ambiguity is resolved.
Detection Signals
| Signal | Indicates |
|---|---|
| Inconsistent environment labels across systems | State tampering or stale metadata |
| High-impact action depends on a single mutable flag | Weak validation |
| Sudden environment changes before privileged action | Manipulation risk |
| Observed state differs from recent receipts or inventory | Integrity mismatch |
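The mitigations and several of these signals can be combined into a single pre-action check. The following is a minimal sketch under stated assumptions, not the AARM API or its Deferral Service: `Fact`, `Decision`, `evaluate`, the signal names, and the thresholds `max_age_s` and `min_sources` are all hypothetical. Each fact carries its source and read time (provenance and freshness), and the check defers whenever the facts are stale, come from too few sources, or disagree.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DEFER = "defer"


@dataclass(frozen=True)
class Fact:
    key: str            # what the fact describes, e.g. "db-01.env"
    value: str          # observed value, e.g. "staging"
    source: str         # where the value was read from (provenance)
    observed_at: float  # Unix timestamp of the read (freshness)


def evaluate(facts, now, max_age_s=300.0, min_sources=2):
    """Return (decision, signals) for a high-impact action that depends on `facts`.

    Hypothetical policy: allow only if every fact is fresh, at least
    `min_sources` independent sources were consulted, and they all agree.
    """
    signals = []
    fresh = [f for f in facts if now - f.observed_at <= max_age_s]
    if len(fresh) < len(facts):
        signals.append("stale_fact")
    if len({f.source for f in fresh}) < min_sources:
        signals.append("single_source")
    if len({f.value for f in fresh}) > 1:
        signals.append("conflicting_values")
    decision = Decision.DEFER if signals else Decision.ALLOW
    return decision, signals


now = 1_700_000_000.0  # fixed timestamp so the example is deterministic
agree = [
    Fact("db-01.env", "staging", "inventory", now - 10),
    Fact("db-01.env", "staging", "deploy-receipt", now - 20),
]
conflict = [
    Fact("db-01.env", "staging", "inventory", now - 10),
    Fact("db-01.env", "production", "deploy-receipt", now - 20),
]

for name, facts in [("agree", agree), ("conflict", conflict), ("single", agree[:1])]:
    decision, signals = evaluate(facts, now)
    print(name, decision.value, signals)
```

In this sketch, `conflicting_values` corresponds roughly to the label-inconsistency and integrity-mismatch rows of the table, and `single_source` to the single-mutable-flag row. The "sudden environment changes" signal is not covered here, since detecting it would need a history of earlier observations.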
Key Takeaway
Agents are not only vulnerable to malicious instructions. They are also vulnerable to malicious world state. AARM protects against this by treating critical environmental facts as security-relevant inputs rather than passive background.
Next
Deferral Service
How to suspend actions when environmental facts conflict
Threat Model Overview
See how this fits into the broader AARM threat set