Overview

Environmental manipulation targets the state around the agent rather than the prompt itself. Attackers change the surrounding environment so the agent observes misleading facts and plans harmful actions. Targets include:
  • feature flags
  • configuration state
  • file-system markers
  • task queues
  • inventory labels
  • health or status endpoints

Example

Agent goal: rotate expired credentials in production

Attacker changes:
- maintenance window flag = true
- target environment label = staging

Result:
Agent executes a high-impact production action under false environmental assumptions.

Why It Matters

Environmental manipulation is dangerous because the agent may be reasoning “correctly” from false premises. Traditional prompt defenses do not help, because they inspect the instructions the agent receives, not the environmental state it observes.

AARM Mitigations

Provenance for environmental facts

Track where critical context came from and how fresh it is.
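
As a minimal sketch of what tracking provenance and freshness could look like, the record below attaches a source and an observation time to each environmental fact. The names `EnvFact`, `is_fresh`, and the source label `deploy-inventory` are hypothetical and are not part of any AARM API shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class EnvFact:
    """One environmental fact plus where and when it was observed (hypothetical type)."""
    key: str               # e.g. "environment_label"
    value: str             # e.g. "production"
    source: str            # system that reported the fact (assumed naming)
    observed_at: datetime  # timezone-aware observation time


def is_fresh(fact: EnvFact, max_age: timedelta, now: Optional[datetime] = None) -> bool:
    """Return True if the fact was observed no longer than max_age before now."""
    current = now or datetime.now(timezone.utc)
    return current - fact.observed_at <= max_age


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fact = EnvFact("environment_label", "production", "deploy-inventory",
               now - timedelta(minutes=2))
print(is_fresh(fact, timedelta(minutes=5), now))
```

A policy layer could then refuse to treat a stale or unattributed fact as sufficient grounds for a high-impact action.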

Cross-check high-impact state

Require multiple sources or higher-confidence verification before destructive actions.
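
One way to sketch this cross-check, assuming each source reports a single value for the same fact: accept the value only when enough independent sources exist and all of them agree. The function name `cross_check` and the source names are illustrative assumptions.

```python
from typing import Dict, Optional


def cross_check(readings: Dict[str, str], min_sources: int = 2) -> Optional[str]:
    """Return the agreed value, or None if sources are too few or disagree.

    readings maps a source name to the value that source reported.
    """
    values = set(readings.values())
    if len(readings) >= min_sources and len(values) == 1:
        return next(iter(values))
    return None


# Two sources agree, so the value is accepted.
print(cross_check({"inventory": "production", "flag_service": "production"}))
# Sources disagree, so no value is accepted.
print(cross_check({"inventory": "production", "flag_service": "staging"}))
```

Requiring unanimity is the strictest choice; a real deployment might instead weight sources by trust, which this sketch does not attempt.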

Defer on conflicting environment signals

If system state is inconsistent, suspend the action until the ambiguity is resolved.

rules:
  - id: defer-on-environment-conflict
    match:
      tool: credentials.rotate
      context.environment_state_conflict: true
    action: DEFER
    reason: "Conflicting environment signals must be resolved before credential rotation"
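
The rule above does not show how `context.environment_state_conflict` is computed or how a rule is matched. The sketch below is one possible reading, not AARM's actual engine: `has_conflict` and `evaluate` are hypothetical helpers, and the rule is mirrored as a plain Python dict.

```python
from typing import Any, Dict, Optional

# The YAML rule mirrored as a dict (the reason field is omitted for brevity).
RULE: Dict[str, Any] = {
    "id": "defer-on-environment-conflict",
    "match": {
        "tool": "credentials.rotate",
        "context.environment_state_conflict": True,
    },
    "action": "DEFER",
}


def has_conflict(labels: Dict[str, str]) -> bool:
    """True if the sources report more than one distinct environment label."""
    return len(set(labels.values())) > 1


def evaluate(rule: Dict[str, Any], tool: str, context: Dict[str, Any]) -> Optional[str]:
    """Return the rule's action if every match condition holds, else None."""
    for key, expected in rule["match"].items():
        if key == "tool":
            actual = tool
        elif key.startswith("context."):
            actual = context.get(key[len("context."):])
        else:
            return None  # unknown match key: treat the rule as not matching
        if actual != expected:
            return None
    return rule["action"]


labels = {"inventory": "production", "flag_service": "staging"}
context = {"environment_state_conflict": has_conflict(labels)}
print(evaluate(RULE, "credentials.rotate", context))
```

Here the two sources disagree about the environment label, so the conflict flag is set and the rotation request matches the rule.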

Detection Signals

| Signal | Indicates |
| --- | --- |
| Inconsistent environment labels across systems | State tampering or stale metadata |
| High-impact action depends on a single mutable flag | Weak validation |
| Sudden environment changes before privileged action | Manipulation risk |
| Observed state differs from recent receipts or inventory | Integrity mismatch |
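
As an illustration of the "sudden environment changes before privileged action" signal, the sketch below flags an action when any environment change landed within a short window before it. The function name `changed_recently` and the ten-minute window are assumptions chosen for the example, not values from AARM.

```python
from datetime import datetime, timedelta
from typing import Iterable


def changed_recently(change_times: Iterable[datetime],
                     action_time: datetime,
                     window: timedelta) -> bool:
    """True if any change happened within `window` before (or at) the action."""
    return any(timedelta(0) <= action_time - t <= window for t in change_times)


action_at = datetime(2024, 1, 1, 12, 0)
changes = [
    datetime(2024, 1, 1, 9, 30),   # routine change hours earlier
    datetime(2024, 1, 1, 11, 57),  # flag flipped three minutes before the action
]
print(changed_recently(changes, action_at, timedelta(minutes=10)))
```

A positive result does not prove tampering; it only marks the action as one where the surrounding state deserves a second look.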

Key Takeaway

Agents are not only vulnerable to malicious instructions. They are also vulnerable to malicious world state. AARM protects against this by treating critical environmental facts as security-relevant inputs rather than passive background.

Next

Deferral Service

How to suspend actions when environmental facts conflict

Threat Model Overview

See how this fits into the broader AARM threat set