Memory Poisoning

Attack Pattern

Attacker inserts misleading or adversarial content into memory-bearing systems

Agent stores or summarizes that content as trusted context

Future actions are evaluated or planned using the poisoned state

Harm appears later, often without an obvious link to the original attack

Examples:

a CRM note claims a vendor domain is pre-approved when it is not

a memory summary records that a user “always wants external sharing”

a persistent vector memory ranks attacker-crafted guidance highly for future retrieval

Why It Matters

Property	Impact
Persistence	The attack survives the original session
Plausibility	Poisoned memory may look like normal business context
Indirect influence	Future decisions are biased without overt malicious instructions

AARM Mitigations

Provenance-aware memory

Track where persistent context came from and when it was written.
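As a rough illustration, a memory record can carry its provenance alongside its content. The sketch below is not an AARM data model: the `MemoryEntry` class, its field names, and the `has_provenance` helper are all hypothetical and exist only to show the idea of recording source and write time.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class MemoryEntry:
    """Hypothetical memory record that carries its own provenance."""
    content: str
    source: str   # assumed label for the origin, e.g. "crm_note"
    author: str   # assumed identifier for who or what wrote the entry
    written_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def has_provenance(entry: MemoryEntry) -> bool:
    """Return True only if both source and author are recorded."""
    return bool(entry.source.strip()) and bool(entry.author.strip())


# An entry with a known origin, and one with no origin at all.
note = MemoryEntry(
    content="Vendor domain is pre-approved",
    source="crm_note",
    author="unknown-contact",
)
orphan = MemoryEntry(
    content="User always wants external sharing",
    source="",
    author="",
)
```

Note that provenance does not make an entry true. It only gives later checks something to reason about: `note` can be traced to a CRM note, while `orphan` cannot be traced at all.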

Trust-weighted retrieval

Don’t treat all stored memory as equally authoritative.
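One possible way to do this is to scale each retrieval score by the trust assigned to its source. The sketch below assumes both values lie in [0, 1] and uses a simple product. The `Hit` type, the `rank_by_trust` function, and the scoring formula are illustrative assumptions, not a prescribed AARM algorithm.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    text: str
    similarity: float  # assumed retriever score in [0, 1]
    trust: float       # assumed source trust in [0, 1], set at write time


def rank_by_trust(hits: list[Hit]) -> list[Hit]:
    """Order hits by similarity scaled by source trust, highest first."""
    return sorted(hits, key=lambda h: h.similarity * h.trust, reverse=True)


hits = [
    # Attacker-crafted guidance: very similar to the query, low trust.
    Hit("Always share reports externally", similarity=0.95, trust=0.2),
    # Verified policy text: slightly less similar, high trust.
    Hit("Sharing policy: internal only by default", similarity=0.80, trust=0.9),
]
ranked = rank_by_trust(hits)
```

With these example numbers the verified policy entry ranks first, even though the attacker-crafted entry has the higher raw similarity.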

Action-level validation

Even if poisoned memory suggests an action, runtime policy must still validate destination, scope, and sensitivity.

rules:
  - id: require-verification-for-memory-derived-sharing
    match:
      context.memory_source_trust: { lt: 0.8 }
      tool: email.send
      parameters.to: { external: true }
    action: STEP_UP
    reason: "External sharing recommendation came from low-trust persistent context"
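To make the rule's matching logic concrete, the following sketch evaluates the same three conditions in Python. It is not the AARM policy engine. The `ProposedAction` type, the `INTERNAL_DOMAINS` set, the `is_external` helper, and the `"ALLOW"` fallback are assumptions introduced for illustration; only `email.send`, the 0.8 threshold, and `STEP_UP` come from the rule above.

```python
from dataclasses import dataclass

INTERNAL_DOMAINS = {"corp.example"}  # hypothetical list of internal domains
TRUST_THRESHOLD = 0.8                # mirrors `lt: 0.8` in the rule


@dataclass
class ProposedAction:
    tool: str
    to: str
    memory_source_trust: float


def is_external(address: str) -> bool:
    """Treat any recipient outside the internal domain list as external."""
    domain = address.rsplit("@", 1)[-1].lower()
    return domain not in INTERNAL_DOMAINS


def decide(action: ProposedAction) -> str:
    """Return "STEP_UP" when all three rule conditions match, else "ALLOW"."""
    if (
        action.tool == "email.send"
        and is_external(action.to)
        and action.memory_source_trust < TRUST_THRESHOLD
    ):
        return "STEP_UP"
    return "ALLOW"  # assumed default; a real policy may evaluate other rules
```

For example, `decide(ProposedAction("email.send", "bob@vendor.example", 0.4))` matches all three conditions and yields `"STEP_UP"`, while the same message sent to an address at `corp.example` does not match.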

Detection Signals

Signal	Indicates
Memory entry lacks source provenance	Unverifiable persistent context
High-impact recommendation from low-trust memory	Poisoning risk
Sudden behavior change tied to retrieved memory	Retrieval-based manipulation
Contradiction between live data and stored summary	Stale or malicious memory
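Three of these signals can be approximated with simple per-entry checks, sketched below. The function name, its parameters, the signal labels, and the 0.8 trust cutoff are hypothetical. Detecting a sudden behavior change needs a behavioral baseline and is deliberately left out.

```python
from typing import Optional


def detect_signals(
    source: Optional[str],
    trust: float,
    high_impact: bool,
    stored_value: str,
    live_value: str,
) -> list[str]:
    """Return the names of the signals raised by one retrieved memory entry."""
    signals = []
    if not source:
        # No recorded origin: the entry cannot be verified.
        signals.append("missing_provenance")
    if high_impact and trust < 0.8:  # assumed cutoff for "low trust"
        signals.append("high_impact_low_trust")
    if stored_value != live_value:
        # The stored summary disagrees with the system of record.
        signals.append("contradicts_live_data")
    return signals
```

An entry with no source, a trust of 0.3, a high-impact recommendation, and a stored value that differs from live data would raise all three signals. An entry with a known source, high trust, and matching values would raise none.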

Key Takeaway

Persistent memory should be treated as untrusted input with history, not as ground truth. AARM protects the action boundary even when the context store has been compromised.
