Data exfiltration via composition is one of the most insidious threats to AI agents. Each action in isolation appears legitimate and passes policy checks. Only when viewed together does the violation become apparent.
```
Action 1: db.query("SELECT * FROM customers")
  → ALLOW (user has read access to customers table)
Action 2: email.send(to="analyst@partner.com", body=query_results)
  → ALLOW (user can send email to partners)
Composition: Customer PII sent to external party
  → POLICY VIOLATION
```
Traditional security evaluates actions independently. AARM must reason about sequences and data flow.
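A minimal sketch of sequence-level reasoning, assuming a hypothetical policy engine: each action carries the data classifications it reads and whether it egresses data, and the evaluator propagates taint across the sequence. The `Action` fields and the `"PII"` label are illustrative, not part of any real AARM API.

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    tool: str                       # e.g. "db.query", "email.send"
    reads: set                      # data classifications this action pulls in
    sends_external: bool = False    # does this action egress data?

def evaluate_sequence(actions):
    """Evaluate actions as a sequence, propagating data taint.

    Each action alone may pass policy; the violation only appears
    once tainted (sensitive) data reaches an external sink.
    """
    taint = set()
    for i, action in enumerate(actions):
        taint |= action.reads  # data read earlier flows into later steps
        if action.sends_external and "PII" in taint:
            return (i, "DENY: sensitive data would reach an external sink")
    return (None, "ALLOW")

# The example above: each step passes alone, the composition is denied.
seq = [
    Action(tool="db.query", reads={"PII"}),
    Action(tool="email.send", reads=set(), sends_external=True),
]
print(evaluate_sequence(seq))        # denied at index 1
print(evaluate_sequence([seq[1]]))   # the send alone is allowed
```

Evaluating `email.send` in isolation returns ALLOW, which is exactly the gap per-action policy checks leave open.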
Data passed through multiple tools, each legitimate:
```python
# Read internal data
data = internal_api.get("/employee/salaries")

# "Analyze" with external tool
analysis = external_analytics.process(data)  # Data now on external server

# Get results back
results = external_analytics.get_results()
```
The data left your control at step 2, regardless of what happens after.
Data flow through an LLM context window creates a transformation boundary that makes tracking difficult. Structured data goes in; what comes out is natural language that may be summarized, paraphrased, or embedded in otherwise unrelated content.
| Input | Output | Challenge |
|---|---|---|
| `{"ssn": "123-45-6789"}` | "The customer's social security number is 123-45-6789" | Format change |
| Full document | 3-sentence summary | Information compression |
| 10,000 records | "Analysis shows 40% are in California" | Aggregation |
| Table + question | Natural language answer | Context embedding |
Tracking data lineage through these transformations is an open research problem.
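To make the difficulty concrete: pattern-based DLP handles the first row of the table but not the others. A sketch, using a standard SSN regex against the example outputs above:

```python
import re

# Matches SSNs in their canonical dashed form.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

outputs = [
    "The customer's social security number is 123-45-6789",  # format change
    "Analysis shows 40% are in California",                  # aggregation
]
hits = [bool(SSN_RE.search(o)) for o in outputs]
print(hits)  # [True, False]
```

The verbatim value survives the format change and is caught; the aggregated statistic leaks information about the same records yet matches nothing.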
Compositional data exfiltration is a partially solved problem in AARM. Full solutions require:
Data lineage tracking through model transformations
Semantic understanding of what information is “equivalent”
Taint analysis that survives summarization/paraphrasing
AARM provides significant risk reduction through classification, allowlists, and volumetric controls, but cannot guarantee prevention of all exfiltration paths.
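Of the controls named above, volumetric limits are the most mechanical to sketch. The class below is illustrative only; the per-session cap and the record-counting granularity are assumptions, not AARM specifics.

```python
class VolumetricControl:
    """Caps the number of records that may flow to external sinks
    in one session. Threshold is an illustrative default."""

    def __init__(self, max_records=100):
        self.max_records = max_records
        self.sent = 0

    def allow(self, record_count):
        # Deny once the cumulative egress budget would be exceeded.
        if self.sent + record_count > self.max_records:
            return False
        self.sent += record_count
        return True

vc = VolumetricControl(max_records=100)
print(vc.allow(60))  # True
print(vc.allow(40))  # True  (budget now exhausted)
print(vc.allow(1))   # False
```

A cap like this cannot stop a 3-sentence summary of 10,000 records, which is why it reduces risk rather than guaranteeing prevention.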