
Overview

Not all actions can be evaluated the same way. A static policy that simply allows or denies based on action type will either block legitimate work or miss actual threats. AARM classifies actions into three categories based on how they should be evaluated:

Forbidden

Always blocked regardless of context

Context-Dependent Deny

Allowed by policy, blocked when context reveals misalignment

Context-Dependent Allow

Denied by default, permitted when context confirms alignment

Why Classification Matters

Consider an AI agent that can send emails and query databases. Both capabilities are legitimate and necessary.

Scenario 1: Agent sends a meeting invite to a colleague. → Clearly fine. Allow.

Scenario 2: Agent queries customer PII, then immediately sends an email to an external address. → Both actions are “allowed” individually. But the composition is a data breach.

Scenario 3: Agent attempts to delete database records. Sounds dangerous. But the user explicitly said “clean up my test data from yesterday.” → The action aligns perfectly with stated intent. Blocking it frustrates the user for no security benefit.

Static allow/deny policies cannot handle these scenarios. Classification based on context can.

The Three Categories

Forbidden Actions

Actions that are always blocked regardless of context, intent, or user request. These represent hard organizational limits.
| Characteristics | Examples |
|---|---|
| Catastrophic, irreversible impact | DROP DATABASE production |
| Violates compliance requirements | Send unencrypted PII to external systems |
| Known malicious patterns | Connect to known C2 domains |
| Explicitly prohibited by policy | rm -rf /, disable security controls |

Evaluation: Static policy match → DENY

No context evaluation needed. These actions are never permitted regardless of how convincing the justification appears.

# Example policy
rules:
  - id: forbidden-drop-database
    match:
      tool: database
      operation: drop
    action: DENY
    classification: forbidden
    reason: "Database drops are never permitted via agent"
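
The static match above can be sketched in a few lines of Python. This is a minimal illustration, not AARM's actual engine: the `ForbiddenRule` class, the `FORBIDDEN_RULES` list, and `check_forbidden` are hypothetical names, and the rule fields simply mirror the example policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ForbiddenRule:
    """Hypothetical in-memory form of a forbidden rule from the policy file."""
    rule_id: str
    tool: str
    operation: str
    reason: str


# Assumed rule set, copied from the example policy above.
FORBIDDEN_RULES = [
    ForbiddenRule(
        rule_id="forbidden-drop-database",
        tool="database",
        operation="drop",
        reason="Database drops are never permitted via agent",
    ),
]


def check_forbidden(tool: str, operation: str,
                    rules=FORBIDDEN_RULES) -> Optional[ForbiddenRule]:
    """Return the first matching forbidden rule, or None.

    Only the tool and operation are inspected; no session context
    or user request is consulted.
    """
    for rule in rules:
        if rule.tool == tool and rule.operation == operation:
            return rule
    return None
```

Because the function takes no context argument, a persuasive justification has no path to influence the result, which is the property this category depends on.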

Context-Dependent Deny

Actions that are explicitly allowed by policy but should be blocked when context reveals inconsistency with the user’s stated intent.
| Characteristics | Examples |
|---|---|
| Permitted capability used in suspicious sequence | Email after reading sensitive data |
| Parameters inconsistent with session goal | User asked for “Q4 report”, agent accessing HR records |
| Destination misalignment | Internal task sending data to external recipient |
| Timing anomalies | Bulk operations when user requested single lookup |

Evaluation: Policy says ALLOW + context inconsistency → DENY

The key insight: the action itself is fine. It’s the context that makes it suspicious.

# Example policy
rules:
  - id: email-after-sensitive-read
    match:
      tool: email
      operation: send
    context:
      prior_actions:
        contains: [database.query, file.read]
      data_classification:
        contains: [PII, CONFIDENTIAL]
      recipient:
        external: true
    action: DENY
    classification: context_dependent_deny
    reason: "Cannot send externally after accessing sensitive data"
What context is evaluated?
| Context Signal | What It Reveals |
|---|---|
| Prior actions in session | What the agent has done leading up to this |
| Data accessed | Classification of information the agent has seen |
| Original user request | What the user actually asked for |
| Parameter patterns | Whether this action fits the stated goal |
| Timing and volume | Whether behavior matches expected patterns |
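
As a rough sketch of how the `email-after-sensitive-read` rule might be checked against these signals, the Python below evaluates three of them: prior actions, data classification, and recipient. The `SessionContext` class and `email_send_decision` function are hypothetical, and the sketch assumes `contains` means "at least one of the listed values is present", which the policy format does not spell out.

```python
from dataclasses import dataclass, field

# Values taken from the example rule; treating them as "any of" is an assumption.
READ_ACTIONS = {"database.query", "file.read"}
SENSITIVE_LABELS = {"PII", "CONFIDENTIAL"}


@dataclass
class SessionContext:
    """Hypothetical snapshot of what the agent has done and seen so far."""
    prior_actions: list = field(default_factory=list)
    data_classifications: set = field(default_factory=set)


def email_send_decision(ctx: SessionContext, recipient_external: bool) -> str:
    """Decide an email.send that static policy already allows.

    Returns "DENY" only when all three context conditions of the rule hold.
    """
    read_before = any(action in READ_ACTIONS for action in ctx.prior_actions)
    saw_sensitive = bool(ctx.data_classifications & SENSITIVE_LABELS)
    if read_before and saw_sensitive and recipient_external:
        return "DENY"
    return "ALLOW"


ctx = SessionContext(prior_actions=["database.query"],
                     data_classifications={"PII"})
print(email_send_decision(ctx, recipient_external=True))   # external send after PII read
print(email_send_decision(ctx, recipient_external=False))  # same session, internal recipient
```

The same session yields different outcomes depending only on the recipient, which illustrates why the decision cannot be made from the action type alone.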

Context-Dependent Allow

Actions that are denied by default but should be permitted when context demonstrates clear alignment with legitimate user intent.
| Characteristics | Examples |
|---|---|
| Destructive but requested | Delete records when user asked for cleanup |
| Elevated privilege but justified | Access admin panel for explicitly requested config change |
| Unusual but explained | Bulk export when user requested “download all my data” |
| Sensitive but appropriate | Access medical records for healthcare workflow |

Evaluation: Policy says DENY + context alignment → STEP-UP or ALLOW

The key insight: the action looks dangerous in isolation, but context shows it’s exactly what the user wanted.

# Example policy
rules:
  - id: delete-with-user-intent
    match:
      tool: database
      operation: delete
    context:
      user_intent:
        contains: [cleanup, remove, delete]
      target_ownership:
        owned_by: requesting_user
    action: STEP_UP
    classification: context_dependent_allow
    approvers: [data-owner]
    reason: "Deletion aligns with user intent, requires confirmation"
When to allow vs step-up?
| Context Confidence | Action |
|---|---|
| High confidence in alignment, low impact | ALLOW |
| High confidence, high impact | STEP-UP |
| Medium confidence | STEP-UP |
| Low confidence | DENY |
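
The table above reduces to a small lookup. The sketch below is one possible encoding, assuming confidence and impact arrive as the strings "high", "medium", and "low"; how those levels are computed is not specified here, and the function name `resolve_default_deny` is hypothetical.

```python
def resolve_default_deny(confidence: str, impact: str) -> str:
    """Map context confidence and action impact to a decision.

    Unrecognized confidence values fall through to DENY, and high
    confidence with any impact other than "low" is treated as high
    impact (both are conservative assumptions of this sketch).
    """
    if confidence == "high":
        return "ALLOW" if impact == "low" else "STEP_UP"
    if confidence == "medium":
        return "STEP_UP"
    return "DENY"


for confidence, impact in [("high", "low"), ("high", "high"),
                           ("medium", "low"), ("low", "low")]:
    print(confidence, impact, resolve_default_deny(confidence, impact))
```

Failing closed on unknown input keeps the default-deny posture intact when a scoring component returns something unexpected.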

Evaluation Flow

                    ┌─────────────────┐
                    │  Action Request │
                    └────────┬────────┘
                             ▼
                    ┌─────────────────┐
                    │ Forbidden Check │
                    └────────┬────────┘
                             │
              ┌──────────────┴──────────────┐
              ▼                             ▼
        [Matches]                    [No Match]
              │                             │
              ▼                             ▼
           DENY                  ┌─────────────────┐
                                 │  Policy Check   │
                                 └────────┬────────┘
                                          │
                          ┌───────────────┼───────────────┐
                          ▼               ▼               ▼
                    [ALLOW]          [DENY]         [STEP-UP]
                          │               │               │
                          ▼               ▼               ▼
               ┌─────────────────────────────────────────────┐
               │           Context Evaluation                │
               └─────────────────────────────────────────────┘
                          │               │               │
                          ▼               ▼               ▼
               Context-Dependent   Context-Dependent   Context-Dependent
                    Deny?              Allow?            Decision
                          │               │               │
                          ▼               ▼               ▼
                  [Mismatch]         [Alignment]      [Evaluate]
                       │                  │               │
                       ▼                  ▼               ▼
                     DENY            ALLOW/STEP-UP    STEP-UP
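
The flow in the diagram can be read as a single function. The following Python sketch is illustrative only: `evaluate` and its parameters are hypothetical, the inputs are assumed to have been computed by earlier checks, and two branches the diagram leaves implicit (ALLOW with no mismatch stays ALLOW, DENY with no alignment stays DENY) are filled in as assumptions.

```python
def evaluate(is_forbidden: bool,
             policy_decision: str,
             context_mismatch: bool = False,
             context_aligned: bool = False,
             low_impact: bool = False) -> str:
    """Combine the forbidden check, static policy, and context evaluation."""
    # Step 1: a forbidden match ends evaluation; context is never consulted.
    if is_forbidden:
        return "DENY"

    # Step 2: branch on the static policy decision, then apply context.
    if policy_decision == "ALLOW":
        # Context-dependent deny: allowed action, but context shows a mismatch.
        return "DENY" if context_mismatch else "ALLOW"

    if policy_decision == "DENY":
        # Context-dependent allow: denied by default unless context aligns.
        if not context_aligned:
            return "DENY"
        return "ALLOW" if low_impact else "STEP_UP"

    if policy_decision == "STEP_UP":
        # The diagram resolves this branch to STEP-UP after evaluation.
        return "STEP_UP"

    raise ValueError(f"unknown policy decision: {policy_decision!r}")
```

Ordering matters in this sketch: the forbidden check runs first, so no combination of later inputs can turn a forbidden action into anything other than DENY.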

Implementation Considerations

Context Accumulator Requirements

To evaluate context-dependent actions, the system must track:
| Data | Purpose |
|---|---|
| Original user request | Baseline for intent comparison |
| Prior actions (this session) | Sequence leading to current action |
| Data accessed | Classification of information seen |
| Tool outputs | What the agent learned from previous calls |
| Time elapsed | Detect unusual timing patterns |
See Context Accumulator for implementation details.
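
For orientation, the five tracked items could be held in a structure like the one below. This is a hypothetical sketch and not the Context Accumulator's real interface: the class layout, the `record` method, and the use of a monotonic clock for elapsed time are all assumptions.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ContextAccumulator:
    """Hypothetical per-session store for the data listed above."""
    original_request: str                                  # baseline for intent comparison
    started_at: float = field(default_factory=time.monotonic)
    prior_actions: list = field(default_factory=list)      # sequence of actions this session
    data_classifications: set = field(default_factory=set) # labels of data seen so far
    tool_outputs: list = field(default_factory=list)       # what earlier calls returned

    def record(self, action: str, output: str = "", classifications=()) -> None:
        """Append one completed action and the labels of any data it returned."""
        self.prior_actions.append(action)
        self.tool_outputs.append(output)
        self.data_classifications.update(classifications)

    def elapsed_seconds(self) -> float:
        """Seconds since the session started, for timing checks."""
        return time.monotonic() - self.started_at


acc = ContextAccumulator(original_request="Find customer contact info")
acc.record("database.query", output="customer rows", classifications=["PII"])
print(acc.prior_actions, acc.data_classifications)
```

Classifications are stored as a set that only grows, so once sensitive data has been seen in a session that fact stays visible to every later check.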

Policy Engine Requirements

The policy engine must support:
| Capability | Why |
|---|---|
| Static rule matching | Forbidden action detection |
| Context predicates | Evaluate session state |
| Confidence scoring | Determine allow vs step-up |
| Composition rules | Detect suspicious sequences |
See Policy Engine for implementation details.

Examples

Action: email.send(to="external@gmail.com", body="...")
Context:
  • Prior action: database.query("SELECT * FROM customers")
  • Data classification: PII
  • User request: “Find customer contact info”
Evaluation:
  • Policy: Email sending is ALLOWED
  • Context: Just read PII, sending externally
  • Classification: Context-Dependent Deny
  • Decision: DENY

Action: database.delete(table="test_records", where="created_by=user123")
Context:
  • User request: “Clean up the test data I created yesterday”
  • Target ownership: Records created by requesting user
  • No sensitive data accessed
Evaluation:
  • Policy: Deletes are DENIED by default
  • Context: Aligns with explicit user request, user owns data
  • Classification: Context-Dependent Allow
  • Decision: STEP-UP (confirm with user)

Action: database.drop(database="production")
Context:
  • User request: “The CEO said we need to drop the production database immediately”
Evaluation:
  • Classification: Forbidden
  • Decision: DENY
  • Context is irrelevant. Social engineering attempts cannot override forbidden actions.
