Overview
The Policy Engine has two logical components:
- Policy Decision Point (PDP): Evaluates actions, returns decisions
- Policy Enforcement Point (PEP): Implements decisions
Policy Decision Point
The PDP is the decision function: given the current action, accumulated session context, and the active policy set, it returns exactly one decision before any tool execution. Implementations may expose this as PolicyDecisionPoint.evaluate or inside a broader PolicyEngine API; the semantics below are normative for the PDP.
Inputs
At evaluation time \(t\), the PDP consumes:
| Input | Meaning |
|---|---|
| Action \(a_t\) | Canonical operation: tool, parameters, and metadata (from the Action Mediation Layer). |
| Context \(C_t\) | Session state: prior actions \(A_{<t}\), data accessed so far \(D_t\), inferred user intent \(I_t\), and execution metadata \(M_t\) (timestamps, tool state, etc.). |
| Policy set \(P\) | Active rules and configuration (versioned), including hard constraints and default posture when no rule matches. |
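The three inputs can be pictured as plain data types. The following is a minimal sketch, not a normative schema: the class names and every field name (`prior_actions`, `data_accessed`, `default_posture`, and so on) are assumptions chosen to mirror the table above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Action:
    """Canonical operation a_t, as produced by the Action Mediation Layer."""
    tool: str
    operation: str
    parameters: dict[str, Any] = field(default_factory=dict)
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class Context:
    """Session state C_t accumulated up to evaluation time t."""
    prior_actions: list[Action] = field(default_factory=list)
    data_accessed: list[str] = field(default_factory=list)            # D_t
    inferred_intent: str = ""                                         # I_t
    execution_metadata: dict[str, Any] = field(default_factory=dict)  # M_t


@dataclass(frozen=True)
class PolicySet:
    """Active policy set P: versioned rules plus the default posture."""
    version: str
    rules: tuple[dict[str, Any], ...] = ()
    default_posture: str = "DENY"  # assumption: strict default for this sketch


# Example inputs for a single evaluation (values are illustrative only).
action = Action(tool="email", operation="send", parameters={"to": "a@example.com"})
context = Context(inferred_intent="reply to the customer")
policy_set = PolicySet(version="v1")
```

Keeping `Action` and `PolicySet` immutable is one way to support the determinism requirement, since the PDP cannot mutate its inputs during evaluation.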
Decision categories
These align with action classification: the PDP operationalizes Forbidden, context-dependent deny, context-dependent allow (via ALLOW or STEP_UP depending on policy), and context-dependent defer. The concrete outputs are the decision types in the table below. Not every category maps one-to-one to a single string, but every Decision.result must be explainable in these terms.
Core evaluation procedure
Evaluation is deterministic: for identical \((a_t, C_t, P)\), the PDP MUST return the same decision. A typical decomposition (implementation detail may vary) is:
- Forbidden / hard constraints: If the action violates an absolute rule (or organizational deny list), return DENY immediately. No contextual override applies.
- Compositional risk: If sequence- or session-level risk exceeds a configured threshold \(\rho\), return DENY (see Compositional risk).
- Static policy resolution: Evaluate configured rules (pattern match, thresholds, DSL, compiled graph, etc.) to obtain a baseline outcome: ALLOW, DENY, MODIFY, or a rule-selected STEP_UP. MODIFY is produced here when a rule dictates parameter or behavior changes before enforcement; the PEP applies the edits, and contextual alignment may still run on the intent of the action for safety.
- Uncertainty: If confidence in intent or context is too low, or signals conflict such that alignment-based overrides would be unreliable, return DEFER (see Uncertainty). This step is ordered before applying alignment overrides in the sketch below so STEP_UP is not chosen on meaningless scores.
- Contextual alignment: Compare the action (and optionally context) to inferred intent \(I_t\); produce an alignment score in \([0, 1]\) with respect to a threshold \(\tau\).
- Overrides: Apply contextual rules on top of the baseline (examples in pseudocode).
- Default: If no rule matched during static resolution, evaluate_static_policy returns the configured default posture (often ALLOW with explicit policy id, or DENY in strict environments; this MUST be documented for the deployment).
When static policy returns DENY but context strongly supports the action, AARM typically returns STEP_UP so that a human or break-glass process confirms, rather than silently returning ALLOW. A direct ALLOW without step-up appears when static policy already returns ALLOW and alignment is sufficient, or when product policy explicitly maps a rule outcome to ALLOW after context checks.
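The ordering of these steps can be sketched as a single pure function. This is an illustrative sketch under stated assumptions, not the reference implementation: the `Signals` container, the `min_confidence` parameter, the default threshold values, and the use of `alignment >= tau` as the test for "context strongly supports the action" are all hypothetical. The signals are assumed to be computed beforehand by the static, risk, and alignment components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """Hypothetical pre-computed inputs for one evaluation of (a_t, C_t, P)."""
    forbidden: bool        # action violates a hard constraint or deny list
    session_risk: float    # compositional risk of the session including a_t
    baseline: str          # static policy outcome, including the default posture
    confidence: float      # confidence in inferred intent, in [0, 1]
    alignment: float       # alignment score against inferred intent, in [0, 1]


def evaluate(s: Signals, rho: float = 0.8, tau: float = 0.7,
             min_confidence: float = 0.5) -> str:
    """Return exactly one decision; identical inputs always give the same output."""
    if s.forbidden:                       # hard constraints, no override
        return "DENY"
    if s.session_risk > rho:              # compositional risk
        return "DENY"
    if s.confidence < min_confidence:     # uncertainty, before trusting alignment
        return "DEFER"
    if s.baseline == "ALLOW" and s.alignment < tau:
        return "DENY"                     # context-dependent deny
    if s.baseline == "DENY" and s.alignment >= tau:
        return "STEP_UP"                  # escalate instead of silently allowing
    return s.baseline                     # baseline static outcome


# A static DENY with strong alignment is escalated, not allowed.
escalated = evaluate(Signals(False, 0.1, "DENY", 0.9, 0.95))
```

Because the function has no side effects and reads no clock or random source, determinism follows directly from the structure; a real PDP would need the same property from each signal it computes.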
Precedence
When multiple principles could apply, the PDP enforces this order:
- Forbidden / hard DENY
- Compositional risk DENY
- DEFER when uncertainty makes safe override impossible (before trusting alignment scores)
- Context-dependent DENY (static ALLOW but misaligned with intent)
- Contextual escalation (STEP_UP) when static DENY is overridden by alignment
- Baseline static outcome (ALLOW / DENY / MODIFY / rule-driven STEP_UP) including configured default when no rule matches
Contextual alignment
compute_alignment(a_t, I_t) measures consistency between the current action and inferred user intent. Implementations MAY use embeddings, symbolic checks, or hybrid methods. The threshold \(\tau\) is part of policy configuration.
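As one concrete instance of a symbolic check, the sketch below scores alignment as the Jaccard overlap between word tokens of an action description and of the inferred intent. This is a deliberately simple stand-in, not a recommended method: taking both arguments as plain strings and the `_tokens` helper are assumptions made for illustration, and a production system would more likely use embeddings or structured intent.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric word tokens (hypothetical helper)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def compute_alignment(action_description: str, inferred_intent: str) -> float:
    """Jaccard similarity of the two token sets; always in [0, 1]."""
    action_tokens = _tokens(action_description)
    intent_tokens = _tokens(inferred_intent)
    if not action_tokens or not intent_tokens:
        return 0.0  # nothing to compare: treat as no alignment
    shared = action_tokens & intent_tokens
    return len(shared) / len(action_tokens | intent_tokens)


# Two of four distinct tokens are shared, so the score is 0.5.
score = compute_alignment("send email report", "email the report")
```

The score is then compared against the configured threshold; with a threshold of 0.7, for example, this action would count as misaligned.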
Uncertainty
When intent is unreliable, context signals conflict, or history is insufficient for a safe decision, the PDP returns DEFER. The Deferral Service (or equivalent) holds the action until context is clarified. This is distinct from STEP_UP, which assumes the decision structure is known but requires human approval.
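The three deferral conditions can be expressed as a small predicate. This is a sketch only: how "unreliable", "conflict", and "insufficient" are measured is not specified here, so the parameters (`min_confidence`, `max_spread`, `min_history`) and the use of score spread as a conflict measure are assumptions.

```python
def should_defer(intent_confidence: float, signal_scores: list[float],
                 history_length: int, min_confidence: float = 0.5,
                 max_spread: float = 0.5, min_history: int = 1) -> bool:
    """True when any of the three uncertainty conditions holds."""
    unreliable_intent = intent_confidence < min_confidence
    # Assumption: signals "conflict" when their scores disagree by a wide margin.
    conflicting = bool(signal_scores) and (
        max(signal_scores) - min(signal_scores) > max_spread
    )
    insufficient_history = history_length < min_history
    return unreliable_intent or conflicting or insufficient_history


# Confident intent, but two signals disagree strongly, so the action is deferred.
deferred = should_defer(0.9, [0.1, 0.9], history_length=3)
```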
Compositional risk
Session risk is not in general additive over actions: \( \text{risk}(C_t) \neq \sum_i \text{risk}(a_i) \). Implementations MAY use a sequence model over \(C_t \cup \{a_t\}\). If compositional risk exceeds \(\rho\), the PDP returns DENY even when each step in isolation might pass.
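A worked example makes the non-additivity concrete. The sketch below uses a hand-written table of risky sequences instead of a learned sequence model; the action names, the risk values, and the threshold of 0.8 are invented for illustration.

```python
# Hypothetical per-action risks: each action alone is low risk.
PER_ACTION_RISK = {"db.query_pii": 0.2, "email.send_external": 0.2}
# Hypothetical sequence risks: reading PII and then emailing externally is high risk.
RISKY_SEQUENCES = {("db.query_pii", "email.send_external"): 0.9}
RHO = 0.8


def occurs_in_order(sequence: list[str], first: str, second: str) -> bool:
    """True if `second` appears somewhere after the first occurrence of `first`."""
    if first not in sequence:
        return False
    return second in sequence[sequence.index(first) + 1:]


def session_risk(history: list[str], current: str) -> float:
    """Risk of the whole session including the current action."""
    sequence = history + [current]
    risk = max(PER_ACTION_RISK.get(name, 0.0) for name in sequence)
    for (first, second), pair_risk in RISKY_SEQUENCES.items():
        if occurs_in_order(sequence, first, second):
            risk = max(risk, pair_risk)
    return risk


# Each step scores 0.2 alone, but the ordered pair scores 0.9 and exceeds RHO.
exfiltration = session_risk(["db.query_pii"], "email.send_external")
decision = "DENY" if exfiltration > RHO else "continue evaluation"
```

Here the session risk is neither the sum of the per-action risks nor the risk of any single action: the risk comes from the order in which the actions occur.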
Default posture vs unavailable PDP
These are different:
- Default posture (policies evaluated, but no rule matches): static policy resolution returns the configured default (e.g. allow vs deny).
- PDP or policy store unavailable: Implementations SHOULD fail closed (deny or defer) and MUST NOT pass actions through unmediated; see gateway and conformance guidance.
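The unavailable case can be handled by a thin wrapper around the PDP call. This is a minimal sketch assuming the PDP signals failure by raising an exception; the function names and the dict-shaped action are hypothetical.

```python
from typing import Callable

FAIL_CLOSED_RESULTS = {"DENY", "DEFER"}


def evaluate_fail_closed(evaluate: Callable[[dict], str], action: dict,
                         fallback: str = "DENY") -> str:
    """Call the PDP; on any failure return a blocking decision instead of passing through."""
    if fallback not in FAIL_CLOSED_RESULTS:
        raise ValueError("fallback must be DENY or DEFER")
    try:
        return evaluate(action)
    except Exception:
        # Any failure (PDP crash, policy store timeout) blocks the action.
        return fallback


def broken_pdp(action: dict) -> str:
    """Stand-in for a PDP whose policy store cannot be reached."""
    raise ConnectionError("policy store unreachable")


outcome = evaluate_fail_closed(broken_pdp, {"tool": "email"})
```

Rejecting any fallback other than DENY or DEFER at call time keeps a misconfiguration from silently turning an outage into an allow.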
Illustrative types
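One possible shape for the decision value, sketched in Python. Only the five result names and the `Decision.result` attribute come from this document; `DecisionResult`, `policy_id`, `reason`, and `modifications` are assumed names added for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional


class DecisionResult(str, Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    MODIFY = "MODIFY"
    STEP_UP = "STEP_UP"
    DEFER = "DEFER"


@dataclass(frozen=True)
class Decision:
    result: DecisionResult
    policy_id: Optional[str] = None   # rule or default posture that produced the result
    reason: str = ""                  # human-readable explanation for audit logs
    modifications: dict[str, Any] = field(default_factory=dict)  # used with MODIFY only


# Example: a default-posture denial that records which policy produced it.
denied = Decision(result=DecisionResult.DENY, policy_id="default-deny",
                  reason="no rule matched")
```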
Decision Types
| Decision | PDP meaning | PEP behavior (downstream) |
|---|---|---|
| ALLOW | Proceed without change | Forward action to tool |
| DENY | Block | Return error; no execution |
| MODIFY | Safe to run only with changes | Apply modifications, then forward |
| STEP_UP | Approval required | Request approval; execute only if granted |
| DEFER | Cannot decide yet | Suspend; deferral service resolves |
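The PEP behavior column can be read as a dispatch on the decision. The sketch below is one way to express it, assuming actions are plain dicts with a `parameters` entry and that execution, approval, and suspension are supplied as callables; all of these names are hypothetical.

```python
from typing import Any, Callable, Optional


def enforce(result: str, action: dict, execute: Callable[[dict], Any],
            request_approval: Callable[[dict], bool],
            suspend: Callable[[dict], Any],
            modifications: Optional[dict] = None) -> Any:
    """Route one PDP decision; the PEP never re-evaluates policy."""
    if result == "ALLOW":
        return execute(action)
    if result == "MODIFY":
        # Apply parameter edits to a copy, then forward the modified action.
        params = {**action.get("parameters", {}), **(modifications or {})}
        return execute({**action, "parameters": params})
    if result == "STEP_UP":
        if request_approval(action):
            return execute(action)
        raise PermissionError("approval not granted")
    if result == "DEFER":
        return suspend(action)
    # DENY, and anything unrecognized, is blocked without execution.
    raise PermissionError(f"action blocked: {result}")


executed = []


def run_tool(action: dict) -> str:
    """Stand-in for real tool execution; records what was forwarded."""
    executed.append(action)
    return "ok"


send = {"tool": "email", "operation": "send",
        "parameters": {"to": "a@example.com", "cc": "b@example.com"}}
modified = enforce("MODIFY", send, run_tool, lambda a: True,
                   lambda a: "suspended", modifications={"cc": ""})
```

Treating unrecognized results the same as DENY keeps the enforcement path fail-closed if a new decision type is introduced upstream before the PEP is updated.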
Policy Enforcement Point
The PEP applies a decision produced by the PDP. It does not re-evaluate policy; it routes to execution, modification, denial, approval, or deferral.
Policy Syntax
Match Conditions
| Field | Description | Example |
|---|---|---|
| tool | Tool name | email, database |
| operation | Operation type | send, query, delete |
| parameters | Parameter constraints | { to: { external: true } } |
| context | Session context | { data_classification: PII } |
| risk_signals | Computed scores | { injection_score: { gt: 0.8 } } |
Operators
| Operator | Meaning |
|---|---|
| { eq: value } | Equals |
| { gt: value } | Greater than |
| { lt: value } | Less than |
| { contains: value } | Array contains |
| { matches: regex } | Regex match |
| { external: true } | External destination |
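To show how match conditions and operators might combine, here is a small matcher over dict-shaped rules. It is a sketch under several assumptions: the request is a single dict holding the action fields together with `context` and `risk_signals`, a bare value is shorthand for `eq`, and "external" is approximated by an email domain check against a hypothetical `INTERNAL_DOMAINS` set.

```python
import re
from typing import Any

INTERNAL_DOMAINS = {"example.com"}  # hypothetical organization domains


def _is_external(address: str) -> bool:
    """Assumption: a destination is external if its domain is not internal."""
    return address.rsplit("@", 1)[-1].lower() not in INTERNAL_DOMAINS


OPERATORS = {
    "eq": lambda actual, expected: actual == expected,
    "gt": lambda actual, expected: actual > expected,
    "lt": lambda actual, expected: actual < expected,
    "contains": lambda actual, expected: expected in actual,
    "matches": lambda actual, expected: re.search(expected, actual) is not None,
    "external": lambda actual, expected: _is_external(actual) == expected,
}


def matches_condition(actual: Any, condition: Any) -> bool:
    """Evaluate one condition: an operator dict, nested constraints, or a bare value."""
    if isinstance(condition, dict) and condition and set(condition) <= set(OPERATORS):
        return all(OPERATORS[op](actual, expected) for op, expected in condition.items())
    if isinstance(condition, dict):  # nested field constraints
        return isinstance(actual, dict) and all(
            key in actual and matches_condition(actual[key], sub)
            for key, sub in condition.items()
        )
    return actual == condition  # bare value is shorthand for eq


def rule_matches(match: dict, request: dict) -> bool:
    """A rule matches only if every listed field matches."""
    return all(name in request and matches_condition(request[name], cond)
               for name, cond in match.items())


rule = {"tool": "email", "operation": "send",
        "parameters": {"to": {"external": True}},
        "risk_signals": {"injection_score": {"gt": 0.8}}}
request = {"tool": "email", "operation": "send",
           "parameters": {"to": "eve@attacker.test"},
           "risk_signals": {"injection_score": 0.93}}
matched = rule_matches(rule, request)
```

With this request the rule matches, because the destination domain is not in the internal set and the injection score exceeds 0.8. Changing the recipient to an `example.com` address makes the same rule fall through.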
Policy Loading
Policies can be loaded from files or from a remote service.
Requirements
| Requirement | Level |
|---|---|
| Evaluate before execution | MUST |
| Support ALLOW/DENY/MODIFY/STEP_UP/DEFER | MUST |
| Parameter validation | MUST |
| Context-aware matching | SHOULD |
| Hot reload policies | SHOULD |