
Agent Governance Toolkit (Microsoft)

AARM Extended

Open-source runtime governance for autonomous AI agents

github.com

Overview

The Agent Governance Toolkit (AGT) is an open-source runtime governance layer for autonomous AI agents that provides policy enforcement, execution rings, and a tamper-evident audit chain. It intercepts every tool call before execution, evaluates it with a Cedar policy backend against the accumulated session context and intent, enforces one of five decisions (ALLOW, DENY, MODIFY, STEP_UP, DEFER), and records a Merkle-chained, offline-verifiable audit trail. AGT is used in production at Microsoft and by adopters including Dayos and Provedit.
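The intercept, evaluate, enforce flow described above can be sketched roughly as follows. This is a minimal illustration, not AGT's actual API: `Decision`, `Verdict`, `governed_call`, `example_policy`, and the sample tool are hypothetical names, the policy is a plain Python function rather than a Cedar backend, and treating STEP_UP and DEFER as "do not execute now" is an assumption about their semantics.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, Optional


class Decision(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    MODIFY = "MODIFY"
    STEP_UP = "STEP_UP"
    DEFER = "DEFER"


@dataclass
class Verdict:
    decision: Decision
    params: Optional[Dict[str, Any]] = None  # rewritten parameters, used by MODIFY
    reason: str = ""


class ToolCallBlocked(Exception):
    """Raised when the call is not executed (DENY, STEP_UP, or DEFER)."""

    def __init__(self, verdict: Verdict) -> None:
        super().__init__(f"{verdict.decision.value}: {verdict.reason}")
        self.verdict = verdict


Policy = Callable[[str, Dict[str, Any]], Verdict]


def governed_call(tool: Callable[..., Any], name: str,
                  params: Dict[str, Any], policy: Policy) -> Any:
    """Evaluate the policy first; the tool runs only on ALLOW or MODIFY."""
    verdict = policy(name, params)
    if verdict.decision is Decision.ALLOW:
        return tool(**params)
    if verdict.decision is Decision.MODIFY:
        rewritten = verdict.params if verdict.params is not None else params
        return tool(**rewritten)
    # Assumption: STEP_UP and DEFER hand control back to the caller,
    # which would seek human approval or retry later.
    raise ToolCallBlocked(verdict)


def example_policy(name: str, params: Dict[str, Any]) -> Verdict:
    """Hypothetical rules, for illustration only."""
    if name == "delete_file":
        return Verdict(Decision.DENY, reason="destructive tool")
    if name == "wire_transfer":
        return Verdict(Decision.STEP_UP, reason="needs human approval")
    if name == "send_email" and len(params.get("to", [])) > 5:
        capped = {**params, "to": params["to"][:5]}
        return Verdict(Decision.MODIFY, capped, "recipient list capped at 5")
    return Verdict(Decision.ALLOW)


def send_email(to: list, body: str) -> int:
    """Stand-in tool: returns how many recipients it would have mailed."""
    return len(to)
```

Calling `governed_call(send_email, "send_email", {"to": [...], "body": "hi"}, example_policy)` with seven recipients would run the tool with the list trimmed to five, while a `delete_file` call would raise `ToolCallBlocked` without the tool ever running.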

Classification

Coverage surface: MCP, API, Cloud, SaaS
Stage: Launched
Type: Open Source
Target audience: Enterprise, Developers
Deployment: SaaS, Self-hosted, Hybrid

Technical profile

Spec-grounded axes, verified by the TWG.

Interception architecture (R1): SDK Instrumentation, Protocol Gateway
Policy model (R3): Hybrid
Authorization decisions (R4): ALLOW, DENY, MODIFY, STEP_UP, DEFER
Conformance level: Extended (R1–R9)

Conformance review

Specification version: AARM v1.0
Conformance tier: Extended (R1–R9)
Verified by: AARM Conformance Agent
Date: June 14, 2026

R1: Pre-execution interception
R2: Context accumulation
R3: Policy evaluation with intent alignment
R4: Five authorization decisions
R5: Tamper-evident receipts
R6: Identity binding
R7: Semantic distance tracking
R8: Telemetry export
R9: Least privilege enforcement

Platform capabilities

  • PolicyInterceptor intercepts every tool call before execution across all five framework adapters — no bypass paths
  • ExecutionContext accumulates tool calls, outputs, spend, and delegation chain across a session for cumulative-behaviour policies (rate limits, budget caps)
  • Cedar policy backend evaluating tool name, parameters, agent role, and session intent together
  • Five governance decisions: ALLOW, DENY, MODIFY (pre-execution parameter rewrite), STEP_UP (human approval), DEFER
  • Merkle-chained, offline-verifiable audit records (SHA-256 per-entry hash chain)
  • Ed25519 did:mesh identity per agent with single-use-nonce IATP handshake; TEE keystore + liveness attestation for advanced deployments
  • PromptDefense evaluator: prompt-injection, semantic-drift, and goal-misgeneralisation detection pre-policy (OWASP LLM01 / ASI-002)
  • OpenTelemetry decision export with pluggable sinks (OTLP, CloudEvents, Merkle-chain) and an audit-overflow-denies circuit breaker
  • MCP Security Gateway: every MCP tool call governed with ephemeral, least-privilege credentials scoped per invocation
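To make the cumulative-behaviour idea in the ExecutionContext bullet more concrete, here is a small sketch of a session context that accumulates calls and spend, then answers rate-limit and budget-cap checks. `SessionContext` and its fields are hypothetical and do not reflect AGT's real ExecutionContext; the sliding-window rate limit and the DENY-on-violation outcome are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SessionContext:
    """Hypothetical stand-in for an accumulated per-session context."""
    budget_cap: float             # maximum total spend allowed in the session
    max_calls_per_window: int     # rate limit within the sliding window
    window_seconds: float = 60.0
    spend: float = 0.0
    calls: List[Tuple[float, str]] = field(default_factory=list)  # (timestamp, tool)

    def check(self, tool: str, cost: float, now: float) -> str:
        """Decide on a proposed call using everything accumulated so far."""
        recent = [t for t, _ in self.calls if now - t < self.window_seconds]
        if len(recent) >= self.max_calls_per_window:
            return "DENY"   # rate limit reached within the window
        if self.spend + cost > self.budget_cap:
            return "DENY"   # this call would exceed the session budget
        return "ALLOW"

    def record(self, tool: str, cost: float, now: float) -> None:
        """Accumulate a call that was actually executed."""
        self.calls.append((now, tool))
        self.spend += cost
```

Timestamps are passed in explicitly so the behaviour is deterministic; a real deployment would presumably read a clock. The point of the design is that no single call is judged in isolation: a call that is harmless on its own can still be refused because of what the session has already done.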

Architecture

This review was conducted by the AARM Conformance Agent and completed on June 14, 2026. The Agent Governance Toolkit satisfies all nine AARM requirements (R1–R6 core and R7–R9 extended), qualifying for AARM Extended.

  • Interception (R1): Every tool call is intercepted by the PolicyInterceptor before execution; all five framework adapters route through it, and bypass paths are forbidden by spec.
  • Context (R2): An ExecutionContext accumulates tool calls, outputs, spend, and the delegation chain across a session, and policy rules use it for cumulative-behaviour enforcement such as rate limits and budget caps.
  • Policy & intent alignment (R3): A Cedar backend evaluates the tool name, parameters, agent role, and session intent together, and the PromptDefense evaluator detects intent-action semantic drift.
  • Decisions (R4): Exactly five decisions: ALLOW, DENY, MODIFY, STEP_UP, DEFER. MODIFY rewrites parameters before execution; STEP_UP halts for human approval.
  • Receipts (R5): Audit records are Merkle-chained, with each entry hashing its predecessor (SHA-256), and are offline-verifiable.
  • Identity (R6): Each agent holds an Ed25519 did:mesh identity whose private key never leaves the agent process; the IATP handshake uses a single-use nonce under a 200 ms SLO, with a TEE keystore and liveness attestation for advanced deployments.
  • Drift tracking (R7): The PromptDefense evaluator runs as a pre-policy layer, flagging prompt injection, semantic drift, and goal misgeneralisation for DENY/STEP_UP before policy evaluation (aligned to OWASP LLM01 and ASI-002).
  • Telemetry export (R8): Every decision is exported as an OpenTelemetry log event through pluggable sinks (OTLP, CloudEvents, Merkle-chain), with a circuit breaker that denies on audit overflow.
  • Least-privilege (R9): The MCP Security Gateway governs every MCP tool call and issues an ephemeral, minimum-privilege credential scoped per invocation, with no ungoverned paths.
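The receipts property (R5), where each entry hashes its predecessor with SHA-256 and the trail can be verified offline, can be illustrated with a simple linear hash chain. This is a sketch under stated assumptions, not AGT's record format: the entry layout, the all-zero genesis value, canonical JSON serialisation, and the function names `entry_hash`, `append_entry`, and `verify_chain` are all hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: Dict[str, Any]) -> str:
    """SHA-256 over the predecessor's hash plus a canonical JSON encoding."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: List[Dict[str, Any]], record: Dict[str, Any]) -> None:
    """Append a record, binding it to the hash of the previous entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify_chain(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every hash from the start; needs only the chain itself."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Editing any stored record changes its recomputed hash, so `verify_chain` fails at that entry, and the check uses no network or service. A chain like this does not by itself detect entries truncated from the tail; catching that requires the latest hash to be anchored somewhere outside the log.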

Key facts

License: Open source
Adopters: Microsoft, Dayos, Provedit
Conformance: AARM Extended (R1–R9)
Verified: June 14, 2026

Maintained by the Agent Governance Toolkit (Microsoft) team. Conformance verified by the AARM working group.
