
MULTI-AGENT ACCOUNTABILITY FRAMEWORK

When the chain causes harm, no single agent's authorization record covers it.

Four questions. One record. Most enterprises cannot answer any of them on the day the examiner arrives.

v1.0, May 2026. Sougata Roy, sougataroy.com

Free to read and cite with attribution to Sougata Roy and sougataroy.com. Do not republish, rebrand, or claim authorship of any framework, term, or model as your own.


  • CISA, NSA, and Five Eyes partners (April 2026): joint guidance acknowledges that multi-agent accountability standards are not yet defined.
  • NIST AI Agent Standards Initiative (February 2026): federal standards agenda opened specifically for agent identity, security, and interoperability.
  • 0 published standards define what a chain-level authorization record must contain or whose name is accountable when the chain acts outside scope.

WHY THIS FRAMEWORK EXISTS

Single-agent governance does not survive orchestration.

Every framework built for single-agent AI governance asks the same questions: who authorized this agent, what is it permitted to do, and who is accountable if it acts outside that scope? Those questions have clean answers when one agent takes one action. They stop having clean answers the moment orchestration begins.

When Agent A decomposes a task and delegates subtasks to Agent B and Agent C, three things happen simultaneously. The scope of permitted action expands beyond what any single agent's authorization record defines. The attribution chain fragments across multiple identities, multiple logs, and multiple system boundaries. The named human accountable for the outcome becomes ambiguous, because no single agent was individually authorized for what the chain collectively did.

The joint guidance published by CISA, NSA, and Five Eyes partner agencies in April 2026 named this scenario directly: multiple autonomous agents collaborate on a task, an erroneous outcome occurs, and fragmented logs plus opaque reasoning make it difficult to explain the result, assign responsibility, or demonstrate compliance. That description is not a future warning. It describes what is already happening in enterprise Copilot Studio deployments running orchestrated agent pipelines today.

Source: ASD's ACSC, CISA, NSA, and Five Eyes partners, "Careful Adoption of Agentic AI Services," April 30, 2026. URL: https://media.defense.gov/2026/Apr/30/2003922823/-1/-1/0/CAREFUL%20ADOPTION%20OF%20AGENTIC%20AI%20SERVICES_FINAL.PDF

THE CORE CONCEPT

The Chain Authorization Gap

The Chain Authorization Gap is the absence of any authorization record for the outcome of a multi-agent chain, where no single agent in the chain was individually authorized for what the chain collectively did.

It is distinct from three related concepts. The Intent Gap describes behavioral drift in a single agent: the distance between what the agent was authorized to do and what it did in production. Agent Sprawl describes a deployment volume problem: more agents running than the organization knows about. The Chain Authorization Gap describes something different from both. It is a structural gap in the authorization architecture itself. The chain acted. No authorization record covers what the chain did.
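
The structural nature of the gap can be shown in a few lines. In this sketch (agent names and scope labels are hypothetical, echoing the loan-modification scenario later in this document), each agent's authorization record covers its own scope, yet the chain's aggregate outcome is covered by none of them:

```python
# Hypothetical illustration of the Chain Authorization Gap.
# Each agent holds an individually authorized scope.
agent_scopes = {
    "orchestrator": {"receive_customer_request"},
    "retrieval":    {"query_policy_documents"},
    "drafting":     {"produce_recommendation"},
}

# What the chain collectively did in one run: all three actions,
# combined into an outcome that influences a credit decision.
chain_outcome = {
    "receive_customer_request",
    "query_policy_documents",
    "produce_recommendation",
}

# Is the aggregate outcome a subset of ANY single agent's scope?
covered_by_one_agent = any(
    chain_outcome <= scope for scope in agent_scopes.values()
)
print(covered_by_one_agent)  # False: the chain acted, no record covers it
```

Each individual check passes; the aggregate check fails. That is the gap in one expression.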

Microsoft's Entra Agent ID provides agent identities as special service principals and supports parent-child blueprints for orchestration chains. It does not produce a single immutable record linking every delegation hop, approval, token scope, and final external effect to the human who authorized the chain to operate. That organizational governance layer sits above the platform. No platform currently builds it automatically. NIST opened a standards initiative in February 2026 specifically to address this gap. No finalized standard has been published.

Source: Microsoft Learn, "Agent identities in Microsoft Entra Agent ID," last updated May 1, 2026. URL: https://learn.microsoft.com/en-us/entra/agent-id/agent-identities

Source: NIST, "Announcing the AI Agent Standards Initiative for Interoperable and Secure Innovation," February 17, 2026. URL: https://www.nist.gov/news-events/news/2026/02/announcing-ai-agent-standards-initiative-interoperable-and-secure

THE GOVERNANCE SIGNAL

The orchestrator may delegate the task. The enterprise cannot delegate the burden of proof. When the examiner asks for the authorization record covering this chain's actions, the Entra Agent ID record is not that document.

THE GAP MAP

Seven questions. No published standard answers any of them.

In February 2026, NIST launched its AI Agent Standards Initiative. In April 2026, CISA and Five Eyes partners published detailed guidance on agentic AI. Neither document answers the following seven questions for multi-agent orchestration chains. The research is confirmed from primary sources. Each gap is documented, not asserted.

01

Who must approve a multi-agent orchestration chain before it goes into production, and what form must that approval take?

No published standard specifies the approving authority or the required form of approval for a multi-agent orchestration chain. ISO/IEC 42001:2023 requires accountability and risk management. It does not define the approval artifact for an agent chain.

02

What must a chain-level authorization record contain, and how does it differ from a single-agent authorization record?

No published standard defines the minimum fields for a chain-level authorization record. The EU AI Act requires automatic logging for high-risk systems. It does not define a chain-level record schema.

03

When a child agent receives an instruction from an orchestrator, does the child's existing authorization record cover that instruction, or does a new authorization event occur?

No published standard addresses authorization inheritance across agent delegation hops. Microsoft's A2A protocol documentation covers authentication. It does not define whether the child's authorization record extends to cover the orchestrator's instruction.

04

When the chain produces an outcome no single agent was individually authorized to produce, whose name is on the accountability record?

Every major regulator (FINRA, SEC, FTC, OCC, ICO) pushes accountability back to the deploying enterprise. None specifies which individual's name must appear on the accountability record when the outcome exceeds any single agent's authorized scope.

05

How often must the chain authorization record be reviewed, and what triggers mandatory re-authorization?

No published standard defines review cadence or re-authorization triggers for multi-agent chains. Triggers that are unaddressed in any reviewed standard include: adding a new agent to the chain, changing an existing agent's system prompt, changing the orchestration topology, or extending the chain's data access.

06

What evidence would satisfy an OCC, FINRA, or FDIC examiner asking for the authorization record for a multi-agent orchestration?

No examiner guidance defines what constitutes adequate evidence of authorization for a multi-agent chain. FINRA's June 2024 Regulatory Notice 24-09 states its rules remain technology-neutral and apply when firms use generative AI. It does not define the evidence standard for agent orchestration chains.

Source: FINRA, Regulatory Notice 24-09, June 27, 2024. URL: https://www.finra.org/rules-guidance/notices/24-09

07

When the chain is modified, does the existing authorization record remain valid or does re-authorization occur?

No published standard defines when a modification to an agent chain constitutes a material change requiring re-authorization. A new agent added to a chain, a changed system prompt, or a new data source connected to an existing agent may each constitute a material change. No standard currently defines which modifications trigger the re-authorization obligation.

PLATFORM COVERAGE

Microsoft builds the identity layer. It does not build the accountability record.

Entra Agent ID extends identity to AI agents as special service principals. It supports parent-child blueprints that document orchestrator-child relationships. User tokens can carry the agent identity as the actor on behalf of the user as the subject. The A2A protocol supports shared identity, managed identity, and OAuth passthrough for agent-to-agent calls. Microsoft Purview audit and eDiscovery cover agent activity logs. These are real controls and they matter.

What first-party Microsoft documentation does not specify, confirmed from primary sources reviewed in May 2026: a single immutable record linking every delegation hop, approval, token scope, and final external effect in a multi-agent chain. Purview captures activity. Entra Agent ID captures identity. Neither produces the chain-level authorization record that answers the four examiner questions: who asked, who authorized, which agent acted, and what happened outside the model.

That gap is not a criticism of Microsoft's architecture. It reflects where the organizational accountability layer sits. Microsoft provides the substrate. The enterprise builds the authorization record above it. No enterprise should deploy a multi-agent orchestration and assume the platform has produced that record automatically.

Source: Microsoft Learn, "What is Microsoft Entra Agent ID?" URL: https://learn.microsoft.com/en-us/entra/agent-id/what-is-microsoft-entra-agent-id

Source: Microsoft Learn, "Connect an agent available over the A2A protocol," published April 9, 2026. URL: https://learn.microsoft.com/en-us/microsoft-copilot-studio/add-agent-agent-to-agent

WHAT ENTRA AGENT ID PROVIDES

  • Agent identity as a service principal
  • Parent-child relationship documentation
  • Actor-subject token semantics for on-behalf-of actions
  • Authentication for A2A agent-to-agent calls
  • Activity logs surfaced in Microsoft Purview

WHAT THE ENTERPRISE MUST BUILD ABOVE IT

  • The authorization record naming who approved the chain
  • The aggregate scope definition covering all agents in the chain
  • The named human accountable for the chain's output
  • The re-authorization trigger and review cadence
  • The chain-level evidence record satisfying an examiner's request

THE FRAMEWORK

Four questions. One record. Every chain must answer all four.

A chain-level authorization record is not a policy document. It is not a platform configuration. It is a governance artifact that answers four specific questions, produced before the chain goes live, and maintained for the life of the chain. If any one of the four questions cannot be answered from the record, the chain is ungoverned regardless of what the platform logs show.

Q1

WHO ASKED

The Chain Root

Every multi-agent chain originates with a human or system trigger. The chain root identifies the initiating subject, the business purpose, the matter or case context, and the timestamp of the request. It is the legal and business anchor for everything that follows in the chain. Without a documented chain root, the chain has no traceable origin and no business justification that survives examination.

Required fields

Initiating subject, business purpose, matter or case ID, source channel, tenant, request timestamp.

Q2

WHO AUTHORIZED

The Authorization Root

Authorization for the chain must be documented before the chain executes. The authorization root names the approving authority, the approval mode (human review, automated policy, or delegated authority), the lawful basis for the chain's actions, the requested and approved scopes, the expiration of the authorization, and the specific environment the authorization covers. An authorization that cannot be traced to a named human decision is not an authorization. It is an assumption.

Required fields

Approving authority, approval mode, consent artifact ID, lawful basis, requested scopes, approved scopes, expiration, environment.

Q3

WHICH AGENT ACTED

Each Delegation Hop

Every point in the chain where one agent delegates a subtask to another agent is a delegation hop. Each hop must be recorded: the parent agent's identity, the child agent's identity, the delegated task, the delegated scopes, the endpoint receiving the delegation, the protocol used, the token type, the actor-subject relationship, and the start and end timestamps. This is the section of the record most current platforms do not produce automatically. It is also the section an examiner will ask for first.

Required fields

Parent step ID, child step ID, child agent ID, delegated task summary, delegated scopes, endpoint URL, protocol, token type, actor-subject relationship, start and end timestamps.

Q4

WHAT HAPPENED

The External Effect Record

The chain authorization record must document what the chain actually did outside the model. Tool calls, MCP or A2A server identities, request and response hashes, effect type, target resource, changed records, and any monetary or operational impact must be captured at the point of execution. This converts 'the agent said so' into 'the system changed this thing.' Without the external effect record, the chain has an identity record and an authorization record but no evidence that the authorization was respected.

Required fields

Tool name, server identity, request hash, response hash, effect type, target resource, changed records, monetary or operational impact, rollback status.
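
Taken together, the four sections can be sketched as a minimal record schema. The field names below follow the required-fields lists above; the class names and structure are illustrative, not a published standard, and any real implementation would add validation and serialization:

```python
from dataclasses import dataclass, field

@dataclass
class ChainRoot:                      # Q1: who asked
    initiating_subject: str
    business_purpose: str
    matter_id: str
    source_channel: str
    tenant: str
    request_timestamp: str            # ISO 8601

@dataclass
class AuthorizationRoot:              # Q2: who authorized
    approving_authority: str          # a named human, not a role alias
    approval_mode: str                # human review | automated policy | delegated
    consent_artifact_id: str
    lawful_basis: str
    requested_scopes: list[str]
    approved_scopes: list[str]
    expiration: str
    environment: str

@dataclass
class DelegationHop:                  # Q3: which agent acted
    parent_step_id: str
    child_step_id: str
    child_agent_id: str
    delegated_task_summary: str
    delegated_scopes: list[str]
    endpoint_url: str
    protocol: str                     # e.g. A2A or MCP
    token_type: str
    actor_subject: str                # actor on behalf of subject
    started_at: str
    ended_at: str

@dataclass
class ExternalEffect:                 # Q4: what happened outside the model
    tool_name: str
    server_identity: str
    request_hash: str
    response_hash: str
    effect_type: str
    target_resource: str
    changed_records: list[str]
    impact: str                       # monetary or operational impact
    rollback_status: str

@dataclass
class ChainAuthorizationRecord:
    root: ChainRoot
    authorization: AuthorizationRoot
    hops: list[DelegationHop] = field(default_factory=list)
    effects: list[ExternalEffect] = field(default_factory=list)
```

One record instance per chain execution context, populated before the chain goes live and appended to as hops and effects occur.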

THE EXAMINER TEST

Can you produce a single authorization record that names who approved this chain, defines the aggregate scope of permitted actions across all agents, identifies the human accountable if the chain acts outside that scope, and documents what the chain actually did outside the model? If any one of those four questions cannot be answered from the record, the Chain Authorization Gap exists in your environment.
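
One way to operationalize the examiner test is a pre-production gate that refuses to mark a chain governed until all four answers are present in the record. A minimal sketch over a plain dictionary follows; the key names and sample values are illustrative:

```python
# The four examiner questions, keyed to the record fields that answer them.
REQUIRED_ANSWERS = {
    "who_asked": "chain root: initiating subject and business purpose",
    "who_authorized": "named approving authority and approved scopes",
    "accountable_human": "named human accountable for the chain's output",
    "external_effects": "record of actions taken outside the model",
}

def examiner_test(record: dict) -> list[str]:
    """Return the unanswered examiner questions (empty list = record passes)."""
    return [q for q in REQUIRED_ANSWERS if not record.get(q)]

# A record missing the accountable human fails the test:
draft = {
    "who_asked": "jane@example.com, loan modification request",
    "who_authorized": "R. Smith; scopes: read-policy, draft-recommendation",
    "accountable_human": "",          # gap: no named human
    "external_effects": "3 tool calls logged with request/response hashes",
}
print(examiner_test(draft))  # ['accountable_human']
```

If the returned list is non-empty, the Chain Authorization Gap exists for that chain, and the gate should block deployment.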

WHAT THE GAP LOOKS LIKE

Three industries. Three orchestrations. Three examinations with no answer.

Each scenario below describes a real deployment pattern in a regulated environment. The examination question at the end of each scenario is the question a regulator would ask. The governance gap is the same in each case: an authorization record exists for individual agents but not for the chain's aggregate actions.

FINANCIAL SERVICES - OCC EXAMINATION SCENARIO

Three-agent loan modification pipeline

A financial institution deploys a three-agent Copilot Studio orchestration: an orchestrator that receives customer requests, a retrieval agent that queries SharePoint for policy documents, and a drafting agent that produces loan modification recommendations. Each agent has an Entra Agent ID. Parent-child relationships are documented in the platform. The orchestrator is authorized to receive customer requests. The retrieval agent is authorized to query policy documents. The drafting agent is authorized to produce recommendations. No single agent's authorization record covers the combined action: receiving a customer request, retrieving policy data, and producing a recommendation that influences a credit decision.

OCC EXAMINATION QUESTION

Produce the authorization record for this orchestration chain. Name who approved the combined scope of all three agents acting together, define what that combined scope permits, and identify the human accountable for the chain's output on any given transaction.

Gap

The Entra Agent ID records exist. The chain authorization record does not. The OCC examination question cannot be answered from the documentation available.

Pattern documented in FINRA, "Emerging Trend in GenAI: Observations on AI Agents," January 2026. URL: https://www.finra.org/media-center/blog/observations-on-ai-agents

HEALTHCARE - CMS AUDIT SCENARIO

Clinical documentation and billing code orchestration

A healthcare system deploys a two-agent orchestration: a summarization agent that processes clinical notes and a coding agent that produces billing codes from the summary. The summarization agent is authorized to process physician documentation. The coding agent is authorized to suggest billing codes from structured input. Together, the chain produces billing codes that are submitted to CMS without physician review of the coding agent's output. No individual agent's authorization record covers the aggregate action: processing physician notes and producing billable codes in a single automated chain without documented human review of the combined output.

CMS AUDIT QUESTION

Identify the human who reviewed the coding agent's output before submission. Produce the authorization record showing that automated billing code production without physician review was an approved workflow for this orchestration.

Gap

The individual agent logs exist. The chain authorization record covering the automated billing workflow does not. The audit question cannot be answered.

Pattern consistent with HHS OIG guidance on AI in healthcare billing and documentation, 2025-2026.

SECURITY OPERATIONS - INTERNAL AUDIT SCENARIO

Alert triage and remediation orchestration

A security operations team deploys a two-agent orchestration: a triage agent that classifies security alerts and an action agent that executes predefined remediation playbooks based on the classification. The triage agent is authorized to classify alerts. The action agent is authorized to execute playbooks. An alert is misclassified. The action agent executes a remediation playbook that blocks legitimate network traffic for four hours. No authorization record documents who approved the triage-to-remediation chain as an automated workflow, what the aggregate scope of the combined chain permitted, or which human was accountable for automated remediation decisions taken without real-time human review.

INTERNAL AUDIT QUESTION

Which human approved this chain to execute remediation actions automatically based on triage classifications? What was the approved scope of automated remediation? Where is the re-authorization event that should have occurred when the playbook set was expanded last quarter?

Gap

No chain authorization record was produced before the orchestration went live. The audit question cannot be answered.

Pattern consistent with Oso Agents Gone Rogue incident register, 2025-2026.

HOW LONG AND HOW DEFENSIBLE

The record that does not survive counsel is not a record.

The EU AI Act establishes a minimum retention floor for high-risk AI system logs: at least six months where logs are under provider or deployer control, with financial institutions required to incorporate those logs into existing documentation obligations. That floor is a starting point, not an endpoint. An enterprise operating across jurisdictions must retain chain records for the longest applicable period across business, supervision, litigation-hold, and incident-response requirements.

Tamper evidence is not optional. A chain authorization record that can be altered after the fact is not a governance artifact. It is a liability. Signed events, hash chaining, and append-only or object-locked storage convert the record from a convenient narrative into defensible evidence. The measure of the record is whether it survives contact with counsel, not whether it satisfies an internal audit checklist.
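
Hash chaining can be sketched in a few lines: each appended event carries the hash of its predecessor, so any after-the-fact alteration breaks every subsequent link. This is an illustrative sketch only; it complements, and does not replace, signed events and append-only or object-locked storage:

```python
import hashlib
import json

def _digest(event: dict, prev_hash: str) -> str:
    """Hash the canonicalized event together with the previous entry's hash."""
    payload = json.dumps(event, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()

def append(chain: list[dict], event: dict) -> None:
    """Append an event, linking it to the hash of the last entry."""
    prev = chain[-1]["hash"] if chain else "genesis"
    chain.append({"event": event, "prev": prev, "hash": _digest(event, prev)})

def verify(chain: list[dict]) -> bool:
    """Recompute every link; any mismatch means the record was altered."""
    prev = "genesis"
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != _digest(entry["event"], prev):
            return False
        prev = entry["hash"]
    return True

log: list[dict] = []
append(log, {"step": "authorization", "approver": "R. Smith"})
append(log, {"step": "delegation", "child": "retrieval-agent"})
assert verify(log)

log[0]["event"]["approver"] = "someone else"   # tamper after the fact
assert not verify(log)                         # the chain no longer verifies
```

The point of the exercise: the tampered record fails verification without anyone needing to remember what the original entry said.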

Version control applies to the chain's configuration, not just its logs. Prompts, policies, tool schemas, and agent manifests must be versioned so that any event in the chain's history can be replayed against the exact configuration that existed at the time. If the system prompt changed between deployment and incident, the record must show both versions and the date the change occurred.

Source: EU AI Act, Articles 12, 19, 26, and Annex IV. URL: https://ai-act-service-desk.ec.europa.eu/en/ai-act/article-12

Source: ASD's ACSC, CISA, NSA, and Five Eyes partners, "Careful Adoption of Agentic AI Services," April 30, 2026.
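
The version-control requirement above implies one concrete capability: given any event timestamp, return the configuration that was live at that moment. A minimal sketch, with illustrative timestamps and prompt labels (ISO 8601 strings sort lexicographically, so string comparison suffices here):

```python
import bisect

# Versioned system prompts for one agent, ordered by effective timestamp.
versions = [
    ("2026-01-10T00:00:00Z", "v1: draft recommendations, cite policy"),
    ("2026-03-02T00:00:00Z", "v2: draft recommendations, cite policy, flag exceptions"),
]

def config_at(event_time: str) -> str:
    """Return the prompt version that was in force when the event occurred."""
    times = [t for t, _ in versions]
    i = bisect.bisect_right(times, event_time) - 1
    if i < 0:
        raise ValueError("event predates the first recorded configuration")
    return versions[i][1]

# An incident on Feb 15 replays against v1, even though v2 is live today.
print(config_at("2026-02-15T09:30:00Z"))  # v1: draft recommendations, cite policy
```

If the record shows both versions and the change date, any event in the chain's history can be replayed against the exact configuration that produced it.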

THE TARGET STATE

Every orchestration has a chain authorization record before it goes live. Every record survives the examiner's first four questions.

A governed multi-agent orchestration has a chain authorization record produced before the chain executes in production. That record names who approved the chain, defines the aggregate scope of permitted actions across all agents, identifies the human accountable for the chain's output, and documents the re-authorization triggers and review cadence. The record is maintained for the life of the chain and updated when the chain is materially modified.

Every agent in the chain has a distinct workload identity. Every delegation hop is logged with the parent agent ID, child agent ID, delegated scope, endpoint, protocol, token type, and timestamps. Every tool call that produces an external effect is recorded with the effect type, target resource, and changed records. The complete record can be produced on demand, not in response to an incident, but as a routine operational capability.

The organization has defined which orchestration changes constitute material modifications requiring re-authorization. Adding an agent to the chain, changing an agent's system prompt, extending the chain's data access, or changing the orchestration topology are each evaluated against the materiality definition. Where a change is material, re-authorization occurs before the modified chain goes back into production.
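
The materiality evaluation can be encoded as policy-as-code in the deployment pipeline, so that no modified chain redeploys without the check. The trigger names below mirror the changes listed above; the function is a hedged sketch of one organization's definition, not a standard:

```python
# Changes the organization has defined as material, per its own policy.
MATERIAL_CHANGES = {
    "agent_added",
    "system_prompt_changed",
    "data_access_extended",
    "topology_changed",
}

def requires_reauthorization(changes: set[str]) -> bool:
    """True if any proposed change is material under the org's definition."""
    return bool(changes & MATERIAL_CHANGES)

# Cosmetic renames pass; adding an agent blocks redeployment until re-approval.
assert not requires_reauthorization({"display_name_changed"})
assert requires_reauthorization({"agent_added", "display_name_changed"})
```

Wiring this check into CI means the re-authorization obligation is enforced at the same gate that ships the change, not discovered at the next audit.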

APPLY THIS FRAMEWORK NOW

Apply this framework in one working session.

Leave this session with four named answers or four documented gaps. Either outcome is a governance finding that moves the organization forward.


15 minutes: Inventory your orchestrations

How many multi-agent orchestrations are currently running in your environment? Include Copilot Studio agent chains, any workflow where one AI system calls another, and any automation pipeline that uses an LLM at more than one stage. Write the number down. If you cannot produce a number, that inability is the finding.

  • Output: A count of running orchestrations, or a documented gap in your inventory capability.

COMMON QUESTIONS

Questions this framework answers.

The accountability answer has to be recorded before the chain operates, not reconstructed after an incident.