Your Logs Pass the Audit. Survive the Examiner.

On May 1, 2026, six national cybersecurity agencies across five countries, led by CISA and the NSA, published Careful Adoption of Agentic AI Services, their first joint guidance on agentic AI. The guidance names five categories of risk and classifies accountability as one of them, describing how opaque agent decisions and fragmented logs make it difficult to trace which component caused an erroneous outcome. Its recommended response to that risk is observability: comprehensive logging and unified audit logs across inter-agent interactions. The recurring pattern across regulated AI deployments is that the systems of record receiving those agent actions, platforms such as Salesforce, ServiceNow, and SAP, frequently cannot distinguish an agent write from a human one, express authorization at the level of a business operation, or reconstruct the business state and approval condition at the moment of a write. Logging records that an action occurred. It does not establish that a named organizational decision authorized the specific business state transition, and no audit layer added on top can manufacture an authorization the underlying system was never built to express.

The governance team lead printed the guidance on a Friday afternoon. Twenty-nine pages from six national cyber agencies. She read the accountability section, then opened the Purview portal and pulled up the agent activity logs for the three Copilot Studio workflows her team had deployed since February. The logs were clean. Every interaction captured. Timestamps, actor IDs, tool calls, policy matches.

She marked it reviewed. The question the document had raised, she thought, was answered.

Logging assumes the substrate already knows who acted. That assumption is the gap the guidance did not close.

The guidance tells regulated enterprises to log agent actions comprehensively. The question it does not ask is whether the system of record receiving those actions can produce accountability before any log captures it.

On May 1, 2026, six national cyber agencies published joint guidance on the secure adoption of agentic AI. Under accountability risks, the document is candid: agent decisions are opaque, attribution fragments across logs, reasoning chains resist reconstruction. The remedy its best practices reach for is observability: comprehensive logging and unified audit logs across every inter-agent interaction. The document even concedes the limit, that long reasoning chains and contextual data produce logs too large and too loosely structured to yield meaningful signal.

That prescription is reasonable, as far as it goes. It does not go far enough.

Logging assumes the substrate, the system of record where the action lands, already knows who acted. It assumes the audit trail can reconstruct what business condition authorized the change and whether a named organizational decision covered that specific state transition. Most enterprise systems of record make none of those guarantees. A claims workflow in a healthcare system logs every status change. The audit trail does not distinguish a claims processor who reviewed the case from a clinical documentation agent that inferred a disposition from summarized notes. The field value changed. Whether an authorized human reviewed the clinical condition that made the change valid is invisible to the log. You can add Purview on top, add a second log layer, add interpretability tooling, and the substrate problem remains. Logging records what happened. It cannot manufacture authorization the system was never built to express.

When an examiner arrives, the question is not whether your logs are comprehensive. The question is whether your logs can produce accountability. Those are different requests.

Logging captures attribution. Accountability requires attribution, authorization, and business-state justification. A comprehensive log tells the examiner the agent acted. An accountability record tells the examiner who authorized the specific business state transition, what condition made it valid, and whose name is attached to that decision before the action ran. The guidance conflates the two. The Purview portal shows the interaction. It does not show the organizational decision that sanctioned the write.
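The difference between the two records can be sketched as a data structure. This is a minimal illustration, not a schema from the guidance or from any vendor: the class names, field names, and the is_accountable check are all hypothetical, chosen only to show which fields a log entry lacks.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    """What a comprehensive log captures: attribution only."""
    actor_id: str
    action: str
    timestamp: datetime


@dataclass(frozen=True)
class AccountabilityRecord:
    """Hypothetical record: attribution plus authorization and justification."""
    entry: LogEntry
    authorized_by: Optional[str]       # named person or body that sanctioned the transition
    authorized_at: Optional[datetime]  # when that decision was made
    business_condition: Optional[str]  # the condition that made the transition valid


def is_accountable(record: AccountabilityRecord) -> bool:
    """True only if a named decision and a stated condition predate the action."""
    if not record.authorized_by or not record.business_condition:
        return False
    if record.authorized_at is None:
        return False
    # The decision must exist before the action ran, not be attached afterwards.
    return record.authorized_at <= record.entry.timestamp
```

A record with every LogEntry field filled in and the other three fields empty is a complete log and an empty accountability record. That is the case the Purview view describes.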

This is the accountability assumption at its clearest. Every governance structure built for human actors rests on the belief that somewhere, a human decided, even if implicitly. When an agent inherits permissions and acts autonomously, that belief becomes structurally false. The action is real. The human decision behind it may never have existed. No volume of logging establishes a decision that was never made.

The guidance strengthens observability. It does not establish accountability where the underlying system cannot express it.

The guidance assumed the system already knew who was accountable. Most systems of record do not.

The Agent Substrate Readiness Model separates two questions the guidance treats as one. Can the substrate support agent action technically? And has the organization authorized agents to use that substrate in a way that produces accountability, not just activity?

Tier Two of the model gives five tests. Three of them expose the gap in under an hour.

The first: does this system distinguish an agent write from a human write in its own audit trail, without a Purview overlay on top? A ServiceNow incident log that records "updated by service account" cannot answer an examiner's question about whether a human or an autonomous agent closed that incident. The substrate either produces agent-attributed writes or it does not.
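The first test can be sketched against exported audit rows. The row shape below is an assumption for illustration, not the schema of ServiceNow or any other product: actor_type stands in for whatever native field, if any, the substrate uses to mark who wrote.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ActorType(Enum):
    HUMAN = "human"
    AGENT = "agent"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class AuditRow:
    """Hypothetical audit row exported from a system of record."""
    record_id: str
    updated_by: str
    actor_type: Optional[str] = None  # assumed native field; absent in many systems


def classify_actor(row: AuditRow) -> ActorType:
    """Classify from the substrate's own field only, never by guessing from the name."""
    if row.actor_type is None:
        return ActorType.UNKNOWN
    try:
        return ActorType(row.actor_type)
    except ValueError:
        return ActorType.UNKNOWN


def passes_first_test(rows: List[AuditRow]) -> bool:
    """Pass only if every write is attributed to a human or an agent by the system itself."""
    return bool(rows) and all(classify_actor(r) is not ActorType.UNKNOWN for r in rows)
```

The sketch deliberately refuses to infer actor type from an account name such as a service account. An inference made by the reviewer is an overlay, which is what the test excludes.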

The second: does this system express authorization at the level of a business operation, not just access to a table? Salesforce object-level access tells the agent it can reach a record. It does not tell the agent, or the organization, that closing this deal requires documented approval first. The substrate question is whether the system can enforce that condition, not just label the role.
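The distinction between reaching a record and being authorized to perform the operation can be sketched in a few lines. Nothing here is a Salesforce API: the Deal type, the deal_editor role, and the approval_id field are hypothetical names used only to show the two checks side by side.

```python
from dataclasses import dataclass
from typing import Optional, Set


class AuthorizationError(Exception):
    """Raised when a write is attempted without the required authorization."""


@dataclass
class Deal:
    deal_id: str
    stage: str = "open"
    approval_id: Optional[str] = None  # reference to a documented approval, if one exists


def has_object_access(actor_roles: Set[str]) -> bool:
    """Object-level check: can this actor reach deal records at all?"""
    return "deal_editor" in actor_roles


def close_deal(deal: Deal, actor_roles: Set[str]) -> Deal:
    """Operation-level check: closing requires a documented approval first."""
    if not has_object_access(actor_roles):
        raise AuthorizationError("actor cannot access deal records")
    if deal.approval_id is None:
        # Access alone is not authorization for this business operation.
        raise AuthorizationError("closing a deal requires a documented approval")
    deal.stage = "closed"
    return deal
```

An agent holding the deal_editor role passes the first check and still fails the second until an approval exists. A substrate that stops at the first check labels the role without enforcing the condition.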

The third: can this system reconstruct the full business state at the moment of a write? Not what changed. What the record state was, what condition existed at the time, and whether any human approval was part of the workflow. SAP records payment postings. The forensic question after an incorrect posting is whether the system can show the authorization chain that validated it. Most cannot.
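What "reconstruct the full business state" asks of a substrate can be sketched as a ledger that snapshots the record at each write. This is an illustrative in-memory model under stated assumptions, not how SAP or any other system stores postings; SnapshotLedger, WriteEvent, and approval_ref are hypothetical names.

```python
import copy
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class WriteEvent:
    """Hypothetical write record: the full state, not just the changed field."""
    record_id: str
    actor_id: str
    written_at: datetime
    state_before: Dict[str, Any]
    state_after: Dict[str, Any]
    approval_ref: Optional[str]  # None means no human approval was part of the workflow


class SnapshotLedger:
    """Keeps an immutable-by-copy snapshot of record state at every write."""

    def __init__(self) -> None:
        self._events: List[WriteEvent] = []

    def record_write(self, record_id: str, actor_id: str,
                     current_state: Dict[str, Any], changes: Dict[str, Any],
                     approval_ref: Optional[str] = None) -> WriteEvent:
        # Deep-copy so later edits to the live record cannot rewrite history.
        before = copy.deepcopy(current_state)
        after = {**before, **copy.deepcopy(changes)}
        event = WriteEvent(record_id, actor_id, datetime.now(timezone.utc),
                           before, after, approval_ref)
        self._events.append(event)
        return event

    def events_for(self, record_id: str) -> List[WriteEvent]:
        """Return every captured write for one record, oldest first."""
        return [e for e in self._events if e.record_id == record_id]
```

A change log that stores only the changed field can answer "what changed". Answering the forensic question requires state_before and approval_ref to have been captured at write time, because they cannot be recovered afterwards.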

If the substrate fails any of the three, the logging layer the guidance recommends becomes evidence of activity, not evidence of accountability. Those are different exhibits in a regulatory examination.
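The rule in the paragraph above can be written down directly. This is a sketch with hypothetical names (SubstrateResult, evidence_class); the three booleans are whatever your own assessment of each system produced, and the model itself may score the tests differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubstrateResult:
    """Hypothetical outcome of the three tests for one system of record."""
    system: str
    attributes_agent_writes: bool
    enforces_operation_authorization: bool
    reconstructs_state_at_write: bool

    def failed_tests(self) -> List[str]:
        """Name each of the three tests this system failed, in order."""
        checks = [
            ("agent-attributed writes", self.attributes_agent_writes),
            ("operation-level authorization", self.enforces_operation_authorization),
            ("state reconstruction at write", self.reconstructs_state_at_write),
        ]
        return [name for name, passed in checks if not passed]


def evidence_class(result: SubstrateResult) -> str:
    """Failing any one test downgrades the logs from accountability to activity."""
    return "accountability" if not result.failed_tests() else "activity"
```

The check is an all-or-nothing conjunction on purpose. Two passes out of three still leave the logs as evidence of activity.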

The full Tier Two diagnostic, all five substrate tests and system-by-system profiles for Jira, Salesforce, ServiceNow, SAP, Workday, and the Microsoft stack, is here.

For every agent your organization has running right now, which of the systems it writes to can pass the first test without you configuring anything new? Count the ones that cannot. That number is your accountability gap. The examiner will calculate it regardless.