A support automation gets paused at 9:40 on a Tuesday. It did something odd overnight: it touched a batch of customer records it had no business touching, then opened a few tickets that read like a test of how far it could go. Someone hit the kill switch in the vendor console. Good instinct.

Then the harder meeting starts. Five people on a call, and one plain question on the screen: what can this identity still reach right now, and can we turn it off?

Nobody has a clean answer. The vendor pause stops the chat surface, but the agent also holds an API token for the ticketing system, a service account in the CRM, and a connection to an internal MCP server somebody stood up two weeks ago. The token was minted by a developer who is on vacation. The CRM account predates the current owner. The MCP server is on a list, somewhere. Forty minutes in, the team is still drawing boxes on a whiteboard instead of cutting access.

That gap between having a policy and being able to act on it is exactly what a new round of research just put a number on.

The confident teams were the breached teams

On June 9, FusionAuth published an AI identity report based on a survey of more than 300 technology and security leaders. The finding is genuinely backwards: the organizations most confident in their AI security posture reported the highest rates of AI identity breaches.

Sort the respondents by self-reported confidence and the breach rates run the wrong way.

Self-reported confidence	Confirmed AI identity breach
Extremely confident	84%
Very confident	64%
Somewhat confident	14%
Not so confident	17%

The teams who felt best about their controls were breached at roughly five times the rate of the teams who felt unsure. Across the whole sample, as summarized by IT Security Guru, 65% confirmed an AI identity breach in the past year, 23% reported a near miss, and only 12% came through unscathed.
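The "roughly five times" figure can be checked against the table. This is a minimal sketch: the percentages are the ones quoted above, and the variable names are only illustrative.

```python
# Confirmed AI identity breach rates by self-reported confidence,
# copied from the table above (percent of respondents in each group).
breach_rate = {
    "Extremely confident": 84,
    "Very confident": 64,
    "Somewhat confident": 14,
    "Not so confident": 17,
}

# Ratio between the most confident and the least confident group.
ratio = breach_rate["Extremely confident"] / breach_rate["Not so confident"]
print(f"Most vs. least confident: {ratio:.1f}x")

# The sample-wide outcomes should account for every respondent.
overall = {"confirmed breach": 65, "near miss": 23, "unscathed": 12}
print("Outcome shares sum to", sum(overall.values()), "percent")
```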

The obvious reading is that confidence is dangerous. The more useful reading is the one FusionAuth lands on: confidence tracks deployment velocity, not protection. The teams furthest ahead on AI have the largest attack surface, the most non-human identities in play, and the most chances to get burned. Their confidence is real. It is just measuring how much they have shipped, not how well they can contain it.

Why the policy binder doesn't help at 9:40 on Tuesday

A few other numbers from the report explain why the confident teams still got hit.

Start with shadow AI: 80% of organizations reported AI tools running outside official channels. That alone breaks any inventory built from a procurement list. The identities you cannot see are the ones you cannot scope or revoke.

Then there is how teams staffed up. Organizations that hired externally for AI talent reported an 85% confirmed breach rate. Organizations that trained existing teams reported 33%. The report is careful not to claim one causes the other, but the pattern fits the velocity story: aggressive external hiring tends to come with aggressive deployment, and the identity layer rarely keeps pace.

The deployment-model finding is the one to handle with the most care. Teams running on multi-tenant SaaS identity platforms reported 83% confirmed breaches, versus 38% for self-hosted or on-premises identity. FusionAuth says plainly that the survey cannot prove causation. But it makes a sharper point worth sitting with: across the data, architecture was more predictive of breach outcomes than governance maturity, policy coverage, or investment level. The highest-maturity cohort had comprehensive governance policies and significant security spend, and still posted high incident rates.

This is the part that should land for any operator with a tidy AI policy doc. The weakest controls in the whole survey were the runtime ones. Only 70% had formalized auditing of what their AI agents actually accessed. Only 73% had formalized revoking access when an agent no longer needed it. And 88% said AI deployment was outpacing their identity and security infrastructure.

A policy answers the easy questions. The hard questions are operational, and FusionAuth's CEO Brian Bell framed them as architecture, not paperwork:


"Written policies don't answer the questions that matter: Can you scope what each agent can access? Can you see what it's doing? Can you prove what it accessed after the fact? Can you revoke access before a near miss becomes something worse? Architecture answers those questions. Policy alone does not."

You do not find out which side of that line you are on by reading your own documentation. You find out by running a drill.

The scale problem underneath all of it

One more figure sets the stakes. Citing Entro Labs' NHI and Secrets Risk Report for the first half of 2025, FusionAuth notes that non-human identities now outnumber human identities by 144 to 1 in the average enterprise. Every agent, service account, API token, automation, and MCP connection is an identity that can be issued, granted scope, and forgotten.

Humans get offboarded. Someone leaves, IT gets a ticket, accounts get disabled. Non-human identities mostly do not get offboarded. A token minted for a two-week pilot keeps working for a year because nobody owns its retirement. Multiply that by 144 and the revoke question stops being theoretical.

That is the environment your next agent rollout lands in.
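The forgotten-token problem can be made concrete with a small check over an inventory. What follows is a sketch under stated assumptions: the `NonHumanIdentity` fields, the `retirement_findings` helper, and the sample entries are all hypothetical, and a real inventory would come from your identity provider and secrets stores rather than a hard-coded list.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class NonHumanIdentity:
    name: str
    kind: str                   # e.g. "api_token", "service_account"
    owner: Optional[str]        # human accountable for retiring it, if any
    issued: date
    retire_by: Optional[date]   # planned retirement date, if anyone set one

def retirement_findings(identities, today):
    """Return (name, reason) pairs for identities nobody is set to retire."""
    findings = []
    for ident in identities:
        if ident.owner is None:
            findings.append((ident.name, "no owner"))
        if ident.retire_by is None:
            findings.append((ident.name, "no retirement date"))
        elif ident.retire_by < today:
            findings.append((ident.name, "past retirement date, still live"))
    return findings

# Hypothetical sample data, for illustration only.
inventory = [
    NonHumanIdentity("pilot-ticketing-token", "api_token", None,
                     date(2024, 6, 1), date(2024, 6, 15)),
    NonHumanIdentity("crm-sync", "service_account", "dana",
                     date(2023, 1, 10), None),
    NonHumanIdentity("support-agent-login", "agent_login", "lee",
                     date(2025, 3, 1), date(2026, 3, 1)),
]

for name, reason in retirement_findings(inventory, today=date(2025, 6, 20)):
    print(f"{name}: {reason}")
```

The two-week pilot token is flagged twice here, once for having no owner and once for outliving its planned retirement, which is the pattern described above.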

The AI identity revoke drill

Here is the practical move. Before you give agents broader access, take one workflow you have already deployed and run a revoke drill against it. Not the whole estate. One agent, one afternoon.

The drill is a single question expanded into nine: if this identity went rogue or got hijacked right now, could we contain it, and could we prove what happened?

[Figure: AI identity revocation drill showing one agent access path through tools, logs, and a cutoff gate. Caption: Map the identity path before an agent gets broader access.]

Walk the workflow through these nine steps, out loud, with the people who would actually be in the room during an incident.

1. Inventory. Name every identity this workflow uses. Not just the obvious agent login: the API tokens, OAuth grants, service accounts, signing keys, and MCP connections behind it. If you cannot list them in five minutes, that is the finding.
2. Owner. For each identity, name a human who owns its lifecycle today. Not who created it. Who is accountable for retiring it. "The developer who set it up" is not an owner if that person has moved teams.
3. Scope. Write down what each identity is allowed to reach: which systems, which records, which actions. Then check it against reality. Over-scoped tokens are the rule, not the exception.
4. Recent access. Pull what this identity actually touched over the past month. If you cannot answer "what did it access," you have found the exact control that 30% of FusionAuth's respondents had not formalized.
5. Live tokens and sessions. Find every active credential and session for this identity right now. Long-lived tokens, refresh tokens, cached sessions. A console pause that leaves a working API token is not a revocation.
6. Cut-off path. Write the literal steps to revoke each credential, in order, with who can execute them. "Disable in the IdP" is not enough if the agent also holds a token the IdP never sees.
7. Downstream breakage. Name what else breaks when you cut this identity. Shared service accounts are the trap: revoke one agent and you take down three unrelated jobs. You want to know this before the incident, not during it.
8. Evidence kept. Decide what you preserve before and during cutoff: access logs, the agent's recent actions, the records it touched. If you revoke first and ask questions later, you may destroy the trail you need.
9. Restart path. Write how you safely bring the workflow back with fresh, correctly scoped credentials. Containment that you cannot reverse becomes its own outage.

The point of the drill is not to pass. The point is to find the step where the room goes quiet. That is your real exposure, and it is almost never the step you expected.
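One way to capture where the room went quiet is to record each step's answer and list the blanks. This is a minimal sketch, not a prescribed tool: the `DrillRecord` class, the `quiet_steps` helper, and the sample answers are all hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass
class DrillRecord:
    """One identity walked through the nine steps. Empty string = no answer."""
    inventory: str = ""
    owner: str = ""
    scope: str = ""
    recent_access: str = ""
    live_tokens_and_sessions: str = ""
    cut_off_path: str = ""
    downstream_breakage: str = ""
    evidence_kept: str = ""
    restart_path: str = ""

def quiet_steps(record):
    """Return the drill steps, in order, that the team could not answer."""
    return [f.name for f in fields(record)
            if not getattr(record, f.name).strip()]

# Hypothetical answers from one afternoon's drill.
record = DrillRecord(
    inventory="agent login, ticketing API token, CRM service account",
    owner="",            # the token's creator has moved teams
    scope="tickets: read/write; CRM: read",
    recent_access="",    # nobody could pull the access log
    live_tokens_and_sessions="one long-lived API token, one refresh token",
    cut_off_path="pause in vendor console, then rotate ticketing token",
    downstream_breakage="nightly export shares the CRM service account",
    evidence_kept="export access logs before rotation",
    restart_path="reissue token scoped to tickets: read",
)

print(quiet_steps(record))
```

In this made-up run the gaps are ownership and recent access; the design choice is simply that an unanswered step is recorded as a finding rather than skipped.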

MCP servers raise the stakes on the same question

If the agents in scope can call tools, the revoke drill is not optional, because the boundary just got wider.

In a June 15 post on MCP authorization, FusionAuth makes the case directly: MCP servers can expose sensitive data and business logic to AI agents, and builders are often prioritizing functionality over security. Because agents act autonomously, a misconfigured MCP server can do more damage than a misconfigured REST endpoint would. A human pauses when a screen looks wrong. An agent keeps calling.

The protocol itself does not save you. FusionAuth notes that MCP does not enforce security at the protocol level, which means your implementation choices are the security. The post recommends putting an authorization server in front of any MCP server that serves confidential data or user-specific functionality, so that authentication and authorization are real rather than assumed.

This is the same shape as the broader finding. The question is not only "did the agent authenticate." It is: what is this tool call authorized to do, what got logged, and can you revoke the grant without restarting the world?
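Those three questions (what is authorized, what got logged, can the grant be revoked in isolation) can be sketched as a small in-memory gate. Everything here is an assumption for illustration: `Grant`, `ToolGate`, and the tool names are hypothetical and are not part of the MCP specification or any FusionAuth API. A real deployment would put an actual authorization server in this position.

```python
from dataclasses import dataclass, field

@dataclass
class Grant:
    agent_id: str
    allowed_tools: frozenset
    revoked: bool = False

@dataclass
class ToolGate:
    grants: dict = field(default_factory=dict)     # agent_id -> Grant
    audit_log: list = field(default_factory=list)  # (agent_id, tool, decision)

    def issue(self, agent_id, tools):
        self.grants[agent_id] = Grant(agent_id, frozenset(tools))

    def revoke(self, agent_id):
        # Flips one grant; other agents and the gate itself keep running.
        if agent_id in self.grants:
            self.grants[agent_id].revoked = True

    def authorize(self, agent_id, tool):
        grant = self.grants.get(agent_id)
        if grant is None:
            decision = "deny:no_grant"
        elif grant.revoked:
            decision = "deny:revoked"
        elif tool not in grant.allowed_tools:
            decision = "deny:out_of_scope"
        else:
            decision = "allow"
        # Every decision is logged, including denials.
        self.audit_log.append((agent_id, tool, decision))
        return decision == "allow"

gate = ToolGate()
gate.issue("support-agent", {"tickets.read"})
gate.issue("export-job", {"crm.read"})

gate.authorize("support-agent", "tickets.read")  # in scope
gate.authorize("support-agent", "crm.update")    # never granted
gate.revoke("support-agent")
gate.authorize("support-agent", "tickets.read")  # grant now revoked
gate.authorize("export-job", "crm.read")         # unaffected by the revoke

for entry in gate.audit_log:
    print(entry)
```

The point of the sketch is the shape: authorization is checked per tool call, denials leave a record, and revoking one agent does not require restarting anything else.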

The neutral name for the failure underneath all of this is OWASP's LLM06:2025, Excessive Agency. OWASP defines it as the vulnerability that lets damaging actions happen when an LLM-based system has too much functionality, too many permissions, or too much autonomy, and traces it to three root causes: excessive functionality, excessive permissions, excessive autonomy. The revoke drill is a way to test all three on one workflow before you find out the hard way. A field that an outsider can write into is its own version of this problem; we wrote about that boundary in the transaction-memo attack surface.

The drill on one page

Keep the artifact plain. A team should be able to fill this out for one workflow in an afternoon.

# AI identity revoke drill

Workflow:
[Name the agent or automation, and what it does]

Identities in scope:
[Agent login, API tokens, OAuth grants, service accounts, MCP connections]

For each identity:
- Owner (human accountable for retiring it):
- Scope (systems, records, actions it can reach):
- Recent access (what it touched over the past month):
- Live credentials/sessions (active tokens, refresh tokens, sessions):
- Cut-off steps (in order, with who executes):
- Downstream breakage (what else depends on this identity):

Incident handling:
- Evidence preserved before/during cutoff:
- Restart path with fresh, correctly scoped credentials:

Drill result:
- Time to fully revoke one identity: [____]
- Step where the room went quiet: [____]
- Owner and date to fix it: [____]

The value is not the formatting. The value is forcing the team to answer the runtime questions in calm conditions, with the right people present, before an agent's blast radius grows.

A few patterns show up almost every time a team runs this. The inventory is incomplete because of shadow AI. At least one identity has no living owner. Scope is wider than anyone remembered. And the cut-off path has a hidden dependency that would take down something unrelated. None of those are exotic. They are the ordinary debt of moving fast, which is precisely what FusionAuth's confidence gap is measuring.
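The hidden-dependency pattern is easy to check once the inventory exists. A minimal sketch, assuming the inventory can be flattened into (consumer, credential) pairs; the `collateral` helper and the job names are hypothetical.

```python
def collateral(usage, target):
    """List other consumers that break if every credential `target` uses is revoked.

    `usage` is a list of (consumer, credential) pairs from the inventory step.
    """
    target_creds = {cred for consumer, cred in usage if consumer == target}
    return sorted({consumer for consumer, cred in usage
                   if cred in target_creds and consumer != target})

# Hypothetical inventory: one agent shares a service account with three jobs.
usage = [
    ("support-agent", "ticketing-token"),
    ("support-agent", "crm-service-account"),
    ("nightly-export", "crm-service-account"),
    ("billing-sync", "crm-service-account"),
    ("report-builder", "crm-service-account"),
]

print(collateral(usage, "support-agent"))
```

With this sample data, cutting the agent's credentials also stops three jobs that have nothing to do with it, which is the kind of surprise the drill is meant to surface in calm conditions.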

When the drill surfaces a fix, it usually wants a control, not another policy line: a review gate before access expands, an approval queue where revocation and human review meet, or a receipt of what the agent did and why. Those are the runtime artifacts the survey found teams were missing.

Test one identity before it becomes infrastructure

The uncomfortable lesson in the FusionAuth data is not that breaches are common. It is that the teams furthest ahead felt safest right up until the incident, because their confidence was tracking how much they had deployed, not how well they could contain it.

You close that gap one workflow at a time. Pick the agent that already touches real systems. Run the nine steps. Note the time it took to fully revoke one identity, and the step where the room went quiet. Then fix that step before the next rollout widens the blast radius.

If you want a structured version of this, our AI workflow security review worksheet walks one workflow through the same questions, and our broader data security work covers the controls behind them.

When you are ready to pressure-test a live agent, book an AI workflow security review. Test one agent identity before it becomes infrastructure.
