Agent delegation is the silent kill chain.
Take a routine workflow. The user asks their AI to schedule a meeting with three external counterparties. The AI calls the calendar service. The calendar service is itself an AI, which queries each counterparty's availability through that counterparty's own AI assistant. Each of those assistants, to answer the availability question, reads its counterparty's calendar, which requires authorization to read calendar state. The original user's intent (schedule a meeting) has now produced a chain of at least four AI agents, each acting under authorization that traces back to the user, none of which the user has individually approved.
This is agent delegation. It is the structural pattern of any agentic workflow that crosses more than one service. It is also the dominant attack surface for the next generation of AI-enabled compromise, and the security category currently has no standard answer for it.
This post is about why delegation is structurally dangerous, what attacks it enables, and what an architecture has to do to make it auditable.
Every agent-to-agent delegation is a partially-reversible high-magnitude action that should be treated, by default, as requiring confirmation. Almost no current architecture treats it that way. That gap is the kill chain.
Why delegation is different from authorization
Traditional authorization is bilateral. The user authorizes a service. The service does what it is authorized to do. The user can revoke. The service stops. The relationship is two parties, one principal, one delegate, with explicit consent at the boundary.
Agent delegation is not bilateral. The user authorizes their agent. The agent authorizes a downstream agent. The downstream agent authorizes a further-downstream agent. By the time the chain reaches the actual service that takes the consequential action, there might be five or six hops between the original user and the action. The user has consented to the original delegation. They have not consented to any of the subsequent hops.
This structural difference produces three risks that bilateral authorization does not have.
Risk one: invisible fan-out. The user authorized one agent. That authorization, depending on how it was structured, might fan out to many agents the user has never heard of. Each of those agents now operates with delegated authority that traces back to the original user. The user has no way to enumerate the chain.
Risk two: capability inflation. Each delegation hop is, in most current architectures, a token handoff. The receiving agent inherits the capabilities of the sending agent. Without explicit constraints on what the receiving agent can do with the inherited capabilities, the chain can produce capability sets at hop five that the original principal never granted.
Risk three: revocation lag. When the user revokes the original agent, the revocation might propagate to that agent. It typically does not propagate to the downstream agents in the chain, which continue to operate on the basis of credentials they were issued before the revocation. The compromised chain continues acting until each hop is individually told to stop.
These three risks compound. A compromise at any point in the chain produces a chain that operates with the original user's authority, with capabilities that may have inflated past the user's intent, and which the user cannot revoke globally without manually unwinding each hop.
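The three risks can be seen together in a toy model of token-handoff delegation. This is a minimal sketch, not any real platform's behavior: `NaiveAgent`, its fields, the `own_config` parameter, and the capability strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Set


@dataclass
class NaiveAgent:
    """Hypothetical agent that hands its bearer token to whatever it delegates to."""
    name: str
    token: str
    capabilities: Set[str]
    children: List["NaiveAgent"] = field(default_factory=list)
    revoked: bool = False

    def delegate(self, child_name: str, own_config: Optional[Set[str]] = None) -> "NaiveAgent":
        # Token handoff: the child copies the parent's token and capabilities,
        # then unions in whatever its own configuration claims (inflation).
        child = NaiveAgent(child_name, self.token, self.capabilities | (own_config or set()))
        self.children.append(child)
        return child

    def can_act(self, capability: str) -> bool:
        # Each hop consults only its local state, never its ancestors.
        return not self.revoked and capability in self.capabilities


def walk(agent: NaiveAgent) -> Iterator[NaiveAgent]:
    """Enumerate the chain (the view the user normally does not have)."""
    yield agent
    for child in agent.children:
        yield from walk(child)


user_agent = NaiveAgent("user-agent", "tok-123", {"calendar:read"})
calendar_ai = user_agent.delegate("calendar-ai")
counterparty_ai = calendar_ai.delegate("counterparty-ai", own_config={"calendar:write"})

user_agent.revoked = True  # the user revokes the only agent they know about

still_active = [a.name for a in walk(user_agent) if not a.revoked]
print(still_active)                               # ['calendar-ai', 'counterparty-ai']
print(counterparty_ai.can_act("calendar:write"))  # True, though never granted by the user
```

After the revocation, two hops are still live, and one of them holds a capability the user never granted. Nothing in the model is malicious; the outcome follows from the handoff structure alone.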
The sub-agent spawning case
The DeepMind taxonomy of AI Agent Traps identifies sub-agent spawning as one of the highest-success-rate attack patterns in its study. The cited research reports success rates of between 58 and 90 percent against orchestrator agents that have authority to spawn sub-agents.
The pattern is the kill chain in concentrated form. The orchestrator is reading a public repository as part of its task. The repository contains a README that, parsed by the orchestrator, instructs it to spawn a "critical agent" with a specific system prompt. The orchestrator, having delegation authority, complies. The spawned sub-agent operates with credentials inherited from the orchestrator, executing the attacker's prompt as its system instruction. The user, watching the original task complete, has no way of knowing that an additional sub-agent now exists, operating with their authority, doing whatever the attacker's prompt told it to do.
This is not a hypothetical. It is a documented attack class with proof-of-concept implementations against deployed systems.
The defense against this is structural. Either the orchestrator should not have delegation authority, or the delegation authority should be bounded by the principal's explicit constraints, or the spawning should require fresh attestation from the principal at each hop. The current state in production is none of the above.
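A rough sketch of what a structural spawn check could look like follows. It combines all three options (delegation authority, principal-set bounds, fresh attestation) in one guard purely for illustration; any one of them could be enforced on its own. `Grant`, `authorize_spawn`, `spawn_outcome`, and the capability strings are hypothetical names, not part of any published spec.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Grant:
    """Hypothetical record of what the principal granted an agent."""
    capabilities: FrozenSet[str]
    may_delegate: bool = False


class DelegationDenied(Exception):
    """Raised when a spawn request falls outside the principal's constraints."""


def authorize_spawn(parent: Grant, requested: FrozenSet[str], principal_attested: bool) -> Grant:
    if not parent.may_delegate:
        raise DelegationDenied("parent has no delegation authority")
    excess = requested - parent.capabilities
    if excess:
        raise DelegationDenied(f"not granted by the principal: {sorted(excess)}")
    if not principal_attested:
        raise DelegationDenied("no fresh attestation from the principal for this hop")
    # The child does not get delegation authority unless separately granted.
    return Grant(capabilities=requested, may_delegate=False)


def spawn_outcome(parent: Grant, requested: FrozenSet[str], principal_attested: bool) -> str:
    try:
        authorize_spawn(parent, requested, principal_attested)
        return "allowed"
    except DelegationDenied as exc:
        return f"denied: {exc}"


reader = Grant(frozenset({"repo:read"}))
orchestrator = Grant(frozenset({"repo:read"}), may_delegate=True)

print(spawn_outcome(reader, frozenset({"repo:read"}), True))
print(spawn_outcome(orchestrator, frozenset({"repo:read", "secrets:read"}), True))
print(spawn_outcome(orchestrator, frozenset({"repo:read"}), False))
print(spawn_outcome(orchestrator, frozenset({"repo:read"}), True))
```

In the README scenario, the injected instruction would reach `authorize_spawn` as an ordinary request. It would fail on the first check if the orchestrator has no delegation authority, and on the third if the principal never attested the hop. What the README says has no bearing on the decision.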
What the consequence matrix says
When we model the consequence matrix for agent actions, delegation events are categorized as partially-reversible and high-magnitude.
Partially-reversible because once a sub-agent is spawned and given a task, the spawning event itself cannot be cleanly undone. The sub-agent might have already taken actions in the world before revocation reaches it. Those actions might be reversible (a database write that can be rolled back) or not (an email that has already been sent, a payment that has already cleared). The reversibility depends on what the sub-agent did, not on the spawning itself.
High-magnitude because a delegated agent operates with the principal's authority. Compromise of the delegation is compromise of the principal's authority. The blast radius of a compromised sub-agent is whatever the original principal authorized.
The combination of partially-reversible and high-magnitude puts delegation in the highest-priority quadrant of the consequence matrix. Actions in this quadrant should hold pre-execution and require explicit confirmation. The Execution Gate in our policy plane treats them this way by default.
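As a sketch of how such a quadrant rule might be encoded: the only cell taken from the text is that partially-reversible, high-magnitude actions hold for confirmation. The enum names, the treatment of the other cells, and the `read_calendar` classification are assumptions made for illustration, not the Execution Gate's actual rules.

```python
from enum import Enum


class Reversibility(Enum):
    REVERSIBLE = "reversible"
    PARTIALLY_REVERSIBLE = "partially-reversible"
    IRREVERSIBLE = "irreversible"


class Magnitude(Enum):
    LOW = "low"
    HIGH = "high"


def gate_decision(reversibility: Reversibility, magnitude: Magnitude) -> str:
    """Return "hold" for actions that must be confirmed before execution."""
    # Assumption: anything high-magnitude that cannot be cleanly undone holds.
    if magnitude is Magnitude.HIGH and reversibility is not Reversibility.REVERSIBLE:
        return "hold"
    return "execute"


# Hypothetical classification table; only the "delegate" row comes from the post.
ACTION_CLASSES = {
    "delegate": (Reversibility.PARTIALLY_REVERSIBLE, Magnitude.HIGH),
    "read_calendar": (Reversibility.REVERSIBLE, Magnitude.LOW),
}

for action, (reversibility, magnitude) in ACTION_CLASSES.items():
    print(action, "->", gate_decision(reversibility, magnitude))
```

The decision is a lookup on the action's class, made before the action runs. It does not depend on the agent's own description of what it is about to do.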
The current state in most production agent systems is that delegations execute as normal actions, with no special handling, on the assumption that the receiving agent's own security posture will catch any problems. This assumption is wrong. Receiving agents that are spawned with attacker-controlled prompts do not have a security posture. They have an attacker.
What auditable delegation looks like
The architecture that closes this kill chain has four properties. Each is structural, not configurable.
Each delegation produces a signed attestation. The principal signs an attestation specifying the delegated agent, the capability scope, the expiry, and the counterparty scope. This is what TAP provides. Without signed attestations, the chain has no audit trail; with them, every hop is recorded.
Delegation transitivity is bounded. A principal can grant an agent the authority to delegate further, but the grant carries explicit constraints on what subsequent delegations can authorize. An agent that has not been granted delegation authority cannot extend the chain. An agent that has delegation authority cannot grant capabilities the original principal did not grant.
Capability scope cannot inflate. A receiving agent inherits a subset of the sending agent's capabilities, never a superset. The structured capability set in the attestation is the upper bound. The receiving agent's actions are evaluated against the bound, not against whatever the receiving agent's own configuration claims.
Revocation propagates across the chain. When the principal revokes the original agent, the revocation propagates to every agent in the chain whose authority traces back to the revoked agent. This is what VARP provides. Without VARP-style propagation, revocation is per-hop and slow; with it, the full chain is closed in under a second.
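The four properties can be sketched in one small ledger. This is an illustration under stated assumptions, not the TAP or VARP wire format: the field names, the `DelegationLedger` class, and the ids are hypothetical, and HMAC with a shared demo key stands in for whatever signature scheme a real deployment would use. Revocation here "propagates" because every validity check walks back to the root, which is one possible design among several.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional, Set, Tuple

SIGNING_KEY = b"demo-only-key"  # assumption: stand-in for the principal's real signing key


@dataclass(frozen=True)
class Attestation:
    attestation_id: str
    parent_id: Optional[str]  # None when the principal signs the first hop
    agent: str
    capabilities: Tuple[str, ...]
    counterparties: Tuple[str, ...]
    expires_at: float


def sign(att: Attestation) -> str:
    payload = json.dumps(asdict(att), sort_keys=True).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


class DelegationLedger:
    def __init__(self) -> None:
        self.records: Dict[str, Tuple[Attestation, str]] = {}
        self.revoked: Set[str] = set()

    def record(self, att: Attestation) -> None:
        if att.attestation_id in self.records:
            raise ValueError("duplicate attestation id")
        if att.parent_id is not None:
            if att.parent_id not in self.records:
                raise ValueError("unknown parent attestation")
            parent = self.records[att.parent_id][0]
            # Scope is an upper bound: a child may narrow it but never widen it.
            if not (set(att.capabilities) <= set(parent.capabilities)
                    and set(att.counterparties) <= set(parent.counterparties)):
                raise ValueError("child scope exceeds parent scope")
        self.records[att.attestation_id] = (att, sign(att))

    def revoke(self, attestation_id: str) -> None:
        self.revoked.add(attestation_id)

    def is_valid(self, attestation_id: str, now: float) -> bool:
        # Walk toward the root; one revoked, expired, or tampered ancestor
        # invalidates every hop below it.
        current: Optional[str] = attestation_id
        while current is not None:
            if current in self.revoked or current not in self.records:
                return False
            att, signature = self.records[current]
            if now >= att.expires_at or not hmac.compare_digest(signature, sign(att)):
                return False
            current = att.parent_id
        return True


ledger = DelegationLedger()
scope = dict(capabilities=("calendar:read",), counterparties=("example.org",), expires_at=2000.0)
ledger.record(Attestation("a1", None, "user-agent", **scope))
ledger.record(Attestation("a2", "a1", "calendar-ai", **scope))
ledger.record(Attestation("a3", "a2", "counterparty-ai", **scope))

valid_before = ledger.is_valid("a3", now=1000.0)
ledger.revoke("a1")  # the principal revokes only the first hop
valid_after = ledger.is_valid("a3", now=1000.0)
print(valid_before, valid_after)  # True False
```

The trade-off in this sketch is that every action pays for a walk up the chain. A push-based protocol would instead notify each hop at revocation time. The sub-second figure above is a property of the protocol described in the post, not something this sketch demonstrates.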
These four properties are not unusual. They are the agent-equivalent of what every regulated industry already requires of its delegation structures. A wire transfer that traverses four counterparty banks does not arrive at the destination with capabilities the originating bank did not grant. An assignment of a contract to a sub-contractor does not produce a sub-contractor with rights the original contractor did not have. The structures exist in human institutions because the alternative produces predictable harm.
The same structures need to exist for AI agents. They do not, in current production. The gap is the opportunity.
The detection side, briefly
Detection, on the request side, contributes a tier signal that the policy engine consumes. For delegation events specifically, three patterns produce elevated tiers:
The orchestrator agent has not previously spawned sub-agents in this user's session. The Bayesian model on the user's baseline registers the deviation. The tier rises.
The spawning content (the README, the email, the document the orchestrator was reading) has structural properties that match attacker-controlled patterns. The Isolation Forest catches some of these. Most are caught by purpose-specific detectors that we run alongside the four-model ensemble.
The spawning is occurring in a session where prior turns have shown trajectory anomalies (see the context poisoning post). The LSTM and CUSUM both contribute to the elevated tier.
When any of these conditions is present, the policy engine holds the delegation until the user confirms. The confirmation is on the structured action: "spawn a sub-agent with capability X, scope Y, expiry Z, against counterparty W." The user reads this and decides. They do not have to interpret the agent's natural-language framing. The deterministic policy is the layer that survives.
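The hold-and-confirm step might be sketched as follows. The detectors themselves are out of scope here; the three booleans stand in for their outputs, and `Signals`, `DelegationRequest`, `should_hold`, and `confirmation_prompt` are hypothetical names. The point of the sketch is that the prompt is rendered from structured fields only, never from text the agent wrote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """Assumed detector outputs for one delegation event."""
    first_spawn_in_session: bool
    content_matches_attack_pattern: bool
    trajectory_anomaly: bool


@dataclass(frozen=True)
class DelegationRequest:
    capability: str
    scope: str
    expiry: str
    counterparty: str


def should_hold(signals: Signals) -> bool:
    # Any single elevated signal is enough to hold the delegation.
    return any((
        signals.first_spawn_in_session,
        signals.content_matches_attack_pattern,
        signals.trajectory_anomaly,
    ))


def confirmation_prompt(request: DelegationRequest) -> str:
    # Built from the structured action only; no agent-authored prose is included.
    return (
        f"spawn a sub-agent with capability {request.capability}, "
        f"scope {request.scope}, expiry {request.expiry}, "
        f"against counterparty {request.counterparty}"
    )


request = DelegationRequest("repo:read", "one repository", "15 minutes", "example.org")
signals = Signals(first_spawn_in_session=True,
                  content_matches_attack_pattern=False,
                  trajectory_anomaly=False)

if should_hold(signals):
    print("HELD:", confirmation_prompt(request))
```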
What this means for buyers
A few specific things to ask of any AI agent platform you are evaluating.
How does the platform handle agent-to-agent delegation? If the answer is "the receiving agent inherits the sending agent's credentials," that is the kill chain pattern. The platform has no structural defense against sub-agent spawning attacks.
How does the platform record delegation events? If the answer is "in application logs," those logs are not audit trails. They are operational telemetry that the platform's operator can edit. A signed attestation, recorded in a chain the user can verify independently, is what audit looks like.
How does the platform propagate revocation? If the answer is "we revoke at the original endpoint, the receiving agents will eventually time out," that is hours of exposure. A protocol like VARP closes the chain in under a second.
How does the platform bound capability inflation? If the answer is unclear or "we trust the receiving agent's own posture," capability inflation is a structural risk in the platform regardless of the receiving agent's intent.
Most platforms today fail two or more of these four. That is not a criticism of the platform vendors. It is a description of where the category is. The platforms that pass all four will be the ones that survive the next generation of agent compromise.
The takeaway
Agent delegation is not an extension of bilateral authorization. It is a structurally different problem with structurally different risks. The current production state in most agent platforms treats delegation as a token handoff, which is the kill chain pattern that DeepMind and others have documented as producing attack success rates of 58 to 90 percent.
Closing the chain requires signed attestations, bounded transitivity, capability scope as upper bound, and propagating revocation. These are protocol-level properties. They cannot be added as features to existing platforms; they have to be designed in.
We have designed them in. The architecture is at runvigil.ai/specs. The implementations are in source. The bar is set publicly so the rest of the category can either match it or explain why their architecture survives without it.
We expect, in the next twelve to eighteen months, several public agent compromises that trace to the delegation kill chain. The post-mortems will be uncomfortable for the platforms involved. The framework for what should have been done is already published.