Threat Research

Why providers cannot build the defense layer.

A note on who is writing this. I am Marek. Before Vigil, I spent twenty years inside the marketing functions of YouTube, Red Bull, Unilever, and PepsiCo. Marketing teams sit close enough to commercial decisions to know how product safety arguments actually get resolved when they conflict with launch deadlines. On a long enough timeline, the safety arguments almost always lose. Not because anyone is malicious. Because the people who would lose their jobs for missing the deadline have more institutional weight than the people who would lose their jobs after the eventual incident. That asymmetry produces the same outcome in every category I have worked in.

This post is about why that asymmetry will produce the same outcome in AI safety, and why no AI provider, regardless of its intent, can structurally operate the defense layer that audits its own product.

This is not an argument against any provider. It is an argument against a structure.

Three provider conflicts, in plain language

Conflict one: revenue per token

Every major AI provider earns revenue per token of inference. The unit economics of every commercial AI product point in the same direction: more usage, more revenue. A safety layer operated by the provider is therefore structurally in tension with the provider's commercial layer. Every alert that interrupts a user's session is, on the margin, a session that produces less revenue. Every capability constraint that requires confirmation is, on the margin, friction that competing providers without the constraint can use as a sales advantage.

The provider can manage this tension with internal policies, executive commitment, and well-staffed safety teams. Many of them do. The point is not that they fail to try. The point is that the tension is permanent, and on a long enough timeline, the commercial side wins more arguments than the safety side. Every consumer industry of consequence has produced a generation of evidence for this pattern. Tobacco companies funded their own research on smoking-related illness. Pharmaceutical companies ran their own post-market surveillance. Aircraft manufacturers investigated their own crashes. In each case, the structure produced a predictable failure mode.

AI is the early stage of the same pattern. The structure is the same. The failure mode will be the same. The only variable is how long it takes.

Conflict two: the audit cannot indict the auditor

When a provider's safety team finds that the provider's product has caused a harm, the safety team's options are limited by the fact that the safety team's salary is paid by the entity producing the harm. There is no public registration body to escalate to. There is no regulator with operational authority to subpoena the underlying logs. There is no independent counterparty in a position to demand cryptographic proof of what occurred.

This is not a problem of integrity. The safety teams I have worked with at large companies are usually staffed by exactly the people you would want there, often at significant pay penalty relative to what they could earn in pure product roles. The problem is that integrity does not survive contact with the institutional gravity of "we cannot publish this finding because it would expose the company to enterprise litigation."

I have watched the conversation play out in three industries. The wording differs. The outcome does not. The finding gets sanitized, then summarized, then mentioned in passing, then archived. The harm continues until something external forces a different outcome.

For AI, the external forcing function does not yet exist. NIST is in the early stage of producing one. The EU AI Act is in the early stage of producing one. Until those structures exist, the audit cannot indict the auditor, because there is nowhere for the indictment to go.

Conflict three: cross-provider visibility is structurally impossible

A user today might run Claude in the morning, Cursor at lunch, ChatGPT in the afternoon, and Gemini in the evening. Their actions across these surfaces compose into a single workflow with a single set of consequences. Compromise in one surface affects the others. Drift in one surface produces poisoned context for the others. Account compromise on one surface authorizes the attacker across the workflow.

No single provider can see this. To see it, they would have to become a proxy, intercepting the other providers' traffic. They will not do this. To do it, they would have to give up their own model's privileged position in the request path, which they will not do. To do it, they would have to operate infrastructure outside their commercial boundary, which they will not do.

This is not a competitive limitation. It is a structural one. The only entity that can see across providers is one that sits outside all of them. There is no path for any single provider to assemble cross-provider visibility without abandoning their commercial position. None of them will. This is correct from their perspective. It is also why the cross-provider defense layer, as a category, cannot be built by them.

What category separation looks like

The financial industry got here first. Issuers cannot audit their own filings. Auditors cannot also be issuers. The two functions are required by structure to sit in different organizations, with different revenue models, different boards, different liability exposures, and different licensing regimes. The structural separation is enforced by regulation, not by gentlemen's agreement. The reason it is enforced by regulation is that the gentlemen's agreement failed too many times in the previous century.

Pharmaceuticals got here second. Drug companies cannot certify their own clinical trials. The certification function sits outside the company, in the FDA and in independent review boards, with different revenue models, different incentive structures, and different liability regimes. Aircraft manufacturers got here third. Crash investigations sit with the NTSB, not with Boeing.

The pattern is consistent. Industries that handle consequential outputs eventually require that the entity producing the outputs cannot also be the entity certifying them. The transition is forced by an incident, then institutionalized over a decade.

AI is at the start of this transition. The transition will happen. The question is whether the eventual structure will be designed deliberately, in advance of the forcing incident, or hastily, after.

What this means for Vigil's positioning

Vigil exists in this gap. We do not build models. We do not have a commercial interest in any provider's continued usage. We do not earn revenue per token. We sit outside the providers we monitor, on the user's machine, with cryptographic audit that can be verified without trusting us either.

The position is structural. We cannot become an AI provider without abandoning it. We cannot accept funding from, or be acquired by, a provider without abandoning it. We cannot build a feature that depends on a provider's cooperation in a way that compromises our independence from them. These constraints are not strategic preferences. They are the requirements of operating in the category we describe.

When people ask why a provider does not just build what we are building, the answer is not that they cannot engineer it. They can engineer it. The answer is that the structure of their business prevents the result from being the same product we are building, even if the code looks identical. Independence from the entity being audited is not a feature you can ship. It is a property of who is shipping the product.

What this means for buyers

If you are evaluating an AI defense product, the first question is not what its detection technology looks like. The first question is who issues its paychecks and what conflicts that issuance creates. A defense product operated by a provider will have a different threshold for finding that the provider's product produced a harm than a defense product operated by an independent party. This is true regardless of the integrity of the people involved.

The second question is whether the audit format the product produces can be verified without trusting the product's operator. If the answer is no, you have logging, not audit. Logging is useful. Audit is what survives a regulatory or legal proceeding.

The third question is whether the product can see across the providers your workflow actually uses. A defense layer that sees only one provider leaves a gap that grows with the number of providers your team uses. The first provider is covered. The second is not.

These three questions do not require knowing anything about Vigil. They are the questions that will determine which AI defense category survives the next two years.
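For readers who want the second question made concrete, here is a minimal sketch of what "verifiable without trusting the operator" can mean. It is a generic hash-chained log, not a description of Vigil's audit format or of any provider's. The names append, verify, and GENESIS are hypothetical and chosen only for illustration. Each record commits to the hash of the record before it, so editing any past event breaks every hash after it.

```python
import hashlib
import json
from typing import Optional

# Hypothetical starting value for the chain; any fixed, agreed constant works.
GENESIS = "0" * 64


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash an event together with the hash of the record before it."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, event: dict) -> None:
    """Append an event, linking it to the current head of the chain."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify(log: list, expected_head: Optional[str] = None) -> bool:
    """Recompute every link. Optionally compare the final hash against a
    head value that the verifier obtained independently of the operator."""
    prev = GENESIS
    for record in log:
        if record["prev"] != prev:
            return False
        if record["hash"] != entry_hash(prev, record["event"]):
            return False
        prev = record["hash"]
    return expected_head is None or prev == expected_head


if __name__ == "__main__":
    log = []
    append(log, {"provider": "A", "action": "read_file"})
    append(log, {"provider": "B", "action": "send_email"})
    head = log[-1]["hash"]          # in practice, held by a party other than the operator
    print(verify(log, head))        # chain is intact
    log[0]["event"]["action"] = "noop"
    print(verify(log, head))        # tampering with a past event is detected
```

The chain alone only catches edits to individual records. An operator who controls the whole log could rebuild it from scratch, which is why the sketch takes an expected_head argument: the verifier needs a head hash that was recorded somewhere the operator cannot rewrite. That is the difference between logging and audit in miniature.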

A note for our friends at the providers

I have worked with many of you. The teams running safety functions at the major AI labs are doing serious work, often against significant institutional headwind, and most of you are right that you understand the model risks better than anyone outside your company. None of this post disputes that.

The argument is that even if you are correct, the structure does not let you be the auditor. That role has to sit outside, for the same reason auditors cannot sit inside the firms they audit, regardless of the auditor's competence. The structural separation is what makes the audit credible. Your work continues to matter inside the company. The audit layer needs to sit outside.

The earlier this is settled, the cleaner the regulatory framework will be when it arrives. The later it is settled, the more it will be settled by an incident that nobody wants.

We would all rather it be settled in advance.
