This is what governance failure looks like in the agent era: no alarms, no audit log entries, no policy violations in the traditional sense. Just a wrong number in front of an executive, and a paper trail that looks completely normal.
Governance for agent consumers is the set of policies, controls, and infrastructure that catches the failure modes specific to LLMs reading and reasoning on enterprise data. It is structurally different from human-era data governance, because agents fail differently than humans do. The governance frameworks most data orgs currently have were built for human consumers behaving in mostly-predictable ways. They are not sufficient for agent consumers.
Failure modes human-era governance doesn't catch
Speed-and-scale failures. A human asks one question at a time, gets an answer, considers it, then asks the next. An agent can issue ten thousand well-formed queries in a minute, recombining the answers into reasoning no human would have constructed. The audit log shows ten thousand legitimate queries, each individually within policy, while the inferences derived from combining them appear nowhere in the logs. Traditional governance frameworks don't surface this because their unit of governance is the query, while the unit of agent failure is the inference.
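One way to start moving the unit of observation from the query toward the inference is to roll the audit log up by agent session, so a reviewer sees what was touched in combination. This is a minimal sketch, assuming a hypothetical log format of (session_id, dataset) pairs; the function name and fields are illustrative and not any particular product's schema.

```python
from collections import defaultdict


def rollup_by_session(audit_log):
    """Group per-query audit entries into per-session summaries.

    audit_log: iterable of (session_id, dataset) tuples (hypothetical format).
    Returns {session_id: {"queries": int, "datasets": set of dataset names}}.
    """
    sessions = defaultdict(lambda: {"queries": 0, "datasets": set()})
    for session_id, dataset in audit_log:
        sessions[session_id]["queries"] += 1
        sessions[session_id]["datasets"].add(dataset)
    return dict(sessions)


# Each entry is a legitimate query on its own; the rollup shows the combination.
log = [
    ("s1", "customers"),
    ("s1", "pricing"),
    ("s1", "customers"),
    ("s2", "pricing"),
]
summary = rollup_by_session(log)
```

A rollup like this does not reveal the inference itself, only which data a session combined. That is still more than a per-query log exposes.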
Silent cross-boundary inferences. Row-level security and role-based access were designed for humans who, if they could see two pieces of data, were trusted to recombine them in sensible ways. Agents are not trusted in the same sense, and they have no intuition about which combinations would surprise a human reviewer. An agent that can see customer data and pricing data can infer customer-by-customer pricing patterns. A traditional governance system might flag this in a human-driven model, but not necessarily in an agent-driven one.
Confident hallucinations at executive scale. The agent produces a wrong number wrapped in confident language and clean formatting, then sends it through normal channels. The governance question your existing frameworks ask is "did the user follow policy?" The governance question for agents is "is the output of this system actually correct, and how would we know if it wasn't?"
What agent-era governance has to add
Here are the things that have to exist as designed infrastructure:
Eval frameworks running continuously. The agents producing executive-facing output need to be evaluated against ground truth on an ongoing basis, not just at deployment, because drift is real. Behavior changes as the underlying model is updated, as the data shifts, as new edge cases get hit. A governance posture that is validated only at launch is a governance posture that decays silently.
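As a rough illustration of what continuous evaluation means in practice, here is a minimal sketch, assuming numeric answers, a hand-maintained golden set of question and expected-value pairs, and a stubbed agent. The names `run_eval` and `has_drifted`, the tolerance, and the drop threshold are all hypothetical choices made for the example.

```python
def run_eval(agent_fn, golden_set, tolerance=0.01):
    """Score an agent against known answers.

    Returns (accuracy, failures), where failures lists
    (question, expected, actual) for every miss.
    """
    if not golden_set:
        raise ValueError("golden_set must not be empty")
    failures = []
    for question, expected in golden_set:
        actual = agent_fn(question)
        # Relative tolerance so large and small figures are judged alike.
        if abs(actual - expected) > tolerance * max(abs(expected), 1e-9):
            failures.append((question, expected, actual))
    accuracy = 1 - len(failures) / len(golden_set)
    return accuracy, failures


def has_drifted(accuracy, baseline, max_drop=0.05):
    """True when accuracy has fallen more than max_drop below the baseline."""
    return baseline - accuracy > max_drop


# Stub standing in for a real agent call (an assumption for this example).
canned = {"q1": 100.0, "q2": 200.5, "q3": 50.0, "q4": 12.0}
golden = [("q1", 100.0), ("q2", 200.0), ("q3", 50.0), ("q4", 10.0)]

accuracy, failures = run_eval(canned.get, golden)
drifted = has_drifted(accuracy, baseline=1.0)
```

The important part is the schedule rather than the scoring: the same check would run on a timer and after every model or data change, with `has_drifted` wired to an alert.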
Loud failure modes by design. Most current agent deployments fail silently. They produce a wrong answer, format it cleanly, and pass it along. Agent-era governance has to design the opposite. The agent's uncertainty has to be exposed and confidence has to be calibrated. The system has to be allowed (and required) to say "I'm not confident in this answer" in a way that downstream humans can act on.
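A loud failure mode can be as simple as a gate between the agent and the recipient. The sketch below assumes the agent returns a confidence score that has already been calibrated, which is a substantial assumption in its own right. The `AgentAnswer` type, the `deliver` function, and the 0.8 threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AgentAnswer:
    value: float
    confidence: float  # assumed to be a calibrated probability in [0, 1]


def deliver(answer, min_confidence=0.8):
    """Attach an explicit status so low-confidence answers cannot pass quietly."""
    if answer.confidence < min_confidence:
        return {
            "status": "needs_review",
            "value": answer.value,
            "note": f"Low confidence ({answer.confidence:.2f}); route to a human.",
        }
    return {"status": "delivered", "value": answer.value, "note": ""}


shaky = deliver(AgentAnswer(value=4.2, confidence=0.55))
solid = deliver(AgentAnswer(value=4.2, confidence=0.93))
```

The design choice here is that the status travels with the number, so a downstream dashboard or email template can refuse to render a `needs_review` value as if it were settled.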
Boundary policies for inference, not just access. The question "what can this agent see?" is an old-governance question. The new question is "what conclusions can this agent draw, and which ones should be flagged for human review?" That's a different policy surface.
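To make that policy surface concrete, here is a minimal sketch of a rule that looks at which data domains an answer drew on rather than which ones the agent was allowed to read. It assumes you can already track the domains behind each answer, which requires lineage tooling this example does not provide. `REVIEW_COMBINATIONS` and `check_inference` are hypothetical names.

```python
# Hypothetical policy: combinations of data domains whose joint use needs review.
REVIEW_COMBINATIONS = [
    frozenset({"customers", "pricing"}),
]


def check_inference(domains_used):
    """Return "review" if the answer draws on a flagged combination, else "allow"."""
    used = set(domains_used)
    for combo in REVIEW_COMBINATIONS:
        if combo <= used:  # every domain in the flagged combination was used
            return "review"
    return "allow"


combined = check_inference(["customers", "pricing", "calendar"])
single = check_inference(["pricing"])
```

Access control would have allowed both calls, since the agent is permitted to read each domain. Only the combination rule separates them.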
Why this is upstream of compliance, not downstream
The NIST AI Risk Management Framework makes measurement (testing, evaluation, verification, and validation) one of its four core functions alongside governance, rather than an audit afterthought. The compliance documentation is built on top of the eval and observation systems, not in place of them.
Data leaders who treat agent governance as a compliance task end up writing policies that no infrastructure supports. The policies look good on paper, fail to detect anything in practice, and produce the worst of both outcomes: regulatory exposure plus operational failure.
The leaders who treat agent governance as infrastructure first, and compliance second, end up with systems that actually catch the failures.
If this resonates, subscribe. And forward it to your governance and compliance partners, who probably need to be part of this conversation but often aren't yet.
— Kyle
