Beyond the Pilot: The Data Architecture of Agentic Government
Why “Data Readiness” is the actual bottleneck for AI at scale, and how to build a federated foundation that trusts but verifies.
I. The Pivot from “Chat” to “Do”
I have a confession: I am a conference survivor.
I’ve lived through the great Cloud migration, the Agile awakening, and the DevOps world tours. For fifteen years, I’ve had the privilege of standing on stages discussing these “new” frontiers. And while there were moments where it felt like the record was skipping, where we were debating the same points for the third decade in a row, I look back with an incredible amount of gratitude. Those cycles weren’t just noise; they were the hard, necessary work of leveling up the federal government’s digital backbone. We had to learn how to move fast (Agile) and where to live (Cloud) before we could ever hope to do what comes next.
And what comes next is finally here.
Now, we have a “new” old topic: AI. It’s easy to joke that every product now has an “AI inside” sticker (I’m still waiting for an AI-enabled office chair), but the truth is, this isn’t a flash in the pan. We’ve been in the trenches with this since at least 2017. For the last few years, we’ve been captivated by GenAI. I’m actually on three panels this week alone and I couldn’t be more energized. We are finally moving past the theory and into the “how.”
Over the past few years, federal agencies have enjoyed the “Summer of LLM Pilots.” It was a season of exploration: try two different LLMs, see what sticks, and find a use case. We successfully proved that AI can summarize a memo and draft a decent email (which, let’s be honest, has saved us all from a lot of Friday afternoon writer’s block).
But the honeymoon phase is winding down. We are moving from a world where AI simply talks to us, to a world where AI works for us. We haven’t yet fully prepared our data to let an agent autonomously execute a procurement action or adjudicate a benefit. Moving to Agentic AI, systems that actually execute tasks rather than just talking about them, isn’t a “model” problem; it’s a metadata and mandate problem.
If your data isn’t machine-actionable, your agents are just high-priced chatbots. Welcome to 2026: The Year of the Agent. This isn’t just another pilot season; it’s the season of implementation. And I, for one, am glad I’ve got the mileage from the last 15 years to help navigate it.
II. Redefining “Data Readiness” for the Agentic Era
In the pilot phase, “clean” data was the gold standard. For agents, we require Contextual Readiness. When I was releasing the first chatbot solution for my agency, I had the cyber team use their tools to “lock down” the environment—restricting files, folders, and drives to prevent data leakage between offices. It was a success; nothing spilled.
But then we hit the “Dark Data” PDF Trap. The chatbot wasn’t leaking information, but it was pulling “ghosts” from the past: decades of conflicting policy hidden in unstructured formats. When you haven’t properly labeled, archived, or contextualized your data, your agent can’t distinguish between a 1994 memo and a 2024 regulation. This creates reasoning conflicts that paralyze an agent.
This is the reality check for every CIO, CAIO, and CDO: You cannot scale what you cannot map. If an agent doesn’t know the “intent” or the “freshness” of a data point, it cannot be trusted to take an action.
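To make “intent” and “freshness” concrete, here is a minimal sketch in Python of the kind of metadata an agent would need before acting on a document. The record fields, status values, and document IDs are all hypothetical, chosen only to mirror the 1994 memo versus 2024 regulation example; a real agency catalog would define its own schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DocumentRecord:
    """Hypothetical metadata an agent checks before trusting a document."""
    doc_id: str
    title: str
    effective_date: date
    status: str                          # assumed values: "active", "superseded", "draft"
    superseded_by: Optional[str] = None  # doc_id of the replacement, if any


def actionable_records(records, as_of):
    """Keep only records that are active and already in effect on `as_of`."""
    return [
        r for r in records
        if r.status == "active" and r.effective_date <= as_of
    ]


# Illustrative data only; none of these documents are real.
records = [
    DocumentRecord("memo-1994", "Travel policy memo", date(1994, 3, 1),
                   "superseded", superseded_by="reg-2024"),
    DocumentRecord("reg-2024", "Travel regulation", date(2024, 1, 15), "active"),
    DocumentRecord("draft-2025", "Proposed travel update", date(2025, 6, 1), "draft"),
]

current = actionable_records(records, as_of=date(2026, 1, 1))
print([r.doc_id for r in current])  # ['reg-2024']
```

The point of the sketch is that the filter is trivial once the labels exist. The hard part is the labeling itself, which is why unmapped data cannot be scaled.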
III. Architecture: The False Choice Between Centralized and Federated
We are often told we have to choose: Centralization (which creates massive bureaucratic bottlenecks) or pure Federation (which creates “data wild-westers” and security nightmares).
You don’t have to choose. My recommendation is “Federated Data, Centralized Governance.” This is where the Data Mesh approach comes in. In a Data Mesh, we stop trying to shove everything into one giant “data lake” that inevitably becomes a “data swamp.” Instead, we treat data as a product. The domain experts, the people in HR, Finance, or Program Offices, remain the owners of their data. They know it best.
Governance, however, becomes the Centralized Control Plane. We centralize identity, policy-as-code, and security guardrails. We don’t move the data to the AI; we move the AI’s “view” to the data using the Model Context Protocol (MCP) and virtualization. This allows agents to query live, authoritative data at the source without creating redundant (and risky) copies.
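As a rough illustration of “Federated Data, Centralized Governance,” the sketch below keeps each data product with its domain owner and routes every agent request through one central policy check. It is plain Python, not an MCP implementation. The roles, purposes, policy table, and data values are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical request an agent submits to the control plane."""
    agent_id: str
    agent_role: str
    domain: str
    purpose: str


# Central policy expressed as data: (role, domain) -> permitted purposes.
# Every entry here is illustrative.
CENTRAL_POLICY = {
    ("procurement-agent", "finance"): {"validate_invoice", "check_budget"},
    ("benefits-agent", "hr"): {"verify_employment"},
}

# Domain-owned data products. Each owner exposes a read function,
# so the data is queried where it lives rather than copied.
DATA_PRODUCTS = {
    "finance": lambda: {"budget_remaining": 125000},
    "hr": lambda: {"employment_verified": True},
}


def is_allowed(request):
    """Central check: is this purpose permitted for this role and domain?"""
    allowed = CENTRAL_POLICY.get((request.agent_role, request.domain), set())
    return request.purpose in allowed


def query_at_source(request):
    """Apply the central policy, then read from the owning domain."""
    if not is_allowed(request):
        raise PermissionError(
            f"{request.agent_role} may not use {request.domain} "
            f"for {request.purpose}"
        )
    return DATA_PRODUCTS[request.domain]()


request = AccessRequest("agent-7", "procurement-agent", "finance", "check_budget")
print(query_at_source(request))  # {'budget_remaining': 125000}
```

Because the policy is data rather than scattered code, the central team can review and version it in one place while the domains keep ownership of what they publish.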
IV. The Trust Stack: Lineage, Provenance, and Accountability
In government, an agent’s recommendation isn’t just an output; it’s a potential legal liability. If an agent takes an action, we must be able to audit it with the same rigor we use for human federal employees.
I call this the Traceability Trio:
Provenance: Where did the data start? Is this a certified, authoritative record or a draft from a SharePoint folder?
Lineage: How did the agent transform or interpret that data? What steps were taken between the query and the conclusion?
Immutable Audit: We need a non-erasable “black box” for agentic decision-making.
The governance mantra for the next decade is simple: “If it isn’t logged, it didn’t happen. If it isn’t traceable, it isn’t trustable.”
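One common way to approximate a “black box” is a hash-chained log, where each entry commits to the one before it. The sketch below is a minimal in-memory version under that assumption. It is tamper-evident rather than truly immutable, and the class name, field names, and sample entries are hypothetical. A production system would also need write-once storage and protected keys.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(body):
    """Hash an entry body deterministically (sorted keys, UTF-8 JSON)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class AuditLog:
    """Append-only log; each entry records provenance, lineage, and a chain hash."""

    def __init__(self):
        self._entries = []

    def append(self, agent_id, action, source_ids, steps):
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        body = {
            "agent_id": agent_id,
            "action": action,
            "provenance": source_ids,  # where the data came from
            "lineage": steps,          # what the agent did with it
            "prev_hash": prev_hash,
        }
        entry = dict(body, hash=_entry_hash(body))
        self._entries.append(entry)
        return entry

    def verify(self):
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev_hash or _entry_hash(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append("agent-7", "flag_invoice", ["reg-2024"], ["retrieved", "compared totals"])
log.append("agent-7", "notify_officer", ["reg-2024"], ["drafted notice"])
print(log.verify())  # True
```

If anyone later edits an earlier entry, `verify()` returns `False`, which is the property an auditor cares about.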
V. Collaboration: Solving the “Semantic Dissonance”
The biggest hurdle to cross-agency AI isn’t the tech; it’s Semantic Dissonance. This is the multi-domain challenge: cross-agency AI fails when Agency A’s definition of a “target” is a person but Agency B’s definition is a geographic location. If two agents from different domains try to collaborate without a shared understanding, the results are catastrophic.
The fix isn’t sharing every database; it’s building shared ontologies. Agencies don’t need to share everything; they just need to agree on what the “things” are called and how they relate to one another. We need to build a common language for agents to speak before we ask them to work together.
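A shared ontology can start very small. The sketch below, with made-up agency names and concepts, maps each agency’s local vocabulary onto shared concepts so that two agents can detect a mismatch on the word “target” before they act on it. Real ontologies carry relationships and constraints as well; this shows only the naming layer.

```python
# Shared concepts that participating agencies agree on (illustrative only).
SHARED_ONTOLOGY = {
    "Person": "An individual human being",
    "GeoLocation": "A point or area on the Earth's surface",
}

# Each agency maps its own local terms to a shared concept.
# Agency names and mappings are hypothetical.
AGENCY_TERMS = {
    "agency_a": {"target": "Person"},
    "agency_b": {"target": "GeoLocation"},
}


def resolve(agency, term):
    """Translate an agency's local term into a shared concept."""
    concept = AGENCY_TERMS.get(agency, {}).get(term)
    if concept not in SHARED_ONTOLOGY:
        raise KeyError(f"{agency} has no shared mapping for {term!r}")
    return concept


def same_meaning(agency_x, agency_y, term):
    """True only if both agencies map the term to the same shared concept."""
    return resolve(agency_x, term) == resolve(agency_y, term)


print(resolve("agency_a", "target"))                   # Person
print(resolve("agency_b", "target"))                   # GeoLocation
print(same_meaning("agency_a", "agency_b", "target"))  # False
```

Neither agency shares a single record here. They share only the mapping, which is enough for an agent to stop and ask rather than assume.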
VI. Conclusion: Measuring What Matters (Beyond ROI)
Metrics matter. They are the only way we decide what is worthy of our limited budget.
When I first started measuring ROI for our “chatbot army,” I focused on time and efficiency—how many minutes were saved per task. It was helpful, and the significant time savings increased adoption. But as we move to agents, we need to shift our gaze.
We need to move from “Cost Savings” to Mission Velocity and Intervention Rates.
Mission Velocity: How much faster are we delivering services to the citizen?
Intervention Rate: How often does a human have to step in and correct the agent?
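Both metrics can be computed from simple counts. The formulas below are one reasonable way to define them, not a standard: intervention rate as corrected actions divided by total actions, and mission velocity as the fractional reduction in delivery time against a baseline. The sample numbers are invented for illustration.

```python
def intervention_rate(total_actions, human_interventions):
    """Share of agent actions that a human had to step in and correct."""
    if total_actions <= 0:
        raise ValueError("total_actions must be positive")
    return human_interventions / total_actions


def mission_velocity_gain(baseline_days, current_days):
    """Fractional reduction in time to deliver a service versus the baseline."""
    if baseline_days <= 0:
        raise ValueError("baseline_days must be positive")
    return (baseline_days - current_days) / baseline_days


# Illustrative numbers only.
rate = intervention_rate(total_actions=200, human_interventions=30)
gain = mission_velocity_gain(baseline_days=20, current_days=12)

print(f"Intervention rate: {rate:.0%}")      # Intervention rate: 15%
print(f"Mission velocity gain: {gain:.0%}")  # Mission velocity gain: 40%
```

Tracked per workflow over time, the first number shows where agents still need supervision and the second shows whether the citizen actually feels the difference.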
As the intervention rate drops, your scale increases. Agentic AI is the most significant shift in federal IT since the cloud. It requires us to stop being “data librarians” and start being architects of autonomy.


