
The Data Architecture Brief: Three Questions That Separate AI Projects That Ship From AI Projects That Die

Most enterprise AI projects do not fail at the model. They fail at three questions that should have been answered before the first sprint: is the data actually ready, who owns and governs access, and how will you know it works? This is the brief we run before writing a line of code.

DSE-Experts
Operator-led practice
May 27, 2026

Executive Summary

Most enterprise AI projects do not die because the model was wrong. They die because three questions were never answered before the build started: Is the data semantically ready, or just present? Who owns governance, and how is access actually enforced? What does “done” mean, and how will you measure it in production? This is the architecture brief we run at the front of every engagement. Answer all three with evidence and you have a project that ships. Skip any one and you have a pilot that quietly stalls in month four.


The Failure Pattern Nobody Names

Walk into a stalled enterprise AI project and the post-mortem almost always points at the model. Wrong embedding choice. Bad prompt. The vendor oversold the benchmark. These are the symptoms everyone is comfortable discussing because they are technical, bounded, and someone else’s fault.

The real cause is upstream. The project was greenlit before anyone could answer three questions in writing. The team started building because building feels like progress, and the questions were treated as things to “figure out as we go.”

You do not figure them out as you go. You discover, in month four, that the data was never structured to answer the question being asked, nobody can approve who sees what, and there was never an agreed definition of working. By then the budget is half gone and the demo still only works on the three documents from the original notebook.

This brief is the antidote. It is three questions. We do not start an engagement until all three have evidence-backed answers, not opinions. Run it on your own roadmap before your next AI investment.

Question 1: Is the Data Ready, or Just Present?

There is a difference between data existing and data being usable for the task. Presence is a database with rows in it. Readiness is whether those rows carry the meaning your AI system needs to reason correctly.

The trap is that presence is easy to demonstrate and readiness is not. A stakeholder shows you a 40-table warehouse and says “the data is all there.” It is there. It is also undocumented, full of overloaded columns where status means six different things depending on the source system, and joined on keys that silently changed format in 2023.

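What to actually interrogate

Three things, each judged against the specific task rather than the warehouse as a whole: semantic integrity (does each column mean one thing, and is that meaning written down?), provenance (which source system produced each record, and did its keys or formats change along the way?), and coverage (does the data contain the cases the system will actually be asked about?).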

The honest answer to Question 1 is frequently: “The data is present and the data is not ready.” That is a finding, not a failure. It tells you the first sprint is data engineering, not model selection, and it saves you from building a system that confidently produces wrong answers because the data underneath it was never fit for the task.
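The two warehouse problems described above, join keys that changed format and a status column whose values differ by source system, can be surfaced with a few lines of profiling. The following is a minimal sketch, not a full readiness audit: the row layout, the field names (`source`, `order_key`, `status`), and the sample values are all hypothetical, and an empty findings list does not prove the data is ready.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample rows; field names and values are illustrative only.
SAMPLE_ROWS = [
    {"source": "crm", "order_key": "ORD-000123", "status": "open"},
    {"source": "crm", "order_key": "ORD-000124", "status": "closed"},
    {"source": "billing", "order_key": "123", "status": "paid"},
    {"source": "billing", "order_key": "ORD-000125", "status": "open"},
    {"source": "support", "order_key": "ORD-000126", "status": "closed"},
]


def key_shape(value):
    """Reduce a key to its shape: letters become 'A', digits become '9'."""
    shape = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", shape)


def readiness_report(rows, key_field="order_key", status_field="status",
                     source_field="source"):
    """Profile key formats and per-source status vocabularies."""
    # Count how many distinct key shapes exist; more than one suggests
    # the key format changed somewhere upstream.
    shapes = Counter(key_shape(row[key_field]) for row in rows)

    # Collect the set of status values each source system actually uses.
    vocab = defaultdict(set)
    for row in rows:
        vocab[row[source_field]].add(row[status_field])
    status_by_source = {src: sorted(vals) for src, vals in vocab.items()}

    findings = []
    if len(shapes) > 1:
        findings.append(f"{key_field} appears in {len(shapes)} formats")
    if len({tuple(vals) for vals in status_by_source.values()}) > 1:
        findings.append(f"{status_field} values differ across source systems")

    return {
        "key_shapes": dict(shapes),
        "status_by_source": status_by_source,
        "findings": findings,
    }


report = readiness_report(SAMPLE_ROWS)
for finding in report["findings"]:
    print(finding)
```

A report like this is evidence you can attach to the brief. It replaces “the data is all there” with a list of specific things that must be fixed or documented first.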

Question 2: Who Owns Governance, and How Is Access Enforced?

The second question kills more projects in regulated and federal environments than any technical constraint. It is deceptively simple: who is allowed to see what the system produces, who approves that, and how is it enforced in the architecture rather than in a policy document?

Teams routinely defer this. They build against a copy of production data in a dev account, get a great demo, and then discover that the data they used can never legally flow through the system they designed. The retrieval layer has no concept of row-level permissions. The model can surface a document to a user who was never cleared for it. There is no audit trail showing who asked what.

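What to actually interrogate

Three things: a named owner who approves who sees what, enforcement built into the retrieval or API layer rather than written in a policy document, and an audit trail design that records who asked what.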

Governance is an architecture decision, not a compliance checkbox bolted on before launch. If access control is an afterthought, it becomes the thing that prevents production deployment entirely.
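To make "enforced in the architecture" concrete, here is a minimal sketch of a permission filter that sits between retrieval and the model and writes an audit record for every request. Everything named here (`Document`, `AuditLog`, `retrieve_for_user`, the group labels) is hypothetical and stands in for whatever identity and storage systems you actually run. The candidate list is assumed to come from an existing search step.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups cleared to see this document


@dataclass
class AuditLog:
    records: list = field(default_factory=list)

    def record(self, user_id, query, returned, withheld):
        # In a real system this would go to append-only storage.
        self.records.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "query": query,
            "returned": returned,
            "withheld": withheld,
        })


def retrieve_for_user(user_id, user_groups, query, candidates, audit):
    """Return only the candidates this user may see, and log the decision."""
    groups = frozenset(user_groups)
    visible, withheld = [], []
    for doc in candidates:
        # Default deny: a document with no allowed groups is never returned.
        if doc.allowed_groups & groups:
            visible.append(doc)
        else:
            withheld.append(doc.doc_id)
    audit.record(user_id, query, [d.doc_id for d in visible], withheld)
    return visible


docs = [
    Document("a", "Public handbook", frozenset({"staff"})),
    Document("b", "Restricted memo", frozenset({"cleared"})),
]
audit = AuditLog()
visible = retrieve_for_user("u1", {"staff"}, "handbook", docs, audit)
print([d.doc_id for d in visible])
```

The design choice that matters is where the filter runs. Because withheld documents are dropped before anything is passed to the model, the model cannot surface them, and the audit record exists whether or not the request returned anything.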

Question 3: What Does “Done” Mean, and How Will You Know It Works?

The third question is the one most teams cannot answer, and a missing answer is the quietest killer. Without one, quality regresses silently. A prompt change improves one case and breaks four others, and nobody notices until a user complains in production.

“Done” is not “the demo worked.” Done is a measurable definition of acceptable behavior, plus a mechanism that tells you whether you still meet it after every change.

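What to actually interrogate

Four things: written correctness criteria, an eval harness that checks them, a regression gate that runs after every change, and production observability so that a drop in quality is seen before a user reports it.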

The discipline here is borrowed from software engineering and applied to a probabilistic system: you do not get to call it working until you can prove it is working and prove it stays working.
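One way to make "prove it stays working" mechanical is a small eval harness with a regression gate. The sketch below is an illustration under stated assumptions: the cases, the canned answers standing in for the system before and after a prompt change, and the `min_pass_rate` threshold are all hypothetical. Real checks for a probabilistic system are usually richer than substring matches.

```python
def run_evals(system, cases):
    """Run every case through the system and record pass/fail per case id."""
    return {case_id: bool(check(system(prompt)))
            for case_id, prompt, check in cases}


def regression_gate(baseline, current, min_pass_rate):
    """Fail if any previously passing case now fails, or the pass rate is low."""
    regressions = sorted(
        case_id for case_id, ok in baseline.items()
        if ok and not current.get(case_id, False)
    )
    pass_rate = sum(current.values()) / len(current) if current else 0.0
    return {
        "pass_rate": pass_rate,
        "regressions": regressions,
        "passed": not regressions and pass_rate >= min_pass_rate,
    }


# Hypothetical cases: (case id, prompt, check on the output).
CASES = [
    ("refund_policy", "refund", lambda out: "30 days" in out),
    ("shipping", "shipping", lambda out: "5 days" in out),
    ("warranty", "warranty", lambda out: "1 year" in out),
]

# Canned answers standing in for the system before and after a prompt change.
OLD_ANSWERS = {"refund": "Refunds within 14 days",
               "shipping": "Ships in 5 days",
               "warranty": "1 year warranty"}
NEW_ANSWERS = {"refund": "Refunds within 30 days",
               "shipping": "Ships soon",
               "warranty": "1 year warranty"}

baseline = run_evals(OLD_ANSWERS.__getitem__, CASES)
current = run_evals(NEW_ANSWERS.__getitem__, CASES)
result = regression_gate(baseline, current, min_pass_rate=0.6)
print(result)
```

In this example the change fixes one case and breaks another, so the overall pass rate is the same before and after. The gate still fails because it compares case by case, which is what catches the silent regression an aggregate score would hide.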

The Brief as a One-Page Gate

Run the brief as a literal gate before funding the build. Each question gets an evidence-backed answer, not an aspiration.

| Question | Evidence required to pass | Common failing answer |
| --- | --- | --- |
| 1. Data readiness | Documented semantic integrity, provenance, and coverage for the specific task | “The data is all there” |
| 2. Governance & access | Named owner, enforcement at retrieval/API layer, audit trail design | “Only the right people have access” |
| 3. Definition of done | Written correctness criteria, eval harness, regression gate, production observability | “The demo worked great” |

If any row is a failing answer, the project is not ready to build. The first work item is to convert that failing answer into a passing one — which is real, fundable, scoped work, and far cheaper than discovering the gap in month four.
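If you want the gate to live next to the roadmap rather than in a slide, it can be kept as a small record per question. This is a minimal sketch with hypothetical names (`GateRow`, `gate_decision`) and made-up evidence references. It treats a row as passing only when at least one piece of evidence is attached, which is an assumption about how you would choose to score it.

```python
from dataclasses import dataclass, field


@dataclass
class GateRow:
    question: str
    evidence: list = field(default_factory=list)  # document links or references


def gate_decision(rows):
    """A project is ready to build only if every row has attached evidence."""
    missing = [row.question for row in rows if not row.evidence]
    return {"ready_to_build": not missing, "first_work_items": missing}


rows = [
    GateRow("Data readiness", ["profiling-report.md"]),  # hypothetical reference
    GateRow("Governance & access"),                      # no evidence yet
    GateRow("Definition of done", ["eval-criteria.md"]),  # hypothetical reference
]
decision = gate_decision(rows)
print(decision)
```

The output names the failing rows directly, so the list of first work items falls out of the gate itself.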

What This Means For You

The three questions are not gatekeeping for its own sake. They are the difference between an AI investment that compounds and one that becomes a line item your CFO asks about next budget cycle.

Before your next AI project gets a sprint plan, get the brief answered in writing. If the answers are honest and uncomfortable, you have just saved yourself a quarter of wasted effort. If the answers come back clean and evidence-backed, build with confidence: you are in the minority that will actually ship.

The model was never the hard part. The brief is.


This brief reflects the assessment framework our team runs at the front of enterprise and regulated-industry AI engagements. It is published as a reference for data and technology leaders evaluating AI investments.


Founder · Principal Engineer
Data & AI engineer · 10+ yrs hands-on

Writes most of the long-form here. Lives in the codebase. Active on GitHub and LinkedIn.
