Your AI Doesn't Know What Revenue Means

A governed business glossary isn't a documentation project. It's the only thing stopping your AI from reporting two different numbers with equal confidence.

Two teams at the same company run the same AI-assisted revenue report. Finance pulls one number. Sales pulls another. Both outputs look clean. Neither model flagged an error, because neither model made one. Each inferred a locally valid definition of "revenue" from the data it was given. The conflict isn't inside either model. It's between them, and it shows up in the board deck.

That's not a data pipeline failure. That's a definitional conflict your AI inherited and had no way to resolve.

The problem isn't bad data, it's contested meaning

Most analytics leaders, when AI outputs contradict each other, go looking for the broken pipeline. They audit the ETL jobs, check the joins, review the transformation logic. The infrastructure is usually fine. What's broken is upstream of all of it: two source systems using the same word to mean different things, and no governance layer telling the model which definition applies when outputs need to be compared.

Large language models are genuinely good at inferring meaning from context within a document. Feed one a messy contract and it will parse "customer" correctly from the surrounding language. That capability is real and it's worth what vendors charge for it. The problem is that enterprise AI doesn't operate within documents. It operates across departments, source systems, and reporting layers. Contextual inference works inside a boundary. It doesn't work across one.

Why the counterargument sounds right

The reasonable objection here is that modern AI, specifically embedding models and LLMs with retrieval, was built to handle semantic variation. Requiring a governed glossary before deploying AI sounds like asking a GPS to wait for a paper map. The objection has genuine force. These systems do handle ambiguity better than any prior generation of analytics tooling.

What they don't do is tell you when two outputs measuring "the same thing" are measuring different things. The model that inferred "revenue" as gross bookings and the model that inferred it as recognized revenue both produced coherent outputs. The failure is invisible until someone puts those outputs side by side. At that point, no model can fix it, because the problem isn't in the models.

What a governed glossary actually does

A business glossary, when it's built and enforced, doesn't teach your AI what words mean. It tells two AI systems operating on different data sources that they are measuring the same construct, so their outputs are comparable. That's a governance function, not a language function. No amount of model sophistication substitutes for it.

The build plan that gets used rather than filed has four features. Start with the ten terms that cause the most cross-functional friction — not the most terms, the most contested ones. Get a named owner for each definition, not a team. Tie the glossary to the systems that produce the data, not to a wiki page that lives next to the data dictionary. And version the definitions: when "active customer" changes meaning because the business model changed, the models trained before that change need to be retrained or their outputs need to be scoped.

I've watched organizations spend eight months building glossaries in Confluence that nobody opened after launch. The glossary wasn't wrong. It was just disconnected from the tools analysts actually used to query data. A definition that lives in Confluence while the model lives in Databricks isn't a governance layer. It's a document.

The failure mode you're not measuring

The research synthesis this piece draws on positions semantic inconsistency as the root failure mode in AI-assisted reporting, not data volume, not model selection. [Inference] The organizations most exposed to this failure are the ones where AI deployment moved faster than vocabulary governance, which is most of them. The contested definitions were always there. AI made them consequential at scale.

Your AI will report whatever the data tells it to report. If the data encodes a definitional conflict, the output encodes it too, and it does so confidently, cleanly, and without a warning label.