Guide
Semantic Drift Detection: finding the meaning conflicts inside your enterprise
Most enterprises don't have a data problem. They have a meaning problem. The same word — customer, claim, incident, coverage — appears in dozens of policies, glossaries, APIs and AI prompts, and each one carries a slightly different interpretation. That gap is semantic drift, and it is what causes two correct systems to produce two different answers.
What semantic drift actually is
Semantic drift is the gradual or sudden divergence between the intended meaning of a business term and the meanings that are actually used in the systems, documents and decisions of the organization. It is invisible at the data layer — every value is valid, every API returns 200 OK — but it surfaces as inconsistent decisions, audit findings, and unreliable AI behavior.
A regulator asking "what does vulnerable customer mean in this report?" expects one answer. If three departments produced the report and each used a slightly different definition, the answer cannot be given without a reconciliation exercise that takes weeks. That reconciliation cost is the price of unmanaged drift.
Why traditional tools miss it
- Data catalogs describe columns and lineage. They assume the column's meaning is given. They cannot compare the natural-language definitions attached to two different sources.
- Wikis such as Confluence let each author write their own page for a term. They have no concept of canonical resolution, no conflict graph, and no versioning of meaning.
- RAG and vector search retrieve whatever exists, including contradictory definitions. The model then blends them into a single answer, which is worse than picking one.
- Model governance suites govern the model. They do not govern the meaning the model acts on.
How WikiSure detects drift
- Ingest the corpus — glossaries, policies, contracts, API specs, internal wikis, AI prompts.
- Extract candidate concepts and their definitions from each source, preserving provenance.
- Compare definitions of the same concept across sources, scoring alignment and classifying the type of disagreement.
- Resolve each conflict against the canonical entry in the registry, and emit a structured drift finding with severity and recommended action.
- Record the governance decision so the same drift never has to be resolved twice — see organizational learning.
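The compare and resolve steps above can be sketched in a few lines of Python. This is a minimal illustration, not WikiSure's implementation: the `Definition` class, the `detect_drift` function, the 0.5 threshold, and the severity rule are all hypothetical, and word-set overlap (Jaccard similarity) stands in for whatever semantic scoring a real system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    concept: str  # the business term, e.g. "customer"
    source: str   # provenance: where this definition was found
    text: str     # the natural-language definition


def tokens(text: str) -> set[str]:
    """Lower-cased word set; a crude stand-in for real semantic analysis."""
    return {w.strip(".,;:()").lower() for w in text.split()} - {""}


def alignment(a: str, b: str) -> float:
    """Jaccard overlap of word sets: 0.0 (disjoint) to 1.0 (identical)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def detect_drift(canonical: dict[str, str],
                 found: list[Definition],
                 threshold: float = 0.5) -> list[dict]:
    """Compare each sourced definition with the canonical entry.

    The threshold and the severity rule are illustrative assumptions.
    """
    findings = []
    for d in found:
        if d.concept not in canonical:
            continue  # nothing canonical to resolve against
        score = alignment(canonical[d.concept], d.text)
        if score < threshold:
            findings.append({
                "concept": d.concept,
                "source": d.source,
                "alignment": round(score, 2),
                "severity": "high" if score < threshold / 2 else "medium",
            })
    return findings


# Hypothetical registry entry and two sourced definitions.
canonical = {"customer": "a person or organisation holding an active policy"}
found = [
    Definition("customer", "crm-glossary",
               "A person or organisation holding an active policy."),
    Definition("customer", "marketing-wiki",
               "anyone who has requested a quote"),
]
findings = detect_drift(canonical, found)
```

With this sample data, only the `marketing-wiki` definition is flagged. A lexical score like this would miss paraphrases and synonyms, so it should be read as a placeholder for the comparison step rather than a recommendation.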
What drift detection unlocks
Once drift is visible, three things change. Auditors get answers in hours instead of weeks. AI agents resolve every critical term to the same governed meaning, so their decisions become reproducible. And the organization stops paying the silent reconciliation tax every time a new project starts.
See it on real data
WikiSure has imported 120 insurance concepts from IAIS, IFRS 17 and Solvency II sources and detected drift against the canonical registry. See the validation snapshot, or browse the public definitions registry.
Frequently asked
What is semantic drift detection?
Semantic drift detection is the practice of automatically finding cases where the same business term is defined or used differently across documents, systems, teams, or AI agents — even when no software is broken. It surfaces the underlying meaning conflicts that cause inconsistent decisions, failed audits and unreliable AI output.
How is semantic drift different from data drift?
Data drift describes statistical change in the values flowing through a model. Semantic drift describes change or disagreement in the meaning of a term. Two systems can be perfectly aligned on data quality and still produce contradictory results because they interpret 'customer', 'claim' or 'incident' differently.
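A small Python example may make the distinction concrete. The records, field names, and the two team definitions below are invented for illustration. Both functions read the same valid data, yet they disagree on how many customers there are.

```python
# Identical, valid records shared by both systems (hypothetical fields).
records = [
    {"id": 1, "has_active_policy": True, "requested_quote": True},
    {"id": 2, "has_active_policy": False, "requested_quote": True},
    {"id": 3, "has_active_policy": False, "requested_quote": False},
]


def is_customer_billing(record: dict) -> bool:
    """Billing's reading: a customer holds an active policy."""
    return record["has_active_policy"]


def is_customer_marketing(record: dict) -> bool:
    """Marketing's reading: a quote request is enough."""
    return record["has_active_policy"] or record["requested_quote"]


billing_count = sum(is_customer_billing(r) for r in records)
marketing_count = sum(is_customer_marketing(r) for r in records)

print(f"billing: {billing_count}, marketing: {marketing_count}")
# prints "billing: 1, marketing: 2"
```

Nothing in the data changed between the two counts, so a data drift monitor would have nothing to report. The disagreement lives entirely in the definition of the term.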
Where does semantic drift typically come from?
Typical sources are mergers, parallel projects, copy-pasted definitions, AI-generated documentation, and undocumented business rules. Every time a different team re-explains a term, the meaning forks a little, and over years the forks accumulate as semantic debt.
Why can't a data catalog detect semantic drift?
Data catalogs document columns, lineage and ownership. They do not compare the natural-language meaning of a term across sources and they do not resolve which interpretation is canonical. Drift detection requires a semantic governance layer above the catalog.
What does WikiSure return when drift is found?
Every drift event is captured as a structured record with the conflicting interpretations, the canonical meaning, the affected sources, a severity rating, and a recommended governance action. The full history is queryable for audit.
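As a rough picture of such a record, the sketch below models the fields listed above as a Python dataclass and serializes one instance to JSON. The class name, field names, and sample values are assumptions made for illustration; they are not WikiSure's actual schema or output.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DriftFinding:
    # Field names are hypothetical; they mirror the elements named in the text.
    concept: str
    canonical_meaning: str
    conflicting_interpretations: list[str]
    affected_sources: list[str]
    severity: str
    recommended_action: str


finding = DriftFinding(
    concept="customer",
    canonical_meaning="a person or organisation holding an active policy",
    conflicting_interpretations=["anyone who has requested a quote"],
    affected_sources=["marketing-wiki"],
    severity="high",
    recommended_action="align the marketing-wiki entry with the canonical definition",
)

# Serialize so the record can be stored and queried later, for example in an audit.
payload = json.dumps(asdict(finding), indent=2)
print(payload)
```

Keeping the conflicting text, its source, and the canonical meaning in one record is what lets a later reviewer see why a finding was raised without going back to the original documents.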
Run a free semantic drift scan on your own glossary.