From Human-Friendly to AI-Ready: Rethinking Data Quality for Governance

Setting the Scene

Data quality has long been recognized as a cornerstone of reliable decision-making and trustworthy outputs, particularly in the age of AI. Many organizations have invested significant effort in improving the quality of their data, for example by preparing source texts for chatbot interactions. In highly regulated sectors like health insurance, the challenge is amplified by the sheer volume of legal and operational regulations and by constant change, such as evolving reimbursement rules.


Before AI, data quality efforts were aimed at human use: ensuring staff could find, interpret, and apply information correctly. However, with the rise of Generative AI, and especially Retrieval-Augmented Generation (RAG), many organizations simply migrated existing human-centered content into knowledge bases for AI consumption, expecting equally good, if not better, results. In practice, Gartner warns that traditional data-quality standards fall short of AI readiness, while IBM Research finds that investing early in structured, machine-interpretable data can cut data-preparation time by up to 80%. The evidence is clear: migrating human-centric content ‘as is’ into AI systems is not just inefficient; it is a governance risk.


Our case example comes from a regulated mutual health insurance organization operating within the European model (as seen in Belgium, France, Germany, and other European countries, including Switzerland, albeit with some variations). Over time, this organization accumulated thousands of documents categorized by domain (insurability, care, reimbursements, etc.) and by theme (pediatrics, pregnancy, chronic diseases, geriatrics, etc.). These included procedures, explanatory notes, examples, request forms, annotations, interpretative memos, and even training exercises and questionnaires. When this high-quality knowledge base was exposed to an AI system, the results were disappointing...


Pitfalls & Governance Failure

Despite the exemplary quality of the content, the AI delivered outputs that were inconsistent, seemingly random, and sometimes confusing. The failure was not due to bad data but to data optimized for human consumption rather than for AI.


Key failures included:  

  • Mixed content types: The AI could not distinguish between content categories such as rules, examples, forms, and practitioner notes; it treated all of them with equal weight and blended them in responses.  

  • Blindness to visuals: Visual elements (highlights, annotations, strikethroughs, or color codes) conveyed meaning to humans but were invisible to the AI.  

  • Lack of semantic tagging: Without metadata, the AI had no way to differentiate between document types, audiences, or domains, leading to unreliable retrieval and interpretation.  


That is, data quality was managed for human comprehension but not for machine reading, and governance failed at the intersection of Information Architecture and AI Readiness.
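
To make this failure mode concrete, here is a minimal sketch of the contrast between "migrated as is" and "re-engineered for machine consumption". The content, field names, and retrieval filter are all hypothetical illustrations, not the organization's actual pipeline or any specific vector-store API:

```python
# Hypothetical chunks as they might land in a knowledge base after a naive
# "lift and shift" migration: a rule, a worked example, and a training
# exercise become indistinguishable once reduced to plain text.
untagged_chunks = [
    "Orthodontic treatment is reimbursed up to age 15.",          # the actual rule
    "Example: Jan, age 14, submits a claim and is reimbursed.",   # worked example
    "Quiz: is a 16-year-old eligible for reimbursement?",         # training exercise
]

# The same content re-engineered for machine consumption: each chunk
# carries explicit metadata describing what kind of statement it is.
tagged_chunks = [
    {"text": "Orthodontic treatment is reimbursed up to age 15.",
     "content_type": "rule", "domain": "reimbursements", "audience": "back_office"},
    {"text": "Example: Jan, age 14, submits a claim and is reimbursed.",
     "content_type": "example", "domain": "reimbursements", "audience": "training"},
    {"text": "Quiz: is a 16-year-old eligible for reimbursement?",
     "content_type": "exercise", "domain": "reimbursements", "audience": "training"},
]

def retrieve_rules(chunks, domain):
    """Keep only authoritative rules for the given domain.

    With the untagged plain-text chunks above, this filter is impossible:
    all three look equally authoritative to a retriever.
    """
    return [c for c in chunks
            if c["content_type"] == "rule" and c["domain"] == domain]

print(retrieve_rules(tagged_chunks, "reimbursements"))
# -> only the rule; examples and quizzes no longer pollute the answer
```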


Lessons Learned

Quality for humans is not equivalent to quality for machines: AI systems require information that is structured, tagged, and semantically consistent to produce meaningful results. The key insight is that the shift from human to AI use changes the definition of “data quality” itself.  

Critical lessons learned include:  

  • Information must be reengineered for machine consumption, not just migrated.  

  • Contextual differentiation (such as distinguishing rules from commentary) is essential for reliable AI reasoning.  

  • Hidden meaning in visuals must either be made explicit through tagging or be supplemented with machine-readable equivalents (see the sketch after this list).  

  • Information architecture and metadata governance are more important than ever in the AI era.  
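
As an illustration of the lesson on visuals, consider a reimbursement cap where the old amount is struck through and the new amount highlighted in the source document. The before/after below is a hypothetical sketch (invented amounts, dates, and tag vocabulary), not the organization's actual markup:

```python
# Before: meaning carried purely by visual formatting. A human sees the
# old cap struck through; after plain-text extraction both amounts read
# the same, and the AI cannot tell which one is in force.
extracted_text = "Reimbursement cap: EUR 500 per year EUR 650 per year"

# After: the visual convention replaced by explicit inline tags
# (hypothetical tag vocabulary).
machine_readable = (
    "<rule status='repealed'>Reimbursement cap: EUR 500 per year</rule>"
    "<rule status='current' effective='2024-01-01'>"
    "Reimbursement cap: EUR 650 per year</rule>"
)
```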


Organizations that overlook these factors risk undermining both their AI initiatives and their broader governance objectives.

Governance Actions  

Effective AI governance must ensure that data is usable by AI systems. This means embedding information architecture principles within AI governance frameworks.  

Recommended governance actions include:  

  • Metadata: Tag every document with its type, intended audience (e.g., front office, back office, training), domain, and thematic area.  

  • Editorial and structuring guidelines: Define internal content standards that distinguish between rule-based, interpretative, and procedural content.  

  • Machine readability: Introduce inline tags or markup to convey meaning currently carried through visual formatting.  

  • AI-specific data quality controls: Add validation criteria for machine interpretability alongside traditional human-readability measures (a minimal sketch follows this list).  

  • Ongoing governance: Establish policies ensuring that future content creation automatically aligns with AI-readiness standards.  
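
A sketch of how the metadata and validation actions might be operationalized together: a required-metadata schema and a simple AI-readiness gate run before a document enters the knowledge base. Field names and allowed values are illustrative assumptions, not a standard:

```python
# Illustrative AI-readiness gate: every document must carry the metadata
# the retriever depends on before it is admitted to the knowledge base.
# Field names and allowed values are assumptions for this sketch.
REQUIRED_FIELDS = {
    "doc_type": {"rule", "procedure", "example", "form", "memo", "exercise"},
    "audience": {"front_office", "back_office", "training"},
    "domain":   {"insurability", "care", "reimbursements"},
    "theme":    None,  # free text, but must be present
}

def validate_ai_readiness(metadata: dict) -> list[str]:
    """Return a list of violations; an empty list means the document may
    be ingested. This complements, rather than replaces, traditional
    human-readability review."""
    violations = []
    for field, allowed in REQUIRED_FIELDS.items():
        value = metadata.get(field)
        if not value:
            violations.append(f"missing required field: {field}")
        elif allowed is not None and value not in allowed:
            violations.append(f"invalid {field!r}: {value}")
    return violations

# Example: a training quiz migrated "as is", without full metadata, is rejected.
print(validate_ai_readiness({"doc_type": "exercise", "audience": "training"}))
# -> ['missing required field: domain', 'missing required field: theme']
```

Wiring a check like this into the content pipeline is one way to make the "ongoing governance" action enforceable: AI-readiness becomes a gate on publication rather than a retrospective cleanup.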


Turning data from “human-ready” to “AI-ready” is now a strategic necessity, not a nice-to-have. Gartner forecasts that global IT spending will rise 7.9% in 2025 to US$5.43 trillion, with data center systems leading at ~42.4% growth. That trend makes clear where the money is going: infrastructure for AI, not just applications. Without structural clarity, rich metadata, and semantic tagging, organizations risk having high-quality human-oriented content that derails AI systems. The organizations that align their Information Architecture with AI expectations will be the ones reliably scaling trustworthy systems in the years ahead.

Christian De Neef & Sana Yaakoubi

Christian DE NEEF

Seasoned business consultant and project director working at the crossroads of innovation management, knowledge management, and learning organizations. Looking at things from a distance, easily recognizing patterns and connecting the dots. Passionate about helping organizations in their transformation.

Founder and co-owner of FastTrack, a Brussels-based consultancy that facilitates the people side of knowledge and innovation management: we focus on change, collaboration, culture, and how these are key ingredients in lasting organizational transformation.

https://www.fasttrack.be