3rd Week AI Governance

Data Gov & Ecosystems

by Luca I.

What is data governance?

it ensures

quality (accurate, complete)
integrity (consistent, trustworthy)
security (protected, controlled)
usability (accessible, documented)

of an organization’s data

-> spans the entire data lifecycle: from collection to deletion or archiving

The Data Lifecycle

Governance applies at every stage

Data Governance vs Data Management

Data Governance (decision rights & accountability)

traffic laws & zoning regulations

who can take what action
upon what data
using what methods
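
The three questions above can be sketched as an explicit allow-list. This is only an illustration: the `Rule` class, the `POLICY` entries, and the role, dataset, and method names are all hypothetical and not from the course.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    role: str     # who
    action: str   # can take what action
    dataset: str  # upon what data
    method: str   # using what methods


# Hypothetical policy: every entry is an explicit "allow" rule.
POLICY = {
    Rule("analyst", "read", "sales", "sql_view"),
    Rule("steward", "update", "sales", "approved_pipeline"),
}


def is_allowed(role, action, dataset, method):
    """Return True only if an explicit rule grants this exact combination."""
    return Rule(role, action, dataset, method) in POLICY


print(is_allowed("analyst", "read", "sales", "sql_view"))    # True
print(is_allowed("analyst", "update", "sales", "sql_view"))  # False
```

Anything not listed is denied by default, which keeps the decision rights reviewable in one place.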

Data Management (Operations & Implementation)

the trucks, warehouses, and logistics operations

the execution of collecting, processing, and using data effectively

Cost of poor data quality

15 - 25 % of revenue annually

Data Gov, Roles & Responsibilities

Data owner: business executive, accountable for the data set and its policies
Data steward: day-to-day management, implements policies
Data custodian: IT/technical role, responsible for storage, security & access control

Every dataset should have a clearly identified owner
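
A minimal sketch of how the "every dataset has an owner" rule could be checked automatically. The `catalog` dictionary and all names in it are made-up examples, not a real catalog format.

```python
# Hypothetical data catalog: dataset name -> assigned roles.
catalog = {
    "customers": {"owner": "Head of Sales", "steward": "A. Data", "custodian": "IT Ops"},
    "web_logs": {"owner": None, "steward": "B. Logs", "custodian": "IT Ops"},
}


def datasets_without_owner(catalog):
    """Return the names of datasets that have no identified owner."""
    return sorted(name for name, roles in catalog.items() if not roles.get("owner"))


print(datasets_without_owner(catalog))  # ['web_logs']
```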

6 Dimensions of data quality

7 GDPR principles

Lawfulness, Fairness & Transparency
Purpose Limitation
Data Minimization
Accuracy
Storage Limitation
Integrity & Confidentiality
Accountability

Accountability means you must prove compliance, not just claim it. This is why documentation and audit trails matter
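
One possible way to make an audit trail provable rather than just claimed is to chain entries by hash, so later edits become detectable. This is a sketch of that general technique, not something prescribed by GDPR or the course; the `record` and `verify` helpers and the actor names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

audit_log = []  # in-memory for illustration; a real trail would be persisted


def _digest(body):
    """SHA-256 over the entry's fields in a stable (sorted-key) JSON form."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def record(actor, action, dataset):
    """Append an entry that includes the hash of the previous entry."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "dataset": dataset,
        "prev": audit_log[-1]["hash"] if audit_log else "",
    }
    entry["hash"] = _digest(entry)
    audit_log.append(entry)
    return entry


def verify(log):
    """Return True if no entry was altered and the chain is unbroken."""
    prev = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


record("steward_1", "update", "customers")
record("analyst_2", "read", "customers")
print(verify(audit_log))  # True
```

Changing any field of an earlier entry makes `verify` return False, which is the property an auditor can check.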

GDPR Data Subject Rights

Response time: You must respond within 1 month.

Your database design must support these requests
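
A rough sketch of what "supporting these requests" can look like, using access and erasure as examples. The schema, the `subject_id` key, and the helper names are assumptions for illustration; the deadline helper simply takes the same day in the following month (clamped for shorter months), which is one simple way to count "1 month".

```python
import calendar
import sqlite3
from datetime import date


def response_deadline(received):
    """Same day next month, clamped to the last day of shorter months (assumption)."""
    year = received.year + received.month // 12
    month = received.month % 12 + 1
    day = min(received.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Hypothetical schema: every table holding personal data carries subject_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (subject_id INTEGER PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, subject_id INTEGER, item TEXT);
INSERT INTO customers VALUES (1, 'Ada', 'ada@example.com'), (2, 'Bob', 'bob@example.com');
INSERT INTO orders VALUES (10, 1, 'book'), (11, 2, 'lamp');
""")


def export_subject(conn, subject_id):
    """Right of access: collect everything stored about one person."""
    customer = conn.execute(
        "SELECT name, email FROM customers WHERE subject_id = ?", (subject_id,)
    ).fetchone()
    orders = [row[0] for row in conn.execute(
        "SELECT item FROM orders WHERE subject_id = ? ORDER BY order_id", (subject_id,)
    )]
    return {"customer": customer, "orders": orders}


def erase_subject(conn, subject_id):
    """Right to erasure: delete the person's rows from every table, atomically."""
    with conn:
        conn.execute("DELETE FROM orders WHERE subject_id = ?", (subject_id,))
        conn.execute("DELETE FROM customers WHERE subject_id = ?", (subject_id,))


print(response_deadline(date(2024, 1, 31)))  # 2024-02-29
print(export_subject(conn, 1))
erase_subject(conn, 1)
print(export_subject(conn, 1))  # {'customer': None, 'orders': []}
```

Keying all personal data by one subject identifier is the design choice that makes both requests a handful of queries instead of a manual search.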

GDPR penalties

Tier 1: up to €10M or 2% of global annual turnover, whichever is higher

Tier 2: up to €20M or 4% of global annual turnover, whichever is higher
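
As a worked example of the two tiers: the maximum fine is the higher of the fixed amount and the percentage of turnover. The function name and the turnover figures are made up for illustration; integer arithmetic is used to avoid rounding noise.

```python
# tier -> (fixed amount in EUR, percent of global annual turnover)
TIERS = {1: (10_000_000, 2), 2: (20_000_000, 4)}


def max_fine_eur(global_turnover_eur, tier):
    """Upper bound of the fine: the higher of the fixed amount and the turnover share."""
    fixed_amount, percent = TIERS[tier]
    return max(fixed_amount, global_turnover_eur * percent // 100)


print(max_fine_eur(100_000_000, 1))    # 10000000 (2% would only be 2M)
print(max_fine_eur(2_000_000_000, 2))  # 80000000 (4% exceeds 20M)
```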

EU AI Act

Risk based approach

Minimal (spam filters)
Limited Risk (Chatbots)
High Risk (credit scoring, hiring)
Unacceptable (social scoring)

Requirements for high risk AI Systems:

risk management system
high quality training datasets (bias-free, representative)
technical documentation & human oversight

From Data Governance to AI Governance

Additional AI-specific concerns:

Model behavior: is it fair? Explainable?
Training data: is it representative? Bias-free?
Deployment: is it monitored? Auditable?

AI Risk Management Workflow

Key AI Risk Categories

Bias & Fairness: discriminatory outcomes
Data Leakage: PII in prompts/outputs
Hallucination: fabricated but credible outputs

Deep Dive: AI Data Leakage

Type 1: Privacy Leakage (data exposed through AI systems)

sensitive data in prompts
PII in model outputs
model memorization of training data

Type 2: ML Pipeline Leakage (model "cheats" during training)

target leakage
train-test contamination
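
For privacy leakage, one simple mitigation is to redact obvious PII before a prompt leaves your system. The sketch below handles e-mail addresses only; the regular expression is a rough approximation and the `redact_emails` helper is hypothetical, so real PII detection would need much broader coverage.

```python
import re

# Rough pattern for e-mail addresses; not a full RFC-compliant matcher.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text):
    """Replace anything that looks like an e-mail address with a placeholder."""
    return EMAIL.sub("[EMAIL]", text)


print(redact_emails("Contact jane.doe@example.com about the invoice."))
# Contact [EMAIL] about the invoice.
```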

Red flag: If your model performs “too well,” it might be cheating. Always split data before any pre-processing!
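
A minimal sketch of "split first, then preprocess", using standardization as the preprocessing step. The data is synthetic and the 80/20 split is an arbitrary choice; the point is that the mean and standard deviation come from the training part only.

```python
import random
import statistics

random.seed(0)
values = [random.gauss(50, 10) for _ in range(100)]  # synthetic feature values

# 1) Split BEFORE any preprocessing.
random.shuffle(values)
train, test = values[:80], values[80:]

# 2) Fit the preprocessing statistics on the training set only.
train_mean = statistics.mean(train)
train_std = statistics.stdev(train)


def scale(xs):
    """Standardize with training statistics; test data never influences them."""
    return [(x - train_mean) / train_std for x in xs]


# 3) Apply the same fitted transformation to both parts.
train_scaled = scale(train)
test_scaled = scale(test)

print(len(train_scaled), len(test_scaled))  # 80 20
```

Computing the mean and standard deviation on all 100 values before splitting would let test-set information leak into training, which is the train-test contamination described above.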

AI Governance Implementation Checklist

start small: pick one high-risk system and govern it well before scaling

Audit Query 1: Find Duplicates (Uniqueness)
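
The original query is not shown on this card, so the following is only a sketch of a typical duplicate check, run against an in-memory SQLite table. The `customers` table, its columns, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com")],
)

# Group by the field that should be unique and keep groups with more than one row.
duplicates = conn.execute("""
    SELECT email, COUNT(*) AS n
    FROM customers
    GROUP BY email
    HAVING COUNT(*) > 1
""").fetchall()

print(duplicates)  # [('a@example.com', 2)]
```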

Audit Query 2: Check Consistency
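
The card does not show the query, so this is one possible consistency check: finding values that are written in several different ways. Table, column, and data are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "DE"), (2, "de"), (3, "DE "), (4, "FR")],
)

# Normalize case and whitespace, then count how many raw spellings map to each value.
inconsistent = conn.execute("""
    SELECT UPPER(TRIM(country)) AS normalized,
           COUNT(DISTINCT country) AS variants
    FROM customers
    GROUP BY UPPER(TRIM(country))
    HAVING COUNT(DISTINCT country) > 1
""").fetchall()

print(inconsistent)  # [('DE', 3)]
```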

Audit Query 3: Find Missing Fields (Completeness)
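
A sketch of a completeness check that counts missing values per field, treating NULL and empty strings as missing. The query on the original card is not shown; the table and columns here are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "a@example.com", "123"), (2, None, "456"), (3, "", None)],
)

# One row of counts: total rows and how many lack each field.
total, missing_email, missing_phone = conn.execute("""
    SELECT COUNT(*),
           SUM(CASE WHEN email IS NULL OR TRIM(email) = '' THEN 1 ELSE 0 END),
           SUM(CASE WHEN phone IS NULL OR TRIM(phone) = '' THEN 1 ELSE 0 END)
    FROM customers
""").fetchone()

print(total, missing_email, missing_phone)  # 3 2 1
```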

Audit Query 4: Check Data Validity
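
A possible validity check, assuming two hypothetical rules: age must lie between 0 and 120, and an e-mail must roughly look like `x@y.z`. The rules, table, and data are illustrative and not taken from the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "a@example.com", 34),
        (2, "not-an-email", 40),
        (3, "c@example.com", -5),
        (4, "d@example.com", 130),
    ],
)

# Flag rows that break either rule. NULL values are not flagged here;
# they belong to the completeness check instead.
invalid_ids = [row[0] for row in conn.execute("""
    SELECT id
    FROM customers
    WHERE age NOT BETWEEN 0 AND 120
       OR email NOT LIKE '%_@_%._%'
    ORDER BY id
""")]

print(invalid_ids)  # [2, 3, 4]
```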

Audit Query 5: Detect Potential Bias
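
One way to screen for potential bias is to compare outcome rates across groups. The sketch below uses a hypothetical `applications` table and the four-fifths ratio (lowest rate divided by highest rate below 0.8) as a screening heuristic; a low ratio is a signal to investigate, not proof of discrimination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applications (gender TEXT, approved INTEGER)")
conn.executemany(
    "INSERT INTO applications VALUES (?, ?)",
    [("F", 1), ("F", 0), ("F", 0), ("F", 0),
     ("M", 1), ("M", 1), ("M", 1), ("M", 0)],
)

# Approval rate per group (approved is stored as 0/1, so AVG gives the rate).
rates = conn.execute("""
    SELECT gender, COUNT(*) AS n, AVG(approved) AS approval_rate
    FROM applications
    GROUP BY gender
    ORDER BY gender
""").fetchall()

lowest = min(rate for _, _, rate in rates)
highest = max(rate for _, _, rate in rates)
needs_review = lowest / highest < 0.8  # four-fifths heuristic

print(rates)         # [('F', 4, 0.25), ('M', 4, 0.75)]
print(needs_review)  # True
```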

Aggregation for Governance: Summary
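
The individual checks can be rolled up into one summary per data source for reporting. This is a sketch under assumptions: the `source` column, the chosen metrics, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (source TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("crm", "a@example.com"), ("crm", "a@example.com"),
     ("web", "b@example.com"), ("web", None)],
)

# Per source: row count, distinct e-mails, and share of rows with an e-mail.
# COUNT(email) ignores NULLs, so it counts only the filled-in values.
summary = conn.execute("""
    SELECT source,
           COUNT(*) AS total_rows,
           COUNT(DISTINCT email) AS distinct_emails,
           ROUND(100.0 * COUNT(email) / COUNT(*), 1) AS email_completeness_pct
    FROM customers
    GROUP BY source
    ORDER BY source
""").fetchall()

for row in summary:
    print(row)
# ('crm', 2, 1, 100.0)
# ('web', 2, 1, 50.0)
```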

Information

Last changed
4 months ago
