Data Quality & Governance
Six Quality Dimensions
| Dimension | Measurement |
|---|---|
| Completeness | % records with all required fields |
| Accuracy | % passing validation rules |
| Consistency | % cross-source matches |
| Timeliness | % within SLA window |
| Uniqueness | % unique by key fields |
| Validity | % passing range/format checks |
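
A few of these measurements can be sketched directly. This is a minimal illustration, assuming records are plain dictionaries; the field names (`msisdn`, `email`, `ingested_at`) and the helper names are hypothetical, and uniqueness is taken here to mean distinct key values divided by total records.

```python
from datetime import datetime, timedelta, timezone


def pct(part, total):
    """Percentage helper that returns 0.0 for an empty dataset."""
    return 100.0 * part / total if total else 0.0


def completeness(records, required):
    """% of records in which every required field is non-empty."""
    ok = sum(
        1 for r in records
        if all(r.get(f) not in (None, "") for f in required)
    )
    return pct(ok, len(records))


def uniqueness(records, key_fields):
    """% of distinct key tuples relative to the total record count."""
    keys = {tuple(r.get(f) for f in key_fields) for r in records}
    return pct(len(keys), len(records))


def timeliness(records, now, sla):
    """% of records ingested within the SLA window ending at `now`."""
    ok = sum(1 for r in records if now - r["ingested_at"] <= sla)
    return pct(ok, len(records))
```

Each function returns a value between 0 and 100, so the results can be reported side by side per dimension.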
Quality Scoring Weights
| Attribute | Weight | Rationale |
|---|---|---|
| MSISDN present | 20% | Primary identifier |
| Email present | 15% | Cross-channel activation |
| Name complete | 20% | Personalization |
| DOB present | 10% | Age targeting |
| Gender present | 5% | Demographic targeting |
| Account status current | 10% | Active customer |
| Traits computed | 20% | Behavioral intelligence |
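
The weights above sum to 100%, so a profile score can be read as a percentage. The sketch below shows one way to apply them, assuming each attribute has already been evaluated to a boolean; the dictionary keys and the `quality_score` function are illustrative names, not part of any existing API.

```python
# Weights in percentage points, taken from the table above.
WEIGHTS = {
    "msisdn_present": 20,
    "email_present": 15,
    "name_complete": 20,
    "dob_present": 10,
    "gender_present": 5,
    "account_status_current": 10,
    "traits_computed": 20,
}


def quality_score(checks):
    """Sum the weights of every attribute whose check passed (0 to 100)."""
    return sum(weight for name, weight in WEIGHTS.items() if checks.get(name))
```

For example, a profile that passes every check except email and gender scores 100 - 15 - 5 = 80.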
Dead Letter Queue (DLQ)
Failed records are routed to the DLQ together with the original record, error metadata, and a retry count. Processing order: auto-remediate where possible → manual intervention → replay → escalate persistent failures.
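
The processing order can be sketched as a small retry loop. This is an illustration under stated assumptions: `DeadLetter`, `handle`, and the three callables are hypothetical names, the retry limit of 3 is an arbitrary choice, and `manual_fix` stands in for what would in practice be a human review queue.

```python
from dataclasses import dataclass


@dataclass
class DeadLetter:
    record: dict          # original record, unchanged
    error: dict           # error metadata, e.g. type and message
    retry_count: int = 0  # number of replay attempts so far


def handle(entry, auto_fix, manual_fix, replay, max_retries=3):
    """Remediate and replay a DLQ entry; escalate once retries run out.

    auto_fix(record, error) returns a repaired record, or None if it cannot help.
    manual_fix(record, error) returns a repaired record from manual intervention.
    replay(record) raises an exception if the record fails again.
    """
    while entry.retry_count < max_retries:
        fixed = auto_fix(entry.record, entry.error)
        if fixed is None:
            fixed = manual_fix(entry.record, entry.error)
        entry.retry_count += 1
        try:
            replay(fixed)
            return "replayed"
        except Exception as exc:
            # Keep the latest failure so the next attempt can inspect it.
            entry.error = {"type": type(exc).__name__, "message": str(exc)}
    return "escalated"
```

Keeping the original record untouched in the entry means every attempt starts from the same input, which makes replays reproducible.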
Data Lineage
Source-to-profile lineage is captured end to end using the OpenLineage standard. Tracked for each profile: source system, ingestion timestamp, pipeline/operator sequence, transformations, merge policies, and profile version.
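
As a rough illustration, the tracked items could be carried in an event shaped loosely like an OpenLineage run event (`eventType`, `eventTime`, `run`, `job`, `inputs`, `outputs`, `producer`). This sketch builds a plain dictionary rather than using an OpenLineage client. The `profileLineage` facet, the namespaces, the dataset names, and the producer URL are all hypothetical, and required spec metadata such as `schemaURL` is omitted, so the result is not a validated OpenLineage event.

```python
import uuid
from datetime import datetime, timezone


def lineage_event(source_system, ingested_at, operators,
                  transformations, merge_policy, profile_version):
    """Build a lineage record for one profile build (illustrative shape only)."""
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/cdp-pipeline",  # hypothetical
        "run": {
            "runId": str(uuid.uuid4()),
            "facets": {
                # Hypothetical custom facet holding the tracked items.
                "profileLineage": {
                    "ingestedAt": ingested_at.isoformat(),
                    "operatorSequence": list(operators),
                    "transformations": list(transformations),
                    "mergePolicy": merge_policy,
                    "profileVersion": profile_version,
                }
            },
        },
        "job": {"namespace": "cdp", "name": "profile_build"},
        "inputs": [{"namespace": source_system, "name": "customers"}],
        "outputs": [{"namespace": "cdp", "name": "profiles"}],
    }
```

The returned dictionary contains only strings, lists, and integers, so it can be serialized with `json.dumps` and shipped to whatever lineage backend is in use.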
Schema Registry
Apicurio provides semantic versioning, compatibility checking (backward/forward/full), evolution tracking, and breaking-change detection. New fields are added as nullable; type changes require a migration pipeline.
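
The two evolution rules can be illustrated with a small standalone check. This is not Apicurio's API, since the registry applies its own compatibility rules. It is a sketch that assumes a schema is represented as a dictionary mapping field names to a `type` and a `nullable` flag, with `breaking_changes` as a hypothetical helper.

```python
def breaking_changes(old, new):
    """List changes that violate the evolution rules; empty means safe."""
    problems = []
    for name, spec in old.items():
        if name not in new:
            problems.append(f"removed field: {name}")
        elif new[name]["type"] != spec["type"]:
            # Type changes are not allowed in place; they need a migration.
            problems.append(
                f"type change: {name} {spec['type']} -> {new[name]['type']}"
            )
    for name, spec in new.items():
        # New fields must be nullable so existing records remain readable.
        if name not in old and not spec.get("nullable", False):
            problems.append(f"new required field: {name}")
    return problems
```

An empty result corresponds to an additive change, while any reported problem signals a breaking change that should go through the migration pipeline.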