Operator Specifications
Data Quality (DQ-01 to DQ-07)
| ID | Operator | Type | Purpose |
|---|---|---|---|
| DQ-01 | Schema Validator | Config+SQL | Validates records against expected schema |
| DQ-02 | Format Normalizer | SQL | Normalizes formats (E.164 phone numbers, ISO 8601 dates and times, lowercase) |
| DQ-03 | Null Handler | Config | Handles nulls: DEFAULT, REJECT, FLAG, COALESCE |
| DQ-04 | Range Validator | SQL | Validates values against acceptable ranges (e.g. age 0-120) |
| DQ-05 | Uniqueness Checker | SQL | Detects duplicates: FLAG, DEDUPE_FIRST/LAST/BEST |
| DQ-06 | Referential Validator | SQL | Validates foreign key relationships |
| DQ-07 | Quality Scorer | SQL | Computes record quality score (0-100) |
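
The duplicate-handling modes listed for DQ-05 can be illustrated with a small sketch. The semantics below are assumptions, not taken from the specification: FLAG keeps every record and marks repeats, DEDUPE_FIRST and DEDUPE_LAST keep the first or last record seen per key, and DEDUPE_BEST keeps the record with the highest quality score. The function name `check_uniqueness` and the `quality_score` and `is_duplicate` fields are hypothetical.

```python
from collections import defaultdict


def check_uniqueness(records, key, mode="FLAG", score_field="quality_score"):
    """Handle duplicate records that share the same value for `key`.

    Assumed semantics (not confirmed by the specification):
      FLAG          keep every record, set is_duplicate=True on repeats
      DEDUPE_FIRST  keep the first record seen for each key
      DEDUPE_LAST   keep the last record seen for each key
      DEDUPE_BEST   keep the record with the highest `score_field`
    """
    # Group records by key value; dicts preserve first-seen key order.
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    result = []
    for group in groups.values():
        if mode == "FLAG":
            for position, record in enumerate(group):
                result.append({**record, "is_duplicate": position > 0})
        elif mode == "DEDUPE_FIRST":
            result.append(group[0])
        elif mode == "DEDUPE_LAST":
            result.append(group[-1])
        elif mode == "DEDUPE_BEST":
            result.append(max(group, key=lambda r: r.get(score_field, 0)))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result


rows = [
    {"email": "a@example.com", "name": "Ann", "quality_score": 60},
    {"email": "b@example.com", "name": "Bob", "quality_score": 80},
    {"email": "a@example.com", "name": "Ann Lee", "quality_score": 95},
]

best = check_uniqueness(rows, key="email", mode="DEDUPE_BEST")
flagged = check_uniqueness(rows, key="email", mode="FLAG")
```

Note that the output is grouped by key rather than kept in input order. Under these assumptions DEDUPE_BEST depends on a score being present on each record, which suggests running a scorer such as DQ-07 first.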
Transformation (TR-01 to TR-12)
Field Mapper, Type Converter, Expression Calculator, String Manipulator, Date Calculator, Aggregator, Window Function, Joiner, Filter, Deduplicator, Pivot/Unpivot, Lookup Enricher — all SQL, all user-editable.
Identity (ID-01 to ID-08) — System-Controlled
DTX_ID Generator, DTX_ID Resolver, Identifier Linker, Confidence Scorer, Graph Updater, Merge Detector, Split Handler, Household Resolver.
Classification (CL-01 to CL-04)
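CL-01 SID Mapper: maps source fields to the TM Forum SID (Shared Information/Data) schema. Example:

    mappings:
      - source: customer_name
        target: party.given_name
        transform: "SPLIT(${source}, ' ')[0]"
      - source: dob
        target: party.age
        # DATEDIFF(YEAR, ...) counts year boundaries in some SQL dialects,
        # so the result can be one year high before the birthday.
        transform: "DATEDIFF(YEAR, ${source}, CURRENT_DATE)"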
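
One way to read the mapping above: each `transform` is a SQL expression in which `${source}` stands for the source column. A minimal sketch of that substitution follows, using Python's `string.Template`, whose `${name}` placeholder syntax happens to match. The helper `render_projection`, the quoted-alias output, and the rule that a mapping without a `transform` copies the column unchanged are all assumptions made for illustration; the actual SID Mapper may resolve transforms differently.

```python
from string import Template

# The example mappings from above, as they would look after YAML parsing.
MAPPINGS = [
    {
        "source": "customer_name",
        "target": "party.given_name",
        "transform": "SPLIT(${source}, ' ')[0]",
    },
    {
        "source": "dob",
        "target": "party.age",
        "transform": "DATEDIFF(YEAR, ${source}, CURRENT_DATE)",
    },
]


def render_projection(mappings):
    """Turn mapping entries into SELECT-list items (hypothetical helper)."""
    items = []
    for mapping in mappings:
        # Assumption: a mapping without a transform copies the source column.
        template = Template(mapping.get("transform", "${source}"))
        expression = template.substitute(source=mapping["source"])
        # Quote the alias because SID targets contain dots.
        items.append(f'{expression} AS "{mapping["target"]}"')
    return items


for item in render_projection(MAPPINGS):
    print(item)
```

The rendered items could then be joined into a `SELECT` list by whatever engine executes the operator.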
CL-02 IAB Classifier, CL-03 System Tagger, CL-04 Custom Tagger.
Profile (PR-01 to PR-05) — System-Controlled
Profile Assembler, Merge Policy Applier (config editable), Lineage Tracker, Profile Writer (atomic upserts), Snapshot Creator.
Trait (TT-01 to TT-04)

    -- TT-01: Trait Calculator
    -- Total spend per customer over the trailing 30 days
    SELECT dtx_id, SUM(amount) AS total_spend_30d
    FROM billing_events
    WHERE event_date >= CURRENT_DATE - INTERVAL 30 DAY
    GROUP BY dtx_id;

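
To check the trait logic outside a SQL engine, the same 30-day aggregation can be sketched in plain Python. The event fields (`dtx_id`, `event_date`, `amount`) mirror the columns in the query; the function name and the explicit `today` parameter are assumptions added to keep the example deterministic.

```python
from collections import defaultdict
from datetime import date, timedelta


def total_spend_30d(billing_events, today):
    """Sum `amount` per dtx_id for events dated within the last 30 days."""
    cutoff = today - timedelta(days=30)  # CURRENT_DATE - INTERVAL 30 DAY
    totals = defaultdict(float)
    for event in billing_events:
        # Inclusive lower bound, matching the >= in the WHERE clause.
        if event["event_date"] >= cutoff:
            totals[event["dtx_id"]] += event["amount"]
    return dict(totals)


events = [
    {"dtx_id": "dtx-1", "event_date": date(2024, 5, 20), "amount": 40.0},
    {"dtx_id": "dtx-1", "event_date": date(2024, 4, 1), "amount": 99.0},  # too old
    {"dtx_id": "dtx-2", "event_date": date(2024, 5, 2), "amount": 12.5},  # on the cutoff
]

print(total_spend_30d(events, today=date(2024, 6, 1)))
```

As with the SQL, a `dtx_id` with no events inside the window does not appear in the result at all, so downstream consumers may need to treat a missing trait as zero.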
TT-02 Score Calculator (Python/MLflow), TT-03 RFM Calculator, TT-04 Trend Analyzer.
Segment (SG-01 to SG-04)
Segment Evaluator, Segment Materializer (FULL_REFRESH / INCREMENTAL / STREAMING), Taxonomy Mapper, Overlap Analyzer.
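
The difference between the FULL_REFRESH and INCREMENTAL materialization modes can be sketched with set operations. This assumes a segment is materialized as a set of `dtx_id` values and that an incremental run applies only the members who entered or exited since the previous run. Both points are assumptions for illustration, the function names are hypothetical, and STREAMING is left out.

```python
def full_refresh(evaluated_members):
    """Replace the stored membership with the latest evaluation result."""
    return set(evaluated_members)


def incremental(stored_members, entered, exited):
    """Apply only the changes since the previous run to the stored membership."""
    return (set(stored_members) | set(entered)) - set(exited)


def membership_delta(stored_members, evaluated_members):
    """Derive the (entered, exited) sets that an incremental run would apply."""
    stored, evaluated = set(stored_members), set(evaluated_members)
    return evaluated - stored, stored - evaluated


stored = {"dtx-1", "dtx-2", "dtx-3"}
evaluated = {"dtx-2", "dtx-3", "dtx-4"}

entered, exited = membership_delta(stored, evaluated)
# Both modes should converge on the same membership.
assert incremental(stored, entered, exited) == full_refresh(evaluated)
```

The trade-off in this sketch is the usual one: a full refresh rewrites the whole membership, while an incremental run touches only the delta but relies on the stored state being correct.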
Privacy (PV-01 to PV-06) — System-Controlled
Consent Checker (BLOCK/LOG/SOFT_OVERRIDE), PII Masker, Tokenizer, Encryption Handler (Vault), Retention Enforcer, DSAR Handler.
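
As a rough illustration of what the PII Masker and Tokenizer might do, the sketch below masks an email address and derives a deterministic token with HMAC-SHA256. The specification does not state which masking format or tokenization scheme is used, so both choices, the function names, and the hard-coded demo key are assumptions.

```python
import hashlib
import hmac
import re

# First character of the local part, the rest of the local part, then the domain.
EMAIL_RE = re.compile(r"^([^@])[^@]*(@.+)$")


def mask_email(email):
    """Keep the first character and the domain, hide the rest of the local part."""
    return EMAIL_RE.sub(r"\1***\2", email)


def tokenize(value, secret_key):
    """Return a deterministic, non-reversible token for `value` (HMAC-SHA256)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Placeholder key for the demo only; a real key would come from a secrets store.
key = b"demo-key-do-not-use"

masked = mask_email("ann.lee@example.com")
token_a = tokenize("ann.lee@example.com", key)
token_b = tokenize("ann.lee@example.com", key)
print(masked)
print(token_a == token_b)
```

Keyed hashing gives the same token for the same input, so tokenized values can still be joined across datasets, but it cannot be reversed. If the original value must be recoverable, a vault-backed token lookup or encryption would be needed instead.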
AI (AI-01 to AI-05)
Embedding Generator (E5-large/BGE-M3), Vector Writer (OpenSearch), LLM Prompt (Jinja2 templates, PII filter), RAG Query, GenAI Classifier.