Technical Architecture

Technology Stack

| Component | Technology | Purpose |
|---|---|---|
| Streaming | Apache Flink 2.1 | Real-time event processing |
| Batch | Apache Spark | Batch processing and analytics |
| Lakehouse | Apache Iceberg | Cold storage table format |
| Hot Cache | Hazelcast/Dragonfly | Real-time profile access |
| Profile Store | PostgreSQL + ScyllaDB | Structured profiles |
| Identity Graph | ScyllaDB | Graph traversal at scale |
| Event Store | Iceberg on S3/MinIO | Historical events |
| Vector Store | OpenSearch | Embeddings |
| Message Queue | Apache Kafka | Event streaming |
| Schema Registry | Apicurio | Schema management |
| Orchestration | Temporal | Durable workflows |
| Workflows | Flowable | Approvals |
| Secrets | HashiCorp Vault | Key management |
| Monitoring | Prometheus | Metrics and alerting |
| UI | React + React Flow | Visual canvas |
| MLOps | MLflow | Model registry |
| LLM | LangGraph | Agent orchestration |
| Catalog | DataHub | Metadata management |
| OLAP | ClickHouse | Analytics queries |

Deployment Models

Sovereign On-Prem: Deployed entirely inside the customer's environment on the customer's own Kubernetes, with an air-gapped option.

Private Cloud: Deployed in the customer's own AWS, Azure, or GCP tenancy, with no shared infrastructure.

Hybrid: Core on-prem, analytics in the cloud where permitted, with a DMZ proxy.

Scalability Targets

| Component | Target | Technology |
|---|---|---|
| Profiles | 100M+ per tenant | ScyllaDB by DTX_ID |
| Events/second | 100K+ | Flink + Kafka |
| Graph edges | 1B+ | ScyllaDB wide rows |
| Segment eval | <500ms p95 | Flink streaming |
| Profile lookup | <50ms p95 | Hot cache + fallback |
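
The "hot cache + fallback" approach listed for profile lookup can be illustrated with a minimal Python sketch. It is not the platform's actual implementation. The `ProfileLookup` class, the `store_get` callable, and the `p95` helper are hypothetical names introduced here. A plain dict stands in for Hazelcast/Dragonfly, and any callable keyed by DTX_ID stands in for the profile store. The `p95` helper uses the nearest-rank method, which is one of several percentile definitions and is assumed here only to show how measured latencies could be checked against the <50ms target.

```python
from typing import Callable, Optional

Profile = dict


class ProfileLookup:
    """Cache-first lookup: try the hot cache, then fall back to the profile store."""

    def __init__(self, hot_cache: dict, store_get: Callable[[str], Optional[Profile]]):
        self.hot_cache = hot_cache  # hypothetical stand-in for Hazelcast/Dragonfly
        self.store_get = store_get  # hypothetical stand-in for a store read keyed by DTX_ID
        self.hits = 0
        self.misses = 0

    def get(self, dtx_id: str) -> Optional[Profile]:
        profile = self.hot_cache.get(dtx_id)
        if profile is not None:
            self.hits += 1
            return profile
        # Cache miss: read from the backing store and populate the cache.
        self.misses += 1
        profile = self.store_get(dtx_id)
        if profile is not None:
            self.hot_cache[dtx_id] = profile
        return profile


def p95(latencies_ms: list) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


if __name__ == "__main__":
    store = {"dtx-1": {"email": "a@example.com"}}
    lookup = ProfileLookup(hot_cache={}, store_get=store.get)
    lookup.get("dtx-1")  # miss, served by the fallback store
    lookup.get("dtx-1")  # hit, served by the hot cache
    print(lookup.hits, lookup.misses)
    print(p95([12, 18, 25, 31, 47]) < 50)
```

A real deployment would also need cache expiry and invalidation when a profile changes, which this sketch leaves out.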