AI-Native Foundation

RAG Architecture

Query → LLM Gateway (LangGraph) → Embed (E5-large) + Retrieve (OpenSearch) → Prompt + PII Filter → LLM → Response + Audit.

| Component    | Technology                                |
|--------------|-------------------------------------------|
| Embedding    | E5-large (English), BGE-M3 (multilingual) |
| Vector Store | OpenSearch                                |
| Retriever    | Hybrid dense + sparse                     |
| LLM          | On-prem Llama 3.1 70B / Claude API        |
| PII Filter   | Custom regex + NER                        |
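The flow above can be sketched end to end. Everything here is an illustrative stand-in: the function bodies, patterns, and return values are placeholders for the real E5-large, OpenSearch, and LLM calls, not the production code.

```python
import re

# Crude stand-in PII patterns; the real filter combines custom regex with NER.
PII_PATTERNS = [
    r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b",   # e-mail addresses
    r"\b\d{3}-\d{2}-\d{4}\b",          # SSN-like numbers
]

def mask_pii(text: str) -> str:
    """Redact PII before the prompt reaches the LLM."""
    for pat in PII_PATTERNS:
        text = re.sub(pat, "[REDACTED]", text)
    return text

def embed(query: str) -> list[float]:
    # Stand-in for the E5-large embedding call.
    return [float(len(query))]

def retrieve(vector: list[float]) -> list[str]:
    # Stand-in for OpenSearch hybrid dense+sparse retrieval.
    return ["doc-1 snippet", "doc-2 snippet"]

def call_llm(prompt: str) -> str:
    # Stand-in for the on-prem Llama 3.1 70B / Claude API call.
    return f"answer ({len(prompt)} prompt chars)"

def answer(query: str) -> dict:
    docs = retrieve(embed(query))
    prompt = mask_pii("Context:\n" + "\n".join(docs) + "\n\nQuestion: " + query)
    response = call_llm(prompt)
    # Every request/response pair is retained for audit.
    audit = {"query": query, "prompt": prompt, "response": response}
    return {"response": response, "audit": audit}
```

Note that masking is applied to the assembled prompt (context plus question), so PII surfacing from retrieved documents is redacted along with PII typed by the user.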

MCP Endpoints

| Tool                | Returns          | Consent    |
|---------------------|------------------|------------|
| get_profile         | Profile subset   | MCP_ACCESS |
| search_profiles     | DTX_IDs (no PII) | ANALYTICS  |
| get_segment_members | Sample DTX_IDs   | ANALYTICS  |
| resolve_identity    | DTX_ID           | MCP_ACCESS |
| get_segment_stats   | Aggregate stats  | None       |
| create_segment      | Segment ID       | Admin only |
| get_identity_graph  | Related entities | MCP_ACCESS |
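A consent gate over these tools might look like the following. The tool names and consent scopes come from the table; the dispatch logic, exception type, and `is_admin` flag are hypothetical.

```python
# Consent scope required per MCP tool (from the endpoint table).
CONSENT_REQUIRED = {
    "get_profile": "MCP_ACCESS",
    "search_profiles": "ANALYTICS",
    "get_segment_members": "ANALYTICS",
    "resolve_identity": "MCP_ACCESS",
    "get_segment_stats": None,          # aggregate-only, no consent needed
    "get_identity_graph": "MCP_ACCESS",
}

class ConsentError(Exception):
    pass

def call_tool(tool: str, caller_consents: set[str],
              is_admin: bool = False, **kwargs) -> dict:
    """Verify consent (or admin role) before dispatching an MCP tool call."""
    if tool == "create_segment":        # admin-gated rather than consent-gated
        if not is_admin:
            raise ConsentError("create_segment requires admin role")
    else:
        needed = CONSENT_REQUIRED[tool]
        if needed is not None and needed not in caller_consents:
            raise ConsentError(f"{tool} requires {needed} consent")
    # ...dispatch to the real tool implementation here...
    return {"tool": tool, "args": kwargs}
```

Checking consent in the gateway, before dispatch, keeps individual tool implementations free of access-control logic.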

LLM Integration

On-Prem: Llama 3.1 70B served via vLLM. No data leaves the infrastructure.

API: Claude / GPT-4, called only after PII is stripped. Used for complex reasoning tasks.

Hybrid: on-prem for PII-touching tasks, API for general knowledge. A router decides per request.
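The hybrid routing rule can be sketched as below. The keyword heuristic is a trivial placeholder standing in for whatever classifier actually decides; the hint list is illustrative.

```python
# Anything that may touch PII stays on-prem; everything else may go to the
# external API. These hint keywords are assumptions, not the real detector.
PII_HINTS = ("profile", "email", "address", "customer", "dtx_id")

def route(task: str) -> str:
    """Return which LLM backend a task should run on."""
    if any(hint in task.lower() for hint in PII_HINTS):
        return "on-prem"      # Llama 3.1 70B via vLLM
    return "api"              # Claude / GPT-4, PII already stripped
```

The key design property is that routing errs toward on-prem: a false positive costs some latency or quality, while a false negative could send PII to an external API.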

AI Guardrails

  • Input: Prompt injection detection, PII masking, rate limiting, consent verification
  • Output: PII leak detection, hallucination checking, response limits, toxicity filtering
  • Operational: Human-in-the-loop, approval workflows, full audit logging