AI-Native Foundation
RAG Architecture
Query → LLM Gateway (LangGraph) → Embed (E5-large) + Retrieve (OpenSearch) → Prompt + PII Filter → LLM → Response + Audit.
| Component | Technology |
|---|---|
| Embedding | E5-large (English), BGE-M3 (multilingual) |
| Vector Store | OpenSearch |
| Retriever | Hybrid dense+sparse |
| LLM | On-prem Llama 3.1 70B / Claude API |
| PII Filter | Custom regex + NER |
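The "Hybrid dense+sparse" retriever merges two ranked lists: a k-NN result set over E5-large embeddings and OpenSearch's BM25 keyword scores. A minimal sketch of one common merge strategy, Reciprocal Rank Fusion (RRF); the function name, sample IDs, and `k=60` constant are illustrative, not the project's actual implementation:

```python
# Illustrative RRF fusion of dense and sparse result lists.
def rrf_fuse(dense_ids, sparse_ids, k=60):
    """Merge two ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = {}
    for ranked in (dense_ids, sparse_ids):
        for rank, doc_id in enumerate(ranked):
            # Each list contributes 1/(k + rank + 1); documents that rank
            # well in both lists accumulate the highest fused score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d3", "d1", "d7"]    # e.g. from k-NN over E5-large vectors
sparse = ["d1", "d9", "d3"]   # e.g. from BM25 keyword match
fused = rrf_fuse(dense, sparse)
print(fused)  # d1 and d3 appear in both lists, so they rank first
```

RRF needs only ranks, not raw scores, which sidesteps calibrating BM25 scores against cosine similarities.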
MCP Endpoints
| Tool | Returns | Consent Required |
|---|---|---|
| get_profile | Profile subset | MCP_ACCESS |
| search_profiles | DTX_IDs (no PII) | ANALYTICS |
| get_segment_members | Sample DTX_IDs | ANALYTICS |
| resolve_identity | DTX_ID | MCP_ACCESS |
| get_segment_stats | Aggregate stats | None |
| create_segment | Segment ID | Admin only |
| get_identity_graph | Related entities | MCP_ACCESS |
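The consent column above maps naturally onto a gate in front of the tool dispatcher. A minimal sketch, assuming consent scopes are checked per call; the registry dict, `ConsentError`, and `call_tool` are illustrative scaffolding, and `create_segment` is omitted because its "Admin only" check is a role, not a consent scope:

```python
# Consent scopes per MCP tool, taken from the table above.
CONSENT_REQUIRED = {
    "get_profile": "MCP_ACCESS",
    "search_profiles": "ANALYTICS",
    "get_segment_members": "ANALYTICS",
    "resolve_identity": "MCP_ACCESS",
    "get_segment_stats": None,        # aggregate stats need no consent
    "get_identity_graph": "MCP_ACCESS",
}

class ConsentError(Exception):
    pass

def call_tool(tool, user_consents, handler, **kwargs):
    """Invoke an MCP tool only if the caller holds the required consent."""
    required = CONSENT_REQUIRED.get(tool)
    if required is not None and required not in user_consents:
        raise ConsentError(f"{tool} requires {required} consent")
    return handler(**kwargs)

# Aggregate stats pass without consent; profile access does not.
stats = call_tool("get_segment_stats", set(), lambda **kw: {"size": 120})
```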
LLM Integration
On-Prem: Llama 3.1 70B served via vLLM; no data leaves the infrastructure.
API: Claude/GPT-4, called with PII stripped; used for complex reasoning tasks.
Hybrid: on-prem for PII-touching tasks, API for general knowledge; a router decides per request.
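The hybrid router's core decision can be sketched in a few lines: prompts that appear to contain PII stay on the on-prem model, everything else may go to the external API. The regexes and route names here are illustrative; the real PII filter combines regex with NER, per the table above:

```python
import re

# Illustrative PII heuristics; the production filter also runs NER.
PII_PATTERNS = [
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),        # email address
    re.compile(r"\b\d{3}[-.\s]?\d{2}[-.\s]?\d{4}\b"),  # SSN-like number
]

def route(prompt: str) -> str:
    """Return 'on-prem' for PII-bearing prompts, 'api' otherwise."""
    if any(p.search(prompt) for p in PII_PATTERNS):
        return "on-prem"
    return "api"

print(route("Summarize retention trends for Q3"))    # general knowledge -> API
print(route("Why did jane.doe@example.com churn?"))  # PII -> on-prem
```

Failing closed (routing to on-prem whenever detection is uncertain) keeps false negatives from leaking PII to the API.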
AI Guardrails
- Input: Prompt injection detection, PII masking, rate limiting, consent verification
- Output: PII leak detection, hallucination checking, response limits, toxicity filtering
- Operational: Human-in-the-loop, approval workflows, full audit logging
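One output-side guardrail from the list above, sketched minimally: scan the LLM response for PII that slipped through, mask it, and flag the leak for the audit log. Patterns and the mask token are illustrative; hallucination checking and toxicity filtering would be separate passes:

```python
import re

# Illustrative output-side PII patterns.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def mask_pii(response: str) -> tuple[str, bool]:
    """Return (masked_response, leaked) so the audit log can record leaks."""
    leaked = bool(EMAIL.search(response) or PHONE.search(response))
    masked = EMAIL.sub("[REDACTED]", response)
    masked = PHONE.sub("[REDACTED]", masked)
    return masked, leaked

text, leaked = mask_pii("Contact john@corp.com or 555-123-4567.")
print(text)    # Contact [REDACTED] or [REDACTED].
print(leaked)  # True
```

Returning the leak flag alongside the masked text lets the audit trail distinguish clean responses from masked ones.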