Observability API
Unified observability platform integrating Prometheus (metrics), Jaeger (traces), and OpenSearch (logs) with real-time WebSocket streaming.
| Field | Value |
|---|---|
| Port | 50010 |
| Base Path | /api/v1 |
| OpenAPI | 3.1.0 |
| Spec | Observablity_API.json |
Traces (Jaeger)
| Method | Path | Description | Key Query Params |
|---|---|---|---|
| GET | /api/v1/traces/search | Search traces | serviceName, operationName, traceId, startTime, endTime, minDurationMicros, maxDurationMicros, limit (max 100), lookback (1h/24h/7d) |
| GET | /api/v1/traces/{traceId} | Get trace by ID | — |
| GET | /api/v1/traces/services | List all services | — |
| GET | /api/v1/traces/services/{serviceName}/operations | List operations | — |
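As an illustration of how the search parameters combine, here is a minimal sketch of a URL builder for `/api/v1/traces/search`. The host (`localhost`), the helper name `buildTraceSearchUrl`, and the choice to clamp `limit` on the client are assumptions made for the example; only the port, base path, and parameter names come from the tables above. The format of `startTime` and `endTime` is not specified here, so the sketch passes them through unchanged.

```typescript
// Query parameters listed for GET /api/v1/traces/search.
type TraceSearchParams = {
  serviceName?: string;
  operationName?: string;
  traceId?: string;
  startTime?: string | number; // format not specified in this document
  endTime?: string | number;   // format not specified in this document
  minDurationMicros?: number;
  maxDurationMicros?: number;
  limit?: number;
  lookback?: "1h" | "24h" | "7d";
};

// Assumption: the service runs on localhost. Port and base path are documented.
const TRACE_BASE_URL = "http://localhost:50010/api/v1";

function buildTraceSearchUrl(
  params: TraceSearchParams,
  baseUrl: string = TRACE_BASE_URL
): string {
  const query = new URLSearchParams();
  for (const [key, value] of Object.entries(params)) {
    if (value === undefined) continue; // skip parameters that were not set
    // Clamp limit to the documented maximum of 100 before sending.
    const normalized = key === "limit" ? Math.min(Number(value), 100) : value;
    query.set(key, String(normalized));
  }
  const qs = query.toString();
  return qs ? `${baseUrl}/traces/search?${qs}` : `${baseUrl}/traces/search`;
}

// Example: slow traces from one service during the last hour.
const exampleTraceUrl = buildTraceSearchUrl({
  serviceName: "user-service",
  minDurationMicros: 500000,
  lookback: "1h",
});
```

The resulting string can be passed to `fetch` or any other HTTP client.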
Metrics (Prometheus)
| Method | Path | Description | Key Query Params |
|---|---|---|---|
| GET | /api/v1/metrics/query | Instant PromQL query | query, time |
| GET | /api/v1/metrics/query-range | Range PromQL query | query, start, end, step |
| GET | /api/v1/metrics/labels | List all label names | — |
| GET | /api/v1/metrics/label/{labelName}/values | Get label values | — |
| GET | /api/v1/metrics/targets | List scrape targets | — |
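A range query needs `start`, `end`, and `step` in addition to the PromQL expression. The sketch below derives them from a look-back window. It assumes the endpoint accepts Unix timestamps in seconds and a step in seconds, as the Prometheus HTTP API does; whether this API forwards those values unchanged is not stated here, so treat the formats as an assumption. The helper name and the `localhost` host are likewise illustrative.

```typescript
// Build a /metrics/query-range URL covering the last `windowSeconds` seconds.
// Assumption: start/end are Unix seconds and step is a number of seconds.
function buildRangeQueryUrl(
  query: string,
  windowSeconds: number,
  stepSeconds: number,
  nowMs: number = Date.now(),
  baseUrl: string = "http://localhost:50010/api/v1" // host is an assumption
): string {
  const end = Math.floor(nowMs / 1000);
  const start = end - windowSeconds;
  // URLSearchParams percent-encodes PromQL characters such as [ ] ( ) and spaces.
  const params = new URLSearchParams({
    query,
    start: String(start),
    end: String(end),
    step: String(stepSeconds),
  });
  return `${baseUrl}/metrics/query-range?${params.toString()}`;
}

// Example: per-second request rate over the last hour at 60-second resolution.
const exampleRangeUrl = buildRangeQueryUrl(
  "rate(http_requests_total[5m])", // hypothetical metric name
  3600,
  60
);
```

Passing `nowMs` explicitly keeps the function deterministic, which helps when testing or when aligning several queries to the same window.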
Dashboard
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/dashboard/summary | Dashboard summary metrics |
| GET | /api/v1/dashboard/services | Service health overview |
| GET | /api/v1/dashboard/errors | Error rates and trends |
Pipeline Observability
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/pipelines/{id}/logs | Pipeline application logs |
| GET | /api/v1/pipelines/{id}/flink-logs | Flink-specific logs |
| GET | /api/v1/pipelines/{id}/operators/record-counts | Operator record metrics |
| GET | /api/v1/pipelines/{id}/stats | Pipeline statistics |
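All four pipeline endpoints share the `/api/v1/pipelines/{id}/` prefix, so a client can model them with one typed helper. This is a sketch only: `PipelineResource` and `pipelinePath` are hypothetical names, and escaping the ID with `encodeURIComponent` is a defensive assumption, since the ID format is not documented here.

```typescript
// The four sub-resources listed in the Pipeline Observability table.
type PipelineResource =
  | "logs"
  | "flink-logs"
  | "operators/record-counts"
  | "stats";

// Build the path for one pipeline sub-resource.
// The ID is escaped so that characters such as "/" or spaces cannot alter the path.
function pipelinePath(id: string, resource: PipelineResource): string {
  return `/api/v1/pipelines/${encodeURIComponent(id)}/${resource}`;
}

// Example with a hypothetical pipeline ID.
const exampleStatsPath = pipelinePath("orders-etl", "stats");
```

Restricting `resource` to a union type means a typo such as `"flink_logs"` fails at compile time rather than as a 404 at run time.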
Unified Search
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/search | Search across logs, traces, and metrics |
WebSocket Streaming
Real-time observability data via WebSocket.
Connection Flow
Client → Server Actions
| Action | Type | Description |
|---|---|---|
| subscribe | logs, metrics, traces, dashboard, all | Subscribe to data stream |
| unsubscribe | logs, metrics, traces, dashboard, all | Stop stream |
| ping | — | Health check (returns pong) |
| configure | — | Adjust interval, batchSize, priority |
| status | — | Get session status |
Subscribe Message
{
  "action": "subscribe",
  "type": "logs",
  "data": {
    "filter": { "level": "ERROR", "service": "pipeline-service" },
    "interval": 5000,
    "batchSize": 100
  }
}
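The subscribe message above can be given a TypeScript shape so that invalid stream types are caught at compile time. The following is a sketch inferred from that single example: the type names and the `subscribeMessage` helper are hypothetical, and treating `filter`, `interval`, and `batchSize` as optional is an assumption.

```typescript
// Stream types accepted by subscribe and unsubscribe.
type StreamType = "logs" | "metrics" | "traces" | "dashboard" | "all";

// Shape inferred from the sample message; optionality is an assumption.
interface SubscribeOptions {
  filter?: Record<string, string | number | boolean>;
  interval?: number;
  batchSize?: number;
}

interface SubscribeMessage {
  action: "subscribe";
  type: StreamType;
  data: SubscribeOptions;
}

// Serialize a subscribe request, ready to pass to WebSocket.send().
function subscribeMessage(type: StreamType, options: SubscribeOptions = {}): string {
  const message: SubscribeMessage = { action: "subscribe", type, data: options };
  return JSON.stringify(message);
}

// Reproduces the sample message shown above.
const exampleSubscribe = subscribeMessage("logs", {
  filter: { level: "ERROR", service: "pipeline-service" },
  interval: 5000,
  batchSize: 100,
});
```

The helper only builds the payload. Opening the socket is left out because the WebSocket endpoint URL is not given in this document.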
Server → Client Message
{
  "type": "data",
  "category": "logs",
  "data": { ... },
  "metadata": { ... },
  "timestamp": "2024-01-01T12:00:00Z",
  "sessionId": "session-123"
}
Message types: data, control, error, heartbeat
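On the client side, incoming frames can be validated against the four message types before they reach application code. The sketch below is one possible approach. `ServerMessage` and `parseServerMessage` are hypothetical names, and every field except `type` is marked optional because this document only shows the full shape for a `data` message.

```typescript
type ServerMessageType = "data" | "control" | "error" | "heartbeat";

// Fields taken from the sample message. Only `type` is assumed to be always present.
interface ServerMessage {
  type: ServerMessageType;
  category?: string;
  data?: unknown;
  metadata?: unknown;
  timestamp?: string;
  sessionId?: string;
}

const SERVER_MESSAGE_TYPES: readonly string[] = ["data", "control", "error", "heartbeat"];

// Parse a raw WebSocket frame. Returns null for malformed JSON,
// non-object payloads, or an unrecognized message type.
function parseServerMessage(raw: string): ServerMessage | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const type = (value as Record<string, unknown>).type;
  if (typeof type !== "string" || SERVER_MESSAGE_TYPES.indexOf(type) === -1) {
    return null;
  }
  return value as ServerMessage;
}
```

A handler would typically call this inside `socket.onmessage` and then `switch` on `message.type`.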
Data Sources
| Stream | Source | Example Filters |
|---|---|---|
| logs | OpenSearch | level: "ERROR", service: "api-gateway" |
| metrics | Prometheus | metric: "up", job: "spring-boot" |
| traces | Jaeger | service: "user-service", error: true |
| dashboard | Aggregated | Service health, error rates |
| all | All sources | High volume; use with caution |
Rate Limiting
| Parameter | Value |
|---|---|
| Message limit | 120 messages/min per session |
| Cooldown | 60 seconds when exceeded |
| Stale cleanup | 30 minutes |
| Heartbeat | Every 30 seconds |
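Because exceeding the message limit triggers a 60-second cooldown, a client may prefer to throttle itself before sending. The sketch below uses a sliding-window counter. It assumes the limit applies to messages the client sends and that a sliding window is a reasonable approximation, since the server's actual counting method is not described here. `OutboundRateLimiter` is a hypothetical name.

```typescript
// Client-side guard that allows at most `limit` sends per `windowMs`.
// Defaults mirror the documented 120 messages/min per session.
class OutboundRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private sentAt: number[] = []; // timestamps of sends inside the current window

  constructor(limit: number = 120, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the send if the budget allows it, false otherwise.
  tryAcquire(nowMs: number = Date.now()): boolean {
    // Drop timestamps that have aged out of the window.
    while (this.sentAt.length > 0 && nowMs - this.sentAt[0] >= this.windowMs) {
      this.sentAt.shift();
    }
    if (this.sentAt.length >= this.limit) return false;
    this.sentAt.push(nowMs);
    return true;
  }
}

// Usage: check the limiter before every socket.send().
const limiter = new OutboundRateLimiter();
const canSendNow = limiter.tryAcquire();
```

When `tryAcquire` returns `false`, the caller can queue the message or drop low-priority traffic such as `status` requests.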
Error Codes
| Code | Description |
|---|---|
| INVALID_ACTION | Unrecognized action |
| RATE_LIMITED | Message rate exceeded |
| SESSION_NOT_FOUND | Invalid session |
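One way a client might react to these codes is to map each to a recovery step. The policy below is an assumption for illustration, not documented behavior: it waits out the 60-second cooldown on `RATE_LIMITED`, reconnects on `SESSION_NOT_FOUND`, and drops the offending message otherwise. `Recovery` and `recoveryFor` are hypothetical names.

```typescript
type ErrorCode = "INVALID_ACTION" | "RATE_LIMITED" | "SESSION_NOT_FOUND";

type Recovery =
  | { kind: "drop" }
  | { kind: "retry"; delayMs: number }
  | { kind: "reconnect" };

// Map a server error code to a client-side recovery step (illustrative policy).
function recoveryFor(code: string): Recovery {
  switch (code) {
    case "RATE_LIMITED":
      // Wait for the documented 60-second cooldown before sending again.
      return { kind: "retry", delayMs: 60000 };
    case "SESSION_NOT_FOUND":
      // The session is gone, so open a new connection and resubscribe.
      return { kind: "reconnect" };
    case "INVALID_ACTION":
      // Retrying an unrecognized action would fail again, so discard it.
      return { kind: "drop" };
    default:
      // Unknown codes are treated conservatively as non-retryable.
      return { kind: "drop" };
  }
}
```

Accepting a plain `string` rather than `ErrorCode` lets the function handle codes added to the server later without a type error.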
Frontend Integration
| File | Purpose |
|---|---|
| services/observability/observability.service.ts | REST + WebSocket client |
| services/observability/useDashboardObservability.ts | Dashboard React Query hook |
| services/observability/usePipelineObservability.ts | Pipeline monitoring hook |
| types/observability.types.ts | TypeScript types |