Observability API

Unified observability platform integrating Prometheus (metrics), Jaeger (traces), and OpenSearch (logs) with real-time WebSocket streaming.

| Field | Value |
| --- | --- |
| Port | 50010 |
| Base Path | /api/v1 |
| OpenAPI | 3.1.0 |
| Spec | Observablity_API.json |

Traces (Jaeger)

| Method | Path | Description | Key Query Params |
| --- | --- | --- | --- |
| GET | /api/v1/traces/search | Search traces | serviceName, operationName, traceId, startTime, endTime, minDurationMicros, maxDurationMicros, limit (max 100), lookback (1h/24h/7d) |
| GET | /api/v1/traces/{traceId} | Get trace by ID | |
| GET | /api/v1/traces/services | List all services | |
| GET | /api/v1/traces/services/{serviceName}/operations | List operations | |
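
As a rough illustration of the trace search parameters above, the sketch below builds a request URL and caps `limit` at the documented maximum of 100. The helper name `buildTraceSearchUrl`, the `localhost` host in the usage note, and the string format of `startTime`/`endTime` are assumptions; only the path, port, and parameter names come from the table.

```typescript
// Query parameters listed for GET /api/v1/traces/search.
interface TraceSearchParams {
  serviceName?: string;
  operationName?: string;
  traceId?: string;
  startTime?: string; // format is an assumption; passed through unchanged
  endTime?: string;   // format is an assumption; passed through unchanged
  minDurationMicros?: number;
  maxDurationMicros?: number;
  limit?: number;     // documented maximum: 100
  lookback?: "1h" | "24h" | "7d";
}

const MAX_TRACE_LIMIT = 100;

// Fixed key order keeps the generated URL deterministic.
const TRACE_PARAM_KEYS: (keyof TraceSearchParams)[] = [
  "serviceName", "operationName", "traceId", "startTime", "endTime",
  "minDurationMicros", "maxDurationMicros", "limit", "lookback",
];

// Hypothetical helper: serializes the defined params into a request URL.
function buildTraceSearchUrl(baseUrl: string, params: TraceSearchParams): string {
  const parts: string[] = [];
  for (const key of TRACE_PARAM_KEYS) {
    let value = params[key];
    if (value === undefined) continue;
    // Clamp limit client-side so the request never exceeds the documented cap.
    if (key === "limit" && typeof value === "number") {
      value = Math.min(value, MAX_TRACE_LIMIT);
    }
    parts.push(`${key}=${encodeURIComponent(String(value))}`);
  }
  const path = `${baseUrl}/api/v1/traces/search`;
  return parts.length > 0 ? `${path}?${parts.join("&")}` : path;
}
```

For example, `buildTraceSearchUrl("http://localhost:50010", { serviceName: "user-service", limit: 500, lookback: "1h" })` produces a URL whose `limit` is 100 rather than 500.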

Metrics (Prometheus)

| Method | Path | Description | Key Query Params |
| --- | --- | --- | --- |
| GET | /api/v1/metrics/query | Instant PromQL query | query, time |
| GET | /api/v1/metrics/query-range | Range PromQL query | query, start, end, step |
| GET | /api/v1/metrics/labels | List all label names | |
| GET | /api/v1/metrics/label/{labelName}/values | Get label values | |
| GET | /api/v1/metrics/targets | List scrape targets | |
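
PromQL expressions often contain characters such as `[` and `]` that must be percent-encoded in a query string. The sketch below shows one way to build a `query-range` URL; `buildRangeQueryUrl` is a hypothetical helper, and the value formats for `start`, `end`, and `step` (Unix seconds and a duration string here) follow common Prometheus conventions rather than anything this page confirms.

```typescript
// Parameters listed for GET /api/v1/metrics/query-range.
interface RangeQuery {
  query: string; // PromQL expression
  start: string; // assumed: Unix timestamp in seconds
  end: string;   // assumed: Unix timestamp in seconds
  step: string;  // assumed: duration string such as "60s"
}

// Hypothetical helper: percent-encodes each value and joins them into a URL.
function buildRangeQueryUrl(baseUrl: string, q: RangeQuery): string {
  const pairs: [string, string][] = [
    ["query", q.query],
    ["start", q.start],
    ["end", q.end],
    ["step", q.step],
  ];
  const queryString = pairs
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${baseUrl}/api/v1/metrics/query-range?${queryString}`;
}
```

Encoding each value separately keeps an expression like `rate(http_requests_total[5m])` intact on the server side, since the brackets are sent as `%5B` and `%5D`.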

Dashboard

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/v1/dashboard/summary | Dashboard summary metrics |
| GET | /api/v1/dashboard/services | Service health overview |
| GET | /api/v1/dashboard/errors | Error rates and trends |

Pipeline Observability

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/v1/pipelines/{id}/logs | Pipeline application logs |
| GET | /api/v1/pipelines/{id}/flink-logs | Flink-specific logs |
| GET | /api/v1/pipelines/{id}/operators/record-counts | Operator record metrics |
| GET | /api/v1/pipelines/{id}/stats | Pipeline statistics |

Search

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/v1/search | Search across logs, traces, and metrics |

WebSocket Streaming

Real-time observability data via WebSocket.

Connection Flow

Client → Server Actions

| Action | Type | Description |
| --- | --- | --- |
| subscribe | logs, metrics, traces, dashboard, all | Subscribe to data stream |
| unsubscribe | logs, metrics, traces, dashboard, all | Stop stream |
| ping | | Health check (returns pong) |
| configure | | Adjust interval, batchSize, priority |
| status | | Get session status |

Subscribe Message

{
  "action": "subscribe",
  "type": "logs",
  "data": {
    "filter": { "level": "ERROR", "service": "pipeline-service" },
    "interval": 5000,
    "batchSize": 100
  }
}
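
A typed builder can help keep subscribe messages consistent with the shape shown above. This is a minimal sketch: `buildSubscribeMessage` is a hypothetical name, the defaults simply mirror the sample values, and treating `interval` as milliseconds is an assumption.

```typescript
// Stream types accepted by the subscribe and unsubscribe actions.
type StreamType = "logs" | "metrics" | "traces" | "dashboard" | "all";

// Mirrors the documented subscribe message shape.
interface SubscribeMessage {
  action: "subscribe";
  type: StreamType;
  data: {
    filter: Record<string, string | boolean>;
    interval: number;  // assumed to be milliseconds
    batchSize: number;
  };
}

// Hypothetical helper: returns the JSON text to send over the socket.
function buildSubscribeMessage(
  type: StreamType,
  filter: Record<string, string | boolean> = {},
  interval: number = 5000,
  batchSize: number = 100,
): string {
  const message: SubscribeMessage = {
    action: "subscribe",
    type,
    data: { filter, interval, batchSize },
  };
  return JSON.stringify(message);
}
```

The returned string can be passed to `WebSocket.send` once a connection to the service's WebSocket endpoint is open; the endpoint URL is not listed in this section.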

Server → Client Message

{
  "type": "data",
  "category": "logs",
  "data": { ... },
  "metadata": { ... },
  "timestamp": "2024-01-01T12:00:00Z",
  "sessionId": "session-123"
}

Message types: data, control, error, heartbeat
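
Incoming frames can be validated before they reach application code. The sketch below parses a raw frame and checks that `type` is one of the four documented message types. `parseServerMessage` is a hypothetical helper, and marking every field other than `type` as optional is an assumption, since this section does not say which fields appear on `control`, `error`, or `heartbeat` messages.

```typescript
type MessageType = "data" | "control" | "error" | "heartbeat";

const MESSAGE_TYPES: MessageType[] = ["data", "control", "error", "heartbeat"];

// Fields from the documented server message; optionality is assumed.
interface ServerMessage {
  type: MessageType;
  category?: string;
  data?: unknown;
  metadata?: unknown;
  timestamp?: string;
  sessionId?: string;
}

// Hypothetical helper: returns null for malformed JSON or unknown types.
function parseServerMessage(raw: string): ServerMessage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof parsed !== "object" || parsed === null) {
    return null; // JSON primitives are not messages
  }
  const candidate = parsed as { type?: unknown };
  if (
    typeof candidate.type !== "string" ||
    MESSAGE_TYPES.indexOf(candidate.type as MessageType) === -1
  ) {
    return null; // missing or unrecognized message type
  }
  return parsed as ServerMessage;
}
```

Returning `null` instead of throwing lets a message handler drop unexpected frames without tearing down the connection.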

Data Sources

| Stream | Source | Example Filters |
| --- | --- | --- |
| logs | OpenSearch | level: "ERROR", service: "api-gateway" |
| metrics | Prometheus | metric: "up", job: "spring-boot" |
| traces | Jaeger | service: "user-service", error: true |
| dashboard | Aggregated | Service health, error rates |
| all | All sources | High volume; use with caution |

Rate Limiting

| Parameter | Value |
| --- | --- |
| Message limit | 120 messages/min per session |
| Cooldown | 60 seconds when exceeded |
| Stale cleanup | 30 minutes |
| Heartbeat | Every 30 seconds |
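
A client can stay under the message limit by throttling its own sends. The sketch below is a client-side guard, not the server's algorithm: it assumes a sliding 60-second window, whereas the server may count differently. `ClientRateGuard` and its methods are hypothetical names; `onRateLimited` is meant to be called when the server reports `RATE_LIMITED`, so the client waits out the 60-second cooldown.

```typescript
// Client-side guard for the documented limits (120 messages/min, 60 s cooldown).
class ClientRateGuard {
  private sentAt: number[] = [];
  private cooldownUntil: number = 0;
  private limit: number;
  private windowMs: number;
  private cooldownMs: number;

  constructor(limit: number = 120, windowMs: number = 60000, cooldownMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.cooldownMs = cooldownMs;
  }

  // Returns true and records the send if a message may go out at time `now` (ms).
  tryAcquire(now: number): boolean {
    if (now < this.cooldownUntil) return false;
    // Drop timestamps that have left the sliding window.
    while (this.sentAt.length > 0) {
      const oldest = this.sentAt[0];
      if (oldest === undefined || now - oldest < this.windowMs) break;
      this.sentAt.shift();
    }
    if (this.sentAt.length >= this.limit) return false;
    this.sentAt.push(now);
    return true;
  }

  // Call when the server reports RATE_LIMITED; blocks sends for the cooldown.
  onRateLimited(now: number): void {
    this.cooldownUntil = now + this.cooldownMs;
  }
}
```

Passing the clock in as `now` (for example `Date.now()`) keeps the guard deterministic and easy to test.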

Error Codes

| Code | Description |
| --- | --- |
| INVALID_ACTION | Unrecognized action |
| RATE_LIMITED | Message rate exceeded |
| SESSION_NOT_FOUND | Invalid session |

Frontend Integration

| File | Purpose |
| --- | --- |
| services/observability/observability.service.ts | REST + WebSocket client |
| services/observability/useDashboardObservability.ts | Dashboard React Query hook |
| services/observability/usePipelineObservability.ts | Pipeline monitoring hook |
| types/observability.types.ts | TypeScript types |