
obs: Add Prometheus metrics endpoint for request counting and latency #82

@YaronZaki

Description

Problem Statement

No metrics instrumentation exists: there is no Prometheus /metrics endpoint, no prometheus_client dependency, and no OpenTelemetry integration. As a result, there is no way to monitor request rates, latencies, error rates, or business metrics (positions created, vault deposits).

Evidence

  • No metrics endpoint in quantara/web_app/api/
  • No prometheus_client in pyproject.toml
  • No middleware instrumenting request counts or latencies

Impact

Medium: monitoring blind spot. Without request rate data, capacity planning is guesswork. Production incidents are detected by users rather than by alerts. SLOs (e.g. 99.9% of requests completing in under 500 ms) cannot be set without latency histograms.

Proposed Solution

Add prometheus_client with FastAPI middleware that instruments http_requests_total (counter), http_request_duration_seconds (histogram), and http_requests_in_flight (gauge). Expose the /metrics endpoint behind authentication or restrict it to the internal network. Start with request metrics and add business metrics later.
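
A minimal sketch of what the middleware might look like, written as plain ASGI middleware so it does not depend on any particular FastAPI version. The class name `MetricsMiddleware`, the helper `render_metrics`, the dedicated registry, and the fallback status of 500 are assumptions for illustration, not code from the repository. The lookup of `scope["route"]` assumes the router records the matched route in the ASGI scope; if it does not, the raw path is used, which can be high-cardinality.

```python
import time

from prometheus_client import (
    CollectorRegistry,
    Counter,
    Gauge,
    Histogram,
    generate_latest,
)

# Assumption: a dedicated registry, to keep the sketch isolated from the default one.
REGISTRY = CollectorRegistry()

REQUESTS_TOTAL = Counter(
    "http_requests_total",
    "Total HTTP requests",
    ["method", "endpoint", "status"],
    registry=REGISTRY,
)
REQUEST_DURATION = Histogram(
    "http_request_duration_seconds",
    "HTTP request duration in seconds",
    ["method", "endpoint"],
    registry=REGISTRY,
)
IN_FLIGHT = Gauge(
    "http_requests_in_flight",
    "HTTP requests currently being processed",
    registry=REGISTRY,
)


class MetricsMiddleware:
    """Plain ASGI middleware (hypothetical name) that records request metrics."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # Assumed default if the app raises before sending a response.
        status = {"code": 500}

        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                status["code"] = message["status"]
            await send(message)

        method = scope["method"]
        start = time.perf_counter()
        IN_FLIGHT.inc()
        try:
            await self.app(scope, receive, send_wrapper)
        finally:
            IN_FLIGHT.dec()
            # Prefer the matched route template (if the router stored one)
            # over the raw path, to keep label cardinality bounded.
            route = scope.get("route")
            endpoint = getattr(route, "path", scope["path"])
            REQUESTS_TOTAL.labels(method, endpoint, str(status["code"])).inc()
            REQUEST_DURATION.labels(method, endpoint).observe(
                time.perf_counter() - start
            )


def render_metrics() -> bytes:
    """Text exposition payload that a /metrics route could return."""
    return generate_latest(REGISTRY)
```

With FastAPI, a class of this shape could be registered with `app.add_middleware(MetricsMiddleware)`, and a /metrics route could return `render_metrics()` with the content type from `prometheus_client.CONTENT_TYPE_LATEST`.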

Acceptance Criteria

  • Prometheus /metrics endpoint exposed (configurable auth)
  • Request counter instrumented: total requests by method, endpoint, status
  • Latency histogram instrumented: request duration buckets
  • Error rate tracked by status code
  • All existing tests pass
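
To make the error-rate and latency criteria concrete, the sketch below shows the arithmetic one might apply to the collected values. The function names and the sample numbers are hypothetical; the 0.5 s threshold and 99.9% target are taken from the SLO example in the Impact section. It assumes the threshold coincides with a histogram bucket bound, which holds for 0.5 in prometheus_client's default buckets.

```python
def error_ratio(requests_by_status: dict[str, float]) -> float:
    """Share of requests whose status code is 5xx."""
    total = sum(requests_by_status.values())
    if total == 0:
        return 0.0
    errors = sum(
        count
        for status, count in requests_by_status.items()
        if status.startswith("5")
    )
    return errors / total


def slo_met(
    cumulative_buckets: dict[float, float],
    total: float,
    threshold_s: float = 0.5,
    target: float = 0.999,
) -> bool:
    """True if at least `target` of requests finished within `threshold_s`.

    `cumulative_buckets` maps a bucket upper bound (the `le` label) to the
    number of observations at or below it, as Prometheus histograms count.
    """
    if total == 0:
        return True
    return cumulative_buckets[threshold_s] / total >= target


# Illustrative numbers only, not measurements.
print(error_ratio({"200": 990, "404": 5, "500": 5}))
print(slo_met({0.25: 9000, 0.5: 9995, 1.0: 10000}, total=10000))
print(slo_met({0.25: 900, 0.5: 990, 1.0: 1000}, total=1000))
```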

File Map

  • quantara/web_app/api/metrics.py (new): Prometheus metrics endpoint and middleware
  • quantara/web_app/api/main.py — add metrics route and middleware
  • quantara/pyproject.toml — add prometheus_client dependency

Dependencies

  • Related: REPO-040 (health check should be monitored via Prometheus)

Testing Strategy

  • Unit: Test metrics middleware increments counters correctly
  • Integration: Send requests, query /metrics, verify counters and histograms populated
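
For the integration check, one possible approach is to parse the /metrics response body with the parser that ships with prometheus_client and look up a specific sample. In the sketch below, the helper `sample_value` is hypothetical, and the response body is simulated with a local registry instead of a real HTTP call; the `/health` label value is only an example.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest
from prometheus_client.parser import text_string_to_metric_families


def sample_value(metrics_text: str, sample_name: str, labels: dict[str, str]):
    """Return the value of one sample in Prometheus text format, or None."""
    for family in text_string_to_metric_families(metrics_text):
        for sample in family.samples:
            if sample.name == sample_name and sample.labels == labels:
                return sample.value
    return None


# Stand-in for the body of a GET /metrics response in an integration test.
registry = CollectorRegistry()
counter = Counter(
    "http_requests_total",
    "Total HTTP requests",
    ["method", "endpoint", "status"],
    registry=registry,
)
counter.labels("GET", "/health", "200").inc()
body = generate_latest(registry).decode("utf-8")

expected_labels = {"method": "GET", "endpoint": "/health", "status": "200"}
assert sample_value(body, "http_requests_total", expected_labels) == 1.0
```

In a real test, `body` would come from the application's /metrics response after sending a known number of requests.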

Security Considerations

Metrics endpoint exposes request patterns. Protect with authentication or restrict to internal network. Don't expose business-sensitive metrics publicly. Avoid high-cardinality labels (wallet_id as label would explode cardinality).
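
Both concerns can be sketched with the standard library alone. The function names, the bearer-token scheme, and the example path are assumptions for illustration; the issue leaves the actual auth mechanism configurable. The first helper compares a token in constant time, and the second collapses ID-like path segments so that values such as wallet IDs never become label values.

```python
import hmac
import re

# Matches path segments that are purely numeric or 0x-prefixed hex.
_ID_SEGMENT = re.compile(r"/(?:\d+|0x[0-9a-fA-F]+)(?=/|$)")


def is_authorized(authorization_header, expected_token: str) -> bool:
    """Check an `Authorization: Bearer <token>` header value in constant time."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    return hmac.compare_digest(token.encode(), expected_token.encode())


def bounded_endpoint_label(path: str) -> str:
    """Replace ID-like path segments with a placeholder before labelling."""
    return _ID_SEGMENT.sub("/{id}", path)


# Hypothetical path, used only to show the effect.
print(bounded_endpoint_label("/wallets/0xAbC123/positions/42"))
```

Using the framework's matched route template as the label, where available, is usually more robust than a regular expression; the regex is a fallback for raw paths.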

Definition of Done

  • Code implemented and peer-reviewed
  • Tests written and passing
  • Documentation updated
  • PR linked and merged

Labels: observability
Priority: Medium
Difficulty: Intermediate
Estimated Effort: 3h
