Skip to content

obs: add OpenTelemetry distributed tracing #45

@snowrugar-beep

Description

@snowrugar-beep

Problem Statement

No distributed tracing exists. When a request flows through the backend (HTTP → Validation → GeoService → IPFS → Soroban → Database → Response), there is no way to trace the full request lifecycle across service boundaries.

Evidence

  • No tracing instrumentation in NestJS application
  • OpenTelemetry collector config referenced but missing (monitoring/otel-collector.yml — FILE_DOES_NOT_EXIST)

Impact

Cannot trace requests across service boundaries (HTTP, database, Soroban, IPFS). Hard to debug latency issues in production.

Proposed Solution

  1. Install @opentelemetry/instrumentation-express, @opentelemetry/instrumentation-http, @opentelemetry/instrumentation-typeorm
  2. Configure NestJS OpenTelemetry module
  3. Create correlation ID middleware for request tracing
  4. Add span attributes for key operations (geohash encoding, IPFS pin, Soroban post)
  5. Export traces to OpenTelemetry collector
  6. Create otel-collector.yml configuration

Technical Requirements

  • Must not significantly impact performance
  • Must propagate trace context across async boundaries
  • Must include database query spans
  • Must sample traces (not trace every request in production)

Acceptance Criteria

  1. HTTP requests create trace spans
  2. Database queries appear as child spans
  3. External service calls (IPFS, Soroban) appear as child spans
  4. Trace context propagates across async operations
  5. Traces reach OpenTelemetry collector
  6. Sampling rate configurable via environment variable
  7. All existing tests pass

File Inventory

  • Backend/src/main.ts
  • Backend/src/common/interceptors/tracing.interceptor.ts (new)
  • Backend/package.json
  • monitoring/otel-collector.yml

Dependencies

Issue #10 (monitoring config) — otel-collector.yml must be created.

Testing Strategy

  • Unit test: verify spans are created and propagated
  • Integration test: verify traces reach collector
  • Performance test: verify overhead is minimal

Security Considerations

Traces should not contain sensitive data (passwords, tokens, PII). Use span attribute filtering.

Definition of Done

  • OpenTelemetry instrumentation implemented
  • Traces reaching collector
  • No sensitive data in traces
  • Tests passing

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions