obs: Implement structured health check endpoints with dependency status #218

@Xhristin3

Description

Problem Statement

The health check (api/src/health/health.controller.ts) only verifies database connectivity. It does not distinguish liveness (is the process alive?) from readiness (can it serve traffic?), and it has no checks for WebSocket gateway health, memory usage, or upstream dependency status.

Evidence

// api/src/health/health.controller.ts — single endpoint
@Get()
async check(): Promise<HealthCheckResponseDto> {
  await this.healthCheckService.check([
    async () => this.databaseHealthIndicator.isHealthy("database"),
  ])
  return { status: "ok", timestamp: new Date().toISOString() }
}

Impact

Kubernetes cannot distinguish between a process that is alive but unable to serve traffic (e.g., DB connection lost temporarily) and a healthy process. Downtime detection is delayed.

Proposed Solution

  1. Add GET /health/live — liveness probe (always returns ok if process is running)
  2. Update GET /health — readiness probe (checks all dependencies: DB, memory threshold, event loop lag)
  3. Add NestJS memory usage check and event loop lag check
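
The split proposed above can be sketched without any framework code. This is a minimal illustration, not the project's implementation: the names `CheckResult`, `Indicator`, `ProbeResponse`, `liveness`, and `readiness` are hypothetical, and the response shape is an assumption modeled on the existing `{ status, timestamp }` payload.

```typescript
// Framework-free sketch of the liveness/readiness split.
// All type and function names here are hypothetical.

type CheckResult = { status: "up" | "down"; message?: string };
type Indicator = { name: string; check: () => Promise<CheckResult> };

interface ProbeResponse {
  httpStatus: 200 | 503;
  body: {
    status: "ok" | "error";
    timestamp: string;
    details?: Record<string, CheckResult>;
  };
}

// Liveness: no dependency checks. If this code runs, the process is alive.
function liveness(): ProbeResponse {
  return {
    httpStatus: 200,
    body: { status: "ok", timestamp: new Date().toISOString() },
  };
}

// Readiness: run every indicator; a "down" result or a thrown error yields 503.
async function readiness(indicators: Indicator[]): Promise<ProbeResponse> {
  const details: Record<string, CheckResult> = {};
  await Promise.all(
    indicators.map(async ({ name, check }) => {
      try {
        details[name] = await check();
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        details[name] = { status: "down", message };
      }
    }),
  );
  const allUp = Object.keys(details).every((k) => details[k].status === "up");
  return {
    httpStatus: allUp ? 200 : 503,
    body: {
      status: allUp ? "ok" : "error",
      timestamp: new Date().toISOString(),
      details,
    },
  };
}
```

In the controller, `liveness()` would back GET /health/live and `readiness([...])` would back GET /health, with the database, memory, and event loop checks passed in as indicators.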

Acceptance Criteria

  • GET /health/live returns { status: "ok" } even if DB is down
  • GET /health returns 503 if DB is unreachable
  • Readiness includes memory usage threshold check
  • Kubernetes manifests reference correct probes
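
For the memory threshold criterion, and the event loop lag check from the proposed solution, one possible shape is shown below. It is a sketch under assumptions: the thresholds, the `memory_heap` and `event_loop` keys, and the function names are hypothetical, and it uses only Node's built-in `process.memoryUsage()` and `perf_hooks.monitorEventLoopDelay()`. Note that @nestjs/terminus also ships a `MemoryHealthIndicator` (`checkHeap`, `checkRSS`), which may be worth evaluating before writing a custom indicator.

```typescript
import { monitorEventLoopDelay } from "node:perf_hooks";

type CheckResult = { status: "up" | "down"; message?: string };

// Hypothetical thresholds; tune them to the container's memory limit.
const HEAP_LIMIT_BYTES = 512 * 1024 * 1024;
const LAG_LIMIT_MS = 100;

// Pure comparisons, so the threshold logic is easy to unit test.
function checkHeap(
  heapUsedBytes: number,
  limitBytes: number = HEAP_LIMIT_BYTES,
): CheckResult {
  return heapUsedBytes <= limitBytes
    ? { status: "up" }
    : { status: "down", message: `heap ${heapUsedBytes} B exceeds ${limitBytes} B` };
}

function checkEventLoopLag(
  meanLagMs: number,
  limitMs: number = LAG_LIMIT_MS,
): CheckResult {
  return meanLagMs <= limitMs
    ? { status: "up" }
    : { status: "down", message: `mean lag ${meanLagMs} ms exceeds ${limitMs} ms` };
}

// The histogram samples event loop delay; its values are in nanoseconds.
const loopDelay = monitorEventLoopDelay({ resolution: 20 });
loopDelay.enable();

// Take a snapshot of both checks using live process measurements.
function currentChecks(): Record<string, CheckResult> {
  const meanLagMs = loopDelay.mean / 1e6;
  return {
    memory_heap: checkHeap(process.memoryUsage().heapUsed),
    // Guard against a non-finite mean before any samples are recorded.
    event_loop: checkEventLoopLag(Number.isFinite(meanLagMs) ? meanLagMs : 0),
  };
}
```

Keeping the comparisons separate from the measurements means the acceptance criterion can be tested with fixed numbers, without having to exhaust memory or block the event loop in a test.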

File Map

  • api/src/health/health.controller.ts — add /live and enhanced /health
  • api/src/health/memory.health-indicator.ts — new
  • api/src/health/health.module.ts — update providers

Labels: observability, infrastructure
Priority: Medium | Difficulty: Beginner | Estimated Effort: 1d


Backlog ID: REPO-035
