feat(observability): add Prometheus metrics and Grafana dashboards #27
Exposes a /metrics endpoint from the Go backend using promhttp, instruments all HTTP routes with request count and latency histograms, and exports DB connection pool stats via a custom prometheus.Collector. Prometheus and Grafana are wired into Docker Compose (ports 9090 and 3001) with a provisioned data source and starter dashboard covering request rate, p50/p95 latency, and DB pool connections. Closes #17.
📝 Walkthrough
Adds end-to-end Prometheus observability to the Go backend: a Gin middleware instruments HTTP request counts and latency, a new
Changes
Prometheus Metrics and Grafana Observability
Sequence Diagram(s)
sequenceDiagram
participant Client as HTTP Client
participant GinMiddleware as PrometheusMiddleware
participant GinHandler as Route Handler
participant PromRegistry as Prometheus Registry
participant Prometheus as Prometheus Scraper
participant Grafana as Grafana
Client->>GinMiddleware: GET /api/...
GinMiddleware->>GinHandler: c.Next()
GinHandler-->>GinMiddleware: response status
GinMiddleware->>PromRegistry: Inc(httpRequestsTotal{method,path,status})
GinMiddleware->>PromRegistry: Observe(httpRequestDurationSeconds{method,path})
Prometheus->>GinMiddleware: GET /metrics (skipped by middleware)
GinMiddleware->>PromRegistry: promhttp.Handler() serves exposition
PromRegistry-->>Prometheus: text/plain metrics
Prometheus->>Grafana: PromQL query results
Grafana-->>Grafana: render Backend Overview panels
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 3
🧹 Nitpick comments (1)
backend/.env.example (1)
21-25: 💤 Low value
Optional: Address static analysis hints.
The dotenv-linter flags alphabetical ordering and a missing trailing blank line. The current logical grouping (by feature/purpose) is more maintainable than strict alphabetical order, but you may choose to add a blank line at the end for consistency with linter expectations.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/.env.example` around lines 21 - 25, The dotenv-linter has flagged a missing trailing blank line in the backend/.env.example file. Add a blank line at the end of the file after the GRAFANA_ADMIN_PASSWORD=admin line to satisfy linter expectations and maintain consistency with file formatting conventions. The current logical grouping by feature is acceptable and does not need to be changed to strict alphabetical order.
Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/internal/server/server.go`:
- Around line 40-45: In the prometheus.Register error handling block for the
dbCollector in the server initialization function, the
non-AlreadyRegisteredError case is only logging a warning instead of failing
startup. Remove the warning log for non-AlreadyRegisteredError cases and instead
return the error from the function (such as NewServer) so that real registration
failures cause startup to fail fast. Keep the AlreadyRegisteredError check to
allow idempotent registration, but propagate any other error type up the call
stack rather than silently continuing execution.
In `@backend/internal/transport/handlers/metrics_handler_test.go`:
- Around line 11-13: The test in metrics_handler_test.go is constructing a
NewHandler with all nil parameters, which bypasses testing the handler's
dependency contracts. Replace the three nil arguments in the NewHandler(nil,
nil, nil) call with appropriate mock interfaces or stubs that satisfy the
handler's use-case dependencies. This ensures the unit test properly exercises
the handler with mocked dependencies as per the coding guideline that handler
unit tests should mock Use case interfaces rather than relying on nil values.
In `@backend/internal/transport/handlers/routes.go`:
- Line 45: The `/metrics` endpoint registration at line 45 in the routes file
currently exposes the metrics handler without any access control in all
environments. Add environment-based conditional logic around the
r.GET("/metrics", gin.WrapH(promhttp.Handler())) route registration to either
skip it entirely in production or wrap it with authentication and/or
trusted-network middleware that restricts access to internal callers only. This
prevents information disclosure through metrics exposure in production
deployments.
---
Nitpick comments:
In `@backend/.env.example`:
- Around line 21-25: The dotenv-linter has flagged a missing trailing blank line
in the backend/.env.example file. Add a blank line at the end of the file after
the GRAFANA_ADMIN_PASSWORD=admin line to satisfy linter expectations and
maintain consistency with file formatting conventions. The current logical
grouping by feature is acceptable and does not need to be changed to strict
alphabetical order.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d6638f93-bed9-46f3-9a84-d3ce63b07826
⛔ Files ignored due to path filters (1)
backend/go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
backend/.env.example
backend/docker-compose.yml
backend/docs/environment.md
backend/docs/observability.md
backend/docs/swagger/docs.go
backend/docs/swagger/swagger.json
backend/docs/swagger/swagger.yaml
backend/go.mod
backend/grafana/provisioning/dashboards/backend.json
backend/grafana/provisioning/dashboards/provider.yml
backend/grafana/provisioning/datasources/prometheus.yml
backend/internal/infrastructure/database/postgres/db_metrics.go
backend/internal/server/server.go
backend/internal/transport/handlers/metrics_handler.go
backend/internal/transport/handlers/metrics_handler_test.go
backend/internal/transport/handlers/routes.go
backend/internal/transport/middleware/metrics.go
backend/internal/transport/middleware/metrics_test.go
backend/prometheus.yml
- NewServer now returns error so collector registration failures bubble
up to main rather than being silently swallowed
- /metrics is restricted to loopback/RFC-1918 IPs in staging/production
via a new LocalNetworkOnly middleware; unrestricted in debug mode
- metrics handler test uses &Handler{} zero value instead of nil deps,
consistent with the existing hello_handler_test.go pattern
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- `/metrics` endpoint via `promhttp.Handler()` (unauthenticated, excluded from rate limiting instrumentation)
- `PrometheusMiddleware` in `transport/middleware` recording `http_requests_total` and `http_request_duration_seconds` for every route except `/metrics` itself
- `dbStatsCollector` in `infrastructure/database/postgres` that exports nine `sql.DBStats` fields to Prometheus on each scrape
- `prometheus` (port 9090) and `grafana` (port 3001) into `docker-compose.yml` alongside the existing Postgres service
- `GRAFANA_ADMIN_USER` / `GRAFANA_ADMIN_PASSWORD` documented in `.env.example` with `admin` defaults
- `observability.md` (wrong `RegisterRoutes` signature) and `environment.md` (missing `REDIS_URL` and `BLUEPRINT_WS_ALLOWED_ORIGIN`)

Closes #17
Test plan
- `go vet ./...`: clean
- `go test ./internal/transport/...`: passes (3 new middleware tests + metrics endpoint test)
- `docker compose up` starts Postgres, Prometheus, and Grafana without errors
- `GET http://localhost:8080/metrics` returns 200 with Prometheus exposition format
- `http://localhost:3001` shows the backend dashboard with live data after a few requests

Summary by CodeRabbit
New Features
- `/metrics` endpoint for Prometheus data collection.

Documentation