Add optional Prometheus metrics endpoint for runner scale set statistics #62

Merged
ispasov merged 4 commits into macstadium:main from ccang-riot:feature/expose-prometheus-metrics
Apr 9, 2026

Conversation

@ccang-riot
Contributor

This PR adds Prometheus metrics for runner scale set statistics and exposes them via /metrics when enabled.
Included metrics:

  • total available jobs
  • total acquired jobs
  • total assigned jobs
  • total running jobs
  • total registered runners
  • total busy runners
  • total idle runners

Implementation details:

  • metrics are opt-in via ENABLE_METRICS
  • metrics endpoint address is configurable via METRICS_ADDR
  • metrics polling interval is configurable via METRICS_POLL_INTERVAL
  • metrics use a dedicated Prometheus registry instead of the global registry
  • metrics include a runner_name label to identify the runner scale set

@ccang-riot ccang-riot requested a review from a team as a code owner April 6, 2026 23:14
Comment thread main.go Outdated

var runnerScaleSetIDs = []int{}

type Metrics struct {
Collaborator

I would suggest moving the metrics-related code to a separate file (and type). This would keep the main file relatively small, and the metrics logic can be concentrated in one place.

In addition, the metrics server startup and the metrics polling can be done in the same func (for example, called Start).

Contributor Author

Thanks for the suggestion! I’ve refactored the metrics implementation accordingly.

  • Moved all metrics-related logic into a dedicated metrics package (pkg/metrics/metrics.go)
  • Introduced a Start function that encapsulates both:
      • the metrics HTTP server
      • the metrics polling loop
  • Simplified main.go so it only handles orchestration and conditionally enables metrics via ENABLE_METRICS

This keeps main.go smaller and centralizes all metrics behavior in one place for better separation of concerns.

Comment thread pkg/env/constants.go
LogLevelEnvName = "LOG_LEVEL"

// Prometheus metrics
EnableMetricsEnvName = "ENABLE_METRICS"
Collaborator

Two things I missed the first time:

  1. Let's update the readme with these new env vars explaining what they are
  2. Let's add them to the example .env file

Contributor Author

Updated! 🎉

@ispasov
Collaborator

ispasov commented Apr 8, 2026

Tested the change and it works perfectly.
Let's update the readme and the .env example file and we are good to go.

Collaborator

@ispasov ispasov left a comment

Thank you for your contribution.
This looks great!

@ispasov ispasov merged commit 16a9f78 into macstadium:main Apr 9, 2026
2 checks passed