fix: improve Prometheus metrics naming conventions per best practices #2
Merged
Repository owner deleted a comment from the3mi on Mar 6, 2026
juliusv reviewed on Mar 6, 2026
ch <- prometheus.MustNewConstMetric(agentSessionsDesc, prometheus.GaugeValue, float64(sessions), name)
ch <- prometheus.MustNewConstMetric(agentStateDesc, prometheus.GaugeValue, StateMap[state], name)
ch <- prometheus.MustNewConstMetric(agentLastActivityDesc, prometheus.GaugeValue, secondsAgo, name)
ch <- prometheus.MustNewConstMetric(agentLastActivityTimestampDesc, prometheus.GaugeValue, float64(time.Now().Unix())-secondsAgo, name)
In getAgentState you first do secondsAgo := time.Since(latest.modTime).Seconds(), and then you convert it back again. Maybe even cleaner to just return the timestamp to begin with?
Owner
Good point! Fixed in v0.4.1 — now returns the timestamp directly instead of the roundtrip conversion. https://github.com/SammyLin/openclaw-exporter/releases/tag/v0.4.1
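For readers following the thread, here is a minimal sketch of what "return the timestamp to begin with" could look like. It is not the code from v0.4.1: `latestActivityTimestamp` and `sampleModTimes` are hypothetical names, and the sketch assumes the last activity is derived from file modification times, as the `latest.modTime` expression in the review comment suggests.

```go
package main

import (
	"fmt"
	"time"
)

// latestActivityTimestamp is a hypothetical helper (not taken from the repository).
// It returns the most recent modification time as Unix seconds, so the caller
// never has to compute an age and convert it back. It returns 0 when there
// are no entries.
func latestActivityTimestamp(modTimes []time.Time) float64 {
	var latest time.Time
	for _, t := range modTimes {
		if t.After(latest) {
			latest = t
		}
	}
	if latest.IsZero() {
		return 0
	}
	return float64(latest.Unix())
}

// sampleModTimes returns fixed, purely illustrative modification times.
func sampleModTimes() []time.Time {
	return []time.Time{
		time.Unix(1700000000, 0),
		time.Unix(1700000600, 0),
	}
}

func main() {
	// The value can be passed to the gauge as-is, with no time.Now() arithmetic.
	fmt.Printf("%.0f\n", latestActivityTimestamp(sampleModTimes())) // prints 1700000600
}
```

With a helper of this shape, the exported value no longer depends on when the scrape happens, and the age is left to PromQL.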
Repository owner deleted a comment from the3mi on Mar 9, 2026
Summary
This PR improves Prometheus metric naming across `cron.go`, `agent.go`, and `workspace.go` to align with Prometheus naming best practices.

Changes Made

A) `collector/cron.go`: Cron Job Metrics

❌ Removed: `openclaw_cron_job_last_run_age_seconds`
Reason: Prometheus best practices recommend exposing Unix timestamps (which are stable and can be used to compute age at query time) rather than pre-computed age values that become stale between scrapes. The existing `openclaw_cron_job_last_run_at_seconds` already provides the Unix timestamp, making the age metric redundant.
Migration: Use `time() - openclaw_cron_job_last_run_at_seconds` in PromQL.

❌ Removed: `openclaw_cron_job_next_run_in_seconds`
Reason: Same rationale: pre-computed countdown values are redundant when `openclaw_cron_job_next_run_at_seconds` (Unix timestamp) is already available.
Migration: Use `openclaw_cron_job_next_run_at_seconds - time()` in PromQL.

🔄 Renamed: `openclaw_cron_job_last_duration_ms` → `openclaw_cron_job_last_duration_seconds`
Reason: Prometheus convention is to use base units (seconds for time, bytes for data). Millisecond suffixes are non-standard and require consumers to mentally convert. The value is now divided by 1000 to convert from ms to seconds.
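As a rough illustration of the two ideas in this section (base units, and age computed by the consumer rather than the exporter), here is a small standalone sketch. The function names `msToSeconds` and `ageSeconds` are hypothetical and are not the identifiers used in `collector/cron.go`.

```go
package main

import "fmt"

// msToSeconds converts a duration reported in milliseconds to seconds,
// the base unit Prometheus expects for time values.
func msToSeconds(ms float64) float64 {
	return ms / 1000
}

// ageSeconds mirrors what `time() - openclaw_cron_job_last_run_at_seconds`
// computes in PromQL: the consumer derives the age from a stable timestamp.
func ageSeconds(nowUnix, lastRunAtUnix float64) float64 {
	return nowUnix - lastRunAtUnix
}

func main() {
	fmt.Println(msToSeconds(1500))                  // prints 1.5
	fmt.Println(ageSeconds(1700000060, 1700000000)) // prints 60
}
```

The exporter only needs the first conversion; the second is shown to make clear why the removed age and countdown metrics lose no information.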
B) `collector/agent.go`: Agent Metrics

🔄 Renamed: `openclaw_agent_last_activity_seconds` → `openclaw_agent_last_activity_timestamp_seconds`
Reason: The metric was previously an age (seconds since last activity), which is an unstable value that drifts between scrapes. It now exports a Unix timestamp instead (via `time.Now().Unix() - secondsAgo`). The `_timestamp_seconds` suffix is a commonly used Prometheus convention for Unix timestamps (compare `process_start_time_seconds`, which likewise exports a Unix timestamp in seconds).
Migration: Use `time() - openclaw_agent_last_activity_timestamp_seconds` in PromQL to get seconds since last activity.

C) `collector/workspace.go`: Workspace Metrics

🔄 Renamed: `openclaw_md_workspace_total_bytes` → `openclaw_md_workspace_bytes`
Reason: The `total_` segment is redundant when the metric already represents an aggregate value per `workspace` label. Prometheus naming guidelines discourage redundant words in metric names.

🔄 Renamed: `openclaw_md_workspace_total_tokens_estimated` → `openclaw_md_workspace_tokens_estimated`
Reason: Same rationale as above: the `total_` segment was dropped.

Files Updated

- `collector/cron.go`: Removed 2 redundant metrics, renamed duration metric, removed unused `time` import
- `collector/agent.go`: Renamed metric, changed value to Unix timestamp
- `collector/workspace.go`: Renamed 2 workspace aggregate metrics
- `README.md`: Updated metric table
- `README.zh-TW.md`: Updated metric table (Traditional Chinese)
- `deploy/grafana/dashboards/openclaw-complete.json`: Updated agent activity query to `time() - openclaw_agent_last_activity_timestamp_seconds`
- `deploy/grafana/dashboards/token-usage.json`: Updated workspace tokens query to new metric name

| Old metric | Replacement |
| --- | --- |
| `openclaw_cron_job_last_run_age_seconds` | `time() - openclaw_cron_job_last_run_at_seconds` |
| `openclaw_cron_job_next_run_in_seconds` | `openclaw_cron_job_next_run_at_seconds - time()` |
| `openclaw_cron_job_last_duration_ms` | `openclaw_cron_job_last_duration_seconds` (value ÷ 1000) |
| `openclaw_agent_last_activity_seconds` | `time() - openclaw_agent_last_activity_timestamp_seconds` |
| `openclaw_md_workspace_total_bytes` | `openclaw_md_workspace_bytes` |
| `openclaw_md_workspace_total_tokens_estimated` | `openclaw_md_workspace_tokens_estimated` |

PromQL Migration Examples