Prometheus Pull Consumer: emit dimensional labels for nested stats (natSourceTranslationStats) #288
Description
Is your feature request related to a problem? Please describe.
The Prometheus Pull Consumer flattens nested iControl REST stat paths into metric names, making per-object stats (like CGNAT natSourceTranslationStats) unusable in Grafana/Prometheus.
For example, with 20 CGNAT source-translation pools and a custom endpoint poller for /mgmt/tm/security/nat/source-translation/stats, the Pull Consumer produces 6,600+ metric lines where the pool name is concatenated into the metric name:
f5_natSourceTranslationStats__Common_POOL_DMA_FL_Miami_stats_pba_activePortBlocks 0
f5_natSourceTranslationStats__Common_POOL_DMA_FL_Miami_stats_pba_percentFreePortBlocks 100
f5_natSourceTranslationStats__Common_POOL_DMA_TX_Houston_stats_pba_activePortBlocks 5
This means:
- Every pool × every stat = a unique metric name (not a label)
- You cannot write PromQL queries like f5_cgnat_pba_activePortBlocks{pool="Miami"} because the pool is not a label
- Grafana dashboards cannot filter, aggregate, or alert across pools
- The volume of metric names causes significant cardinality overhead
This is the same class of problem described in #257 (virtualServers and clientSslProfiles labels), but for CGNAT/NAT source-translation stats specifically.
Describe the solution you'd like
The Prometheus Pull Consumer should support emitting Prometheus labels for known entity-level keys in the stat tree, instead of flattening them into the metric name.
Proposed output:
f5_cgnat_pba_activePortBlocks{pool="/Common/POOL_DMA_FL_Miami"} 0
f5_cgnat_pba_percentFreePortBlocks{pool="/Common/POOL_DMA_FL_Miami"} 100
f5_cgnat_lsn_activeTranslations{pool="/Common/POOL_DMA_FL_Miami"} 0
Implementation suggestion — opt-in via declaration:
Add a consumer-level option (e.g., useLabels, labelMode, or prometheusLabels as in the example below) so existing users are not affected:
{
    "My_Pull_Consumer": {
        "class": "Telemetry_Pull_Consumer",
        "type": "Prometheus",
        "systemPoller": ["My_Endpoint_Poller"],
        "prometheusLabels": true
    }
}
When enabled, the Prometheus output formatter would:
- Detect entity-level keys in the stat tree (pool names, virtual server names, pool members, etc.)
- Extract them as Prometheus label key-value pairs
- Use a shorter, cleaner metric name for the stat itself
This could be implemented in src/lib/consumers/Prometheus/index.js (or equivalent current path) by modifying the recursive stat tree walker to recognize entity boundaries and emit labels instead of concatenating.
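To make the intended behavior concrete, here is a minimal Python sketch of such a walker. It is an illustration only, not the TS implementation (which is JavaScript). The tree shape, the function names, and the rule "a key starting with / is an entity boundary" are all assumptions made for the example.

```python
import re


def sanitize(part):
    """Replace characters that are not valid in a Prometheus metric name."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", part)


def escape_label_value(value):
    """Escape backslash, double quote, and newline for the text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def collect(node, name_parts, labels, samples, label_key="pool"):
    """Walk a nested stat tree and collect (name, labels, value) samples.

    Assumption: a key that starts with "/" is an object path and marks an
    entity boundary, so it becomes a label instead of part of the metric name.
    """
    for key, value in node.items():
        if isinstance(value, dict):
            if key.startswith("/"):
                collect(value, name_parts, {**labels, label_key: key}, samples, label_key)
            else:
                collect(value, name_parts + [sanitize(key)], labels, samples, label_key)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            samples.append(("_".join(name_parts + [sanitize(key)]), labels, value))
    return samples


def render(samples):
    """Render samples as exposition lines, grouped by metric name."""
    lines = []
    for name, labels, value in sorted(samples, key=lambda s: s[0]):
        pairs = ",".join(f'{k}="{escape_label_value(v)}"' for k, v in labels.items())
        lines.append(f"{name}{{{pairs}}} {value}" if pairs else f"{name} {value}")
    return lines


# Hypothetical stat tree; the real poller output may be shaped differently.
tree = {
    "/Common/POOL_DMA_FL_Miami": {"pba": {"activePortBlocks": 0, "percentFreePortBlocks": 100}},
    "/Common/POOL_DMA_TX_Houston": {"pba": {"activePortBlocks": 5}},
}
lines = render(collect(tree, ["f5_cgnat"], {}, []))
print("\n".join(lines))
```

With this input, the number of metric names stays fixed at one per stat, and each pool adds a label value instead of a new name. A real implementation would probably need an explicit list of known entity types rather than the leading-slash heuristic used here.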
Describe alternatives you've considered
We have tested and deployed two workarounds:
1. Prometheus metric_relabel_configs (no middleware)
metric_relabel_configs:
  - source_labels: [__name__]
    regex: 'f5_natSourceTranslationStats__Common_(.+?)_stats_pba_(.+)'
    target_label: pool
    replacement: '${1}'
  - source_labels: [__name__]
    regex: 'f5_natSourceTranslationStats__Common_(.+?)_stats_pba_(.+)'
    target_label: __name__
    replacement: 'f5_cgnat_pba_${2}'
Limitations: only handles the Common partition and only the pba stats, relies on a fragile regex, and Prometheus still scrapes and parses all 6,600+ raw metric lines before relabeling.
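The effect of these two rules, and where they stop working, can be approximated in a few lines of Python. This is a rough simulation for illustration, not Prometheus code: the relabel function is hypothetical, re.fullmatch stands in for the fact that Prometheus anchors relabel regexes at both ends, and the Tenant1 partition name is invented.

```python
import re

# Same pattern as in the relabel rules above.
RULE = re.compile(r"f5_natSourceTranslationStats__Common_(.+?)_stats_pba_(.+)")


def relabel(metric_name):
    """Approximate both rules at once: return (new_metric_name, extra_labels)."""
    match = RULE.fullmatch(metric_name)
    if match is None:
        # With the default replace action, a non-matching series is left unchanged.
        return metric_name, {}
    return f"f5_cgnat_pba_{match.group(2)}", {"pool": match.group(1)}


ok = relabel("f5_natSourceTranslationStats__Common_POOL_DMA_FL_Miami_stats_pba_activePortBlocks")
# Hypothetical pool in a partition other than Common: the rule does not match.
missed = relabel("f5_natSourceTranslationStats__Tenant1_POOL_A_stats_pba_activePortBlocks")

print(ok)
print(missed)
```

Note that the extracted pool label is POOL_DMA_FL_Miami, not the full path /Common/POOL_DMA_FL_Miami from the proposed output, because the flattened metric name no longer carries the path separators.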
2. Custom Python exporter (bypass TS for CGNAT stats)
A standalone script that queries /mgmt/tm/security/nat/source-translation/stats directly via iControl REST and emits clean labeled metrics. Works well but requires running separate middleware.
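The formatting half of such an exporter might look like the sketch below. It is not the deployed script. It assumes the stats response follows the usual iControl REST layout (entries, then nestedStats, then entries, with numeric stats under value and the object path under tmName.description); the stat key names such as pba.activePortBlocks are guesses for illustration. Fetching the JSON and serving the result over HTTP are left out.

```python
import re


def to_metrics(payload, prefix="f5_cgnat", label_key="pool"):
    """Turn a parsed stats response into labeled exposition lines.

    Assumed layout: payload["entries"][url]["nestedStats"]["entries"][stat].
    """
    lines = []
    for entry in payload.get("entries", {}).values():
        stats = entry.get("nestedStats", {}).get("entries", {})
        # Assumption: the full object path is reported as tmName.description.
        pool = stats.get("tmName", {}).get("description")
        if pool is None:
            continue
        for stat, body in stats.items():
            value = body.get("value")
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                continue  # skip descriptions and other non-numeric fields
            name = f"{prefix}_{re.sub(r'[^a-zA-Z0-9_]', '_', stat)}"
            # Label values are not escaped here; pool paths are assumed to
            # contain no quotes, backslashes, or newlines.
            lines.append(f'{name}{{{label_key}="{pool}"}} {value}')
    return sorted(lines)  # sorting keeps samples of one metric name together


# Hypothetical response fragment for a single pool.
payload = {
    "entries": {
        "https://localhost/mgmt/tm/security/nat/source-translation/~Common~POOL_DMA_FL_Miami/stats": {
            "nestedStats": {
                "entries": {
                    "tmName": {"description": "/Common/POOL_DMA_FL_Miami"},
                    "pba.activePortBlocks": {"value": 0},
                    "pba.percentFreePortBlocks": {"value": 100},
                }
            }
        }
    }
}
metrics = to_metrics(payload)
print("\n".join(metrics))
```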
Both are workarounds. The proper fix is in the TS Prometheus consumer itself.
Additional context
- BIG-IP version tested: 17.5.1.3
- TS version tested: 1.41.0
- Affected endpoints: Any endpoint returning nested per-object stats, including:
  - /mgmt/tm/security/nat/source-translation/stats (CGNAT pools, 45 stats per pool)
  - /mgmt/tm/security/nat/policy/stats (CGNAT policies)
  - /mgmt/tm/ltm/virtual/stats (virtual servers; related to #257, "Add virtualServers and clientSslProfiles labels to certain Telemetry Streaming metrics")
  - /mgmt/tm/ltm/pool/stats and /mgmt/tm/ltm/pool/members/stats
- Scale: A single CGNAT BIG-IP with 20 pools generates 6,600+ unique metric names. Service providers may have 50-100+ pools per box.
- Customer impact: Service provider CGNAT deployments using Prometheus/Grafana for monitoring cannot effectively use TS for pool-level PBA stats without workarounds.
Related: #257