feat(telemetry): BusinessMetrics — transparently tenant-scoped instruments#669
Conversation
…ments The standard factory for product/business metrics. Counter/FloatCounter/ Histogram/Gauge/FloatGauge wrappers merge TenantAttributes(ctx) into every measurement, so call sites cannot forget tenant attribution — the existing opt-in WithTenantAttributes pattern proved forgettable (23 fintech counters shipped tenant-less). Claim-less system paths record without tenant attributes; explicit per-call attributes are preserved. Instrument-creation errors fall back to noop instruments (metrics never break the service), matching LatencyMeasure behaviour.
There was a problem hiding this comment.
Code Review
This pull request introduces a new BusinessMetrics factory in telemetry/business.go along with corresponding tests in telemetry/business_test.go. The factory transparently attaches tenant and partition attributes from the context to various metric types (Counter, FloatCounter, Histogram, Gauge, FloatGauge). The review feedback suggests optimizing the merged function to avoid unnecessary slice allocations when either tenant or explicit attributes are empty. Additionally, it recommends expanding the test suite to fully cover and verify the newly introduced Gauge and FloatGauge metrics.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| func merged(ctx context.Context, attrs []attribute.KeyValue) metric.MeasurementOption { | ||
| tenant := TenantAttributes(ctx) | ||
| if len(attrs) == 0 && len(tenant) == 0 { | ||
| return metric.WithAttributes() | ||
| } | ||
| all := make([]attribute.KeyValue, 0, len(tenant)+len(attrs)) | ||
| all = append(all, tenant...) | ||
| all = append(all, attrs...) | ||
| return metric.WithAttributes(all...) | ||
| } |
There was a problem hiding this comment.
On high-throughput hot paths, minimizing slice allocations is crucial for performance. The current implementation of merged always allocates a new slice all when either tenant or attrs is non-empty, even if one of them is completely empty.
We can optimize this by returning metric.WithAttributes directly with the non-empty slice when the other is empty, avoiding any new slice allocation in those very common scenarios.
func merged(ctx context.Context, attrs []attribute.KeyValue) metric.MeasurementOption {
tenant := TenantAttributes(ctx)
if len(tenant) == 0 {
if len(attrs) == 0 {
return metric.WithAttributes()
}
return metric.WithAttributes(attrs...)
}
if len(attrs) == 0 {
return metric.WithAttributes(tenant...)
}
all := make([]attribute.KeyValue, 0, len(tenant)+len(attrs))
all = append(all, tenant...)
all = append(all, attrs...)
return metric.WithAttributes(all...)
}| func dataPointAttrs(data metricdata.Aggregation) []attribute.Set { | ||
| switch d := data.(type) { | ||
| case metricdata.Sum[int64]: | ||
| out := make([]attribute.Set, 0, len(d.DataPoints)) | ||
| for _, dp := range d.DataPoints { | ||
| out = append(out, dp.Attributes) | ||
| } | ||
| return out | ||
| case metricdata.Sum[float64]: | ||
| out := make([]attribute.Set, 0, len(d.DataPoints)) | ||
| for _, dp := range d.DataPoints { | ||
| out = append(out, dp.Attributes) | ||
| } | ||
| return out | ||
| case metricdata.Histogram[float64]: | ||
| out := make([]attribute.Set, 0, len(d.DataPoints)) | ||
| for _, dp := range d.DataPoints { | ||
| out = append(out, dp.Attributes) | ||
| } | ||
| return out | ||
| default: | ||
| return nil | ||
| } | ||
| } |
There was a problem hiding this comment.
The helper function dataPointAttrs currently only handles Sum and Histogram aggregations. It does not support Gauge aggregations, which means any future tests for Gauge or FloatGauge would return nil attributes and fail to verify tenant-scoping.
Adding support for metricdata.Gauge[int64] and metricdata.Gauge[float64] makes this helper complete and robust.
func dataPointAttrs(data metricdata.Aggregation) []attribute.Set {
switch d := data.(type) {
case metricdata.Sum[int64]:
out := make([]attribute.Set, 0, len(d.DataPoints))
for _, dp := range d.DataPoints {
out = append(out, dp.Attributes)
}
return out
case metricdata.Sum[float64]:
out := make([]attribute.Set, 0, len(d.DataPoints))
for _, dp := range d.DataPoints {
out = append(out, dp.Attributes)
}
return out
case metricdata.Histogram[float64]:
out := make([]attribute.Set, 0, len(d.DataPoints))
for _, dp := range d.DataPoints {
out = append(out, dp.Attributes)
}
return out
case metricdata.Gauge[int64]:
out := make([]attribute.Set, 0, len(d.DataPoints))
for _, dp := range d.DataPoints {
out = append(out, dp.Attributes)
}
return out
case metricdata.Gauge[float64]:
out := make([]attribute.Set, 0, len(d.DataPoints))
for _, dp := range d.DataPoints {
out = append(out, dp.Attributes)
}
return out
default:
return nil
}
}| bm := telemetry.NewBusinessMetrics("biz-test") | ||
| created := bm.Counter("things_created_total", "things created") | ||
| amount := bm.FloatCounter("things_amount_total", "amount of things") | ||
| dur := bm.Histogram("things_duration_ms", "thing duration") | ||
|
|
||
| ctxA := tenantContext("tenant-a", "part-a") | ||
| created.Add(ctxA, 1, attribute.String("kind", "gold")) | ||
| amount.Add(ctxA, 12.5) | ||
| dur.Record(ctxA, 42.0) |
There was a problem hiding this comment.
The test currently only verifies Counter, FloatCounter, and Histogram. It should also verify that Gauge and FloatGauge are correctly created and transparently carry the tenant attributes.
| bm := telemetry.NewBusinessMetrics("biz-test") | |
| created := bm.Counter("things_created_total", "things created") | |
| amount := bm.FloatCounter("things_amount_total", "amount of things") | |
| dur := bm.Histogram("things_duration_ms", "thing duration") | |
| ctxA := tenantContext("tenant-a", "part-a") | |
| created.Add(ctxA, 1, attribute.String("kind", "gold")) | |
| amount.Add(ctxA, 12.5) | |
| dur.Record(ctxA, 42.0) | |
| bm := telemetry.NewBusinessMetrics("biz-test") | |
| created := bm.Counter("things_created_total", "things created") | |
| amount := bm.FloatCounter("things_amount_total", "amount of things") | |
| dur := bm.Histogram("things_duration_ms", "thing duration") | |
| gauge := bm.Gauge("things_gauge", "gauge of things") | |
| fgauge := bm.FloatGauge("things_fgauge", "float gauge of things") | |
| ctxA := tenantContext("tenant-a", "part-a") | |
| created.Add(ctxA, 1, attribute.String("kind", "gold")) | |
| amount.Add(ctxA, 12.5) | |
| dur.Record(ctxA, 42.0) | |
| gauge.Record(ctxA, 100) | |
| fgauge.Record(ctxA, 100.5) |
| for _, name := range []string{"things_amount_total", "things_duration_ms"} { | ||
| s := attrSets(rm, name) | ||
| require.NotEmpty(t, s, name) | ||
| v, ok := s[0].Value("tenant_id") | ||
| require.True(t, ok, name+" must carry tenant_id") | ||
| require.Equal(t, "tenant-a", v.AsString()) | ||
| } |
There was a problem hiding this comment.
Include the newly added things_gauge and things_fgauge in the verification loop to ensure they correctly carry the tenant attributes.
| for _, name := range []string{"things_amount_total", "things_duration_ms"} { | |
| s := attrSets(rm, name) | |
| require.NotEmpty(t, s, name) | |
| v, ok := s[0].Value("tenant_id") | |
| require.True(t, ok, name+" must carry tenant_id") | |
| require.Equal(t, "tenant-a", v.AsString()) | |
| } | |
| for _, name := range []string{"things_amount_total", "things_duration_ms", "things_gauge", "things_fgauge"} { | |
| s := attrSets(rm, name) | |
| require.NotEmpty(t, s, name) | |
| v, ok := s[0].Value("tenant_id") | |
| require.True(t, ok, name+" must carry tenant_id") | |
| require.Equal(t, "tenant-a", v.AsString()) | |
| } |
Standard factory for business metrics where tenant_id/partition_id attach automatically from the security claims context on every Add/Record. Foundation for the fleet-wide analytics standardization. Test proves: tenant attrs present with claims, absent without, explicit attrs preserved.