perf(metrics): partition ValueMap into per-thread shards to reduce lock contention#3473
souyang-dev wants to merge 4 commits into open-telemetry:main
Conversation
Codecov Report

Additional details and impacted files

@@ Coverage Diff @@
## main #3473 +/- ##
=======================================
+ Coverage 83.7% 83.9% +0.1%
=======================================
Files 126 126
Lines 25386 25784 +398
=======================================
+ Hits 21255 21635 +380
- Misses 4131 4149 +18
count: AtomicUsize::new(0),
config,
cardinality_limit,
all_shards_cardinality_limit: per_shard_capacity.saturating_mul(n),
Pinging @cijothomas because metrics is really his area :) !
I'm not sure about this part but it might be a gap in my mental model. The metrics SDK spec defines aggregation_cardinality_limit as "the maximum number of data points allowed to be emitted in a collection cycle by a single instrument" and says that, for a given metric, the cardinality limit is a "hard limit" on the number of metric points collected during a collection cycle:
It looks like with this change, the effective limit becomes roughly configured_limit * num_shards where num_shards is the smaller of CPU count or 64, so a configured limit of 2000 can produce many more than 2000 points.
I think sharding needs to remain an internal implementation detail and preserve the configured stream cardinality limit. The user-visible impact is that a limit chosen to cap SDK memory usage and downstream time-series cardinality would no longer provide that cap; the same configuration could produce very different numbers of exported series depending on CPU count.
Previous sharding attempts were also blocked on this same problem; it is non-trivial to solve with a sharding approach, unfortunately. We'd need to solve the contention issue differently, in a way where the user consciously opts in to this model. (I'll write up what I have in mind soon.)
Thank you @scottgerring for the comments.
I have updated the patch to enforce the cardinality limit across all shards.
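For illustration, here is a minimal sketch of enforcing a single cross-shard limit with a shared atomic counter; the names and structure are hypothetical and not the actual patch:

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

// Hypothetical sketch: all shards share one atomic counter, so the
// configured stream cardinality limit stays a hard limit regardless of
// how many shards exist.
struct ShardedValueMap {
    shards: Vec<RwLock<HashMap<Vec<String>, u64>>>,
    tracked: AtomicUsize,     // attribute sets tracked across *all* shards
    cardinality_limit: usize, // the user-configured stream limit
}

impl ShardedValueMap {
    fn new(num_shards: usize, cardinality_limit: usize) -> Self {
        Self {
            shards: (0..num_shards).map(|_| RwLock::new(HashMap::new())).collect(),
            tracked: AtomicUsize::new(0),
            cardinality_limit,
        }
    }

    fn record(&self, shard_idx: usize, attrs: Vec<String>, value: u64) {
        let mut shard = self.shards[shard_idx].write().unwrap();
        if let Some(v) = shard.get_mut(&attrs) {
            *v += value; // existing series: no limit check needed
            return;
        }
        // New attribute set: claim a slot against the global limit first.
        if self.tracked.fetch_add(1, Ordering::AcqRel) < self.cardinality_limit {
            shard.insert(attrs, value);
        } else {
            self.tracked.fetch_sub(1, Ordering::AcqRel);
            // Over the limit: record under the overflow attribute set instead
            // (elided here). Note this counter is conservative: the same
            // attribute set stored in two shards merges into one data point
            // at collect time.
        }
    }
}
```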
perf(metrics): partition ValueMap into per-thread shards to reduce lock contention

Two changes are made:
1. Partition ValueMap into shards and let different threads update different shards to minimize lock contention.
2. Implement a merge function for Aggregators, so the collector can merge the shards.

Comply with the OTel spec: aggregation_cardinality_limit is "the maximum number of data points allowed to be emitted in a collection cycle by a single instrument".

Use only 1 shard if the aggregator needs to preserve the order of updates, so all updates will be serialized.
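A minimal sketch of this shard-count policy, assuming the min(CPU count, 64) cap mentioned in the review comment above (the helper name is hypothetical):

```rust
use std::num::NonZeroUsize;
use std::thread;

// Hypothetical helper: choose the shard count for an aggregator.
// Order-preserving aggregators (e.g. last-value) use a single shard so all
// updates stay serialized; everything else uses min(CPU count, 64).
fn shard_count(preserve_order: bool) -> usize {
    if preserve_order {
        return 1;
    }
    thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1)
        .min(64)
}
```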
1. Hold a shard write lock to protect the flag swap and delta collection.
2. Downscale both the source and destination histogram to the same scale before merging.
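A minimal sketch of the downscale-then-merge idea for base-2 exponential histogram buckets, using simplified types rather than the SDK's actual data structures: lowering the scale by d maps bucket index i to i >> d, after which counts at equal indexes can simply be added.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified exponential-histogram buckets keyed by index.
struct ExpoBuckets {
    scale: i32,
    counts: BTreeMap<i32, u64>, // bucket index -> count
}

impl ExpoBuckets {
    // Lower the resolution to `target_scale` (must be <= current scale).
    // Reducing the scale by `d` merges every 2^d adjacent buckets: index i
    // maps to i >> d (arithmetic shift, so negative indexes round down).
    fn downscale(&mut self, target_scale: i32) {
        let d = self.scale - target_scale;
        if d <= 0 {
            return;
        }
        let mut merged = BTreeMap::new();
        for (&idx, &count) in &self.counts {
            *merged.entry(idx >> d).or_insert(0u64) += count;
        }
        self.counts = merged;
        self.scale = target_scale;
    }

    // Merge `other` into `self`, first bringing both to the common scale.
    fn merge(&mut self, mut other: ExpoBuckets) {
        let common = self.scale.min(other.scale);
        self.downscale(common);
        other.downscale(common);
        for (idx, count) in other.counts {
            *self.counts.entry(idx).or_insert(0) += count;
        }
    }
}
```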
Metrics ValueMap is protected by a RwLock. It can be a source of contention under high concurrency. What's more, the cost to update a histogram is even higher because you need to grab a mutex lock after the read lock before you can update a bucket.
There is a previous attempt to shard the ValueMap (#1564), but that PR partitions metrics updates based on attribute sets, so updates to the same attribute set still land on the same shard; it is therefore not effective at reducing lock contention when most updates target a small set of attributes.
Fixes #
This PR partitions the single HashMap into shards. Each thread updates a different shard based on its thread id, which minimizes lock contention. If the number of shards is >= the number of CPUs, you won't see lock contention when updating metrics.
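As an illustration of the thread-id routing, here is a sketch with a hypothetical helper, not the exact code in this PR:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::thread;

// Hypothetical helper: map the current thread to a shard index. With at
// least as many shards as CPUs, concurrently running threads rarely contend
// on the same shard's lock, even when they record identical attribute sets.
fn shard_for_current_thread(num_shards: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    thread::current().id().hash(&mut hasher);
    (hasher.finish() as usize) % num_shards
}
```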
Here are benchmark results (metrics_histogram, metrics) with AMD EPYC 9555 CPUs (255 CPU threads).
The numbers in the tables are updates/second in millions. "shards=16" is this PR with 16 shards, "shards=32" is this PR with 32 shards.
Counter update throughput (cargo run --release --package stress --bin metrics -- N)
Histogram update throughput (cargo run --release --package stress --bin metrics_histogram -- N)
Changes
Two changes are made:
1. Partition ValueMap into shards and let different threads update different shards to minimize lock contention.
2. Implement a merge function for Aggregators, so the collector can merge the shards (see the sketch below).
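A minimal sketch of the merge idea, using a simplified sum aggregator; the trait and types here are illustrative, not the PR's actual API:

```rust
use std::collections::HashMap;

// Hypothetical trait: each aggregator knows how to absorb another instance
// of itself, so the collector can fold per-thread shards into one result.
trait Merge {
    fn merge(&mut self, other: Self);
}

// Simple sum aggregation keyed by attribute set (illustrative only).
#[derive(Default)]
struct SumAggregator {
    points: HashMap<Vec<String>, i64>,
}

impl Merge for SumAggregator {
    fn merge(&mut self, other: Self) {
        for (attrs, value) in other.points {
            *self.points.entry(attrs).or_insert(0) += value;
        }
    }
}

// Collector side: drain every shard and fold it into a single aggregator,
// so the exported stream sees one data point per attribute set.
fn collect(shards: Vec<SumAggregator>) -> SumAggregator {
    shards
        .into_iter()
        .fold(SumAggregator::default(), |mut acc, shard| {
            acc.merge(shard);
            acc
        })
}
```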
Merge requirement checklist
CHANGELOG.md files updated for non-trivial, user-facing changes