Skip to content

Added delta local accumulation inside thread cleanup loop#7086

Merged
hkaiser merged 1 commit into
TheHPXProject:masterfrom
toxicteddy00077:fea/batch-atomic-cleanup
Mar 25, 2026
Merged

Added delta local accumulation inside thread cleanup loop#7086
hkaiser merged 1 commit into
TheHPXProject:masterfrom
toxicteddy00077:fea/batch-atomic-cleanup

Conversation

@toxicteddy00077
Copy link
Copy Markdown
Contributor

@toxicteddy00077 toxicteddy00077 commented Mar 22, 2026

Description

Closes #7085

continuing in spirit of #7056, the changes are very simple, I have just repalced the repeated per-item atomic decrements with local counters within the loop and and a fetch_sub after the loop. Basically two bulk atomic operations isnteads of many.

@StellarBot
Copy link
Copy Markdown
Collaborator

Can one of the admins verify this patch?

@toxicteddy00077 toxicteddy00077 force-pushed the fea/batch-atomic-cleanup branch from c0f3fd4 to 990890b Compare March 22, 2026 17:25
@toxicteddy00077
Copy link
Copy Markdown
Contributor Author

Benchmarks for comparison, left is baseline, right is my imrpovement, very minimal speedups varying form 21-7% across API
image

hkaiser
hkaiser previously approved these changes Mar 22, 2026
Copy link
Copy Markdown
Contributor

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! LGTM, thanks!

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Mar 22, 2026

/ok to test

@hkaiser hkaiser added this to the 2.0.0 milestone Mar 22, 2026
@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Mar 22, 2026

@charan-003 please have a look here. The CIs are still not being triggered.

@charan-003
Copy link
Copy Markdown
Contributor

@charan-003 please have a look here. The CIs are still not being triggered.

@hkaiser

looks like the /ok to test workflows are getting stuck in queue

sad to say but we can either:

  1. revert the changes in the PR i created on workflows and make it as it used to be earlier
  2. keep it but the queuing is because we have 40+ parallel test jobs running at same time
    even if we do something like https://github.com/marketplace/actions/verify-safe-to-test-label it won't fix the queuing - the issue is too many parallel jobs, not how they're triggered

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Mar 22, 2026

@charan-003 please have a look here. The CIs are still not being triggered.

@hkaiser

looks like the /ok to test workflows are getting stuck in queue

sad to say but we can either:

  1. revert the changes in the PR i created on workflows and make it as it used to be earlier
  2. keep it but the queuing is because we have 40+ parallel test jobs running at same time
    even if we do something like https://github.com/marketplace/actions/verify-safe-to-test-label it won't fix the queuing - the issue is too many parallel jobs, not how they're triggered

Things were run before (even with the queuing). The difference is that now github reports those CI jobs a skipped. Let's revert the change for now until we have a better understanding of how to do what you suggested.

@charan-003 could you create a PR that reverts it, please? It can't be automatically reverted anymore because of the second PR that modified things.

@charan-003
Copy link
Copy Markdown
Contributor

charan-003 commented Mar 22, 2026

@charan-003 please have a look here. The CIs are still not being triggered.

@hkaiser
looks like the /ok to test workflows are getting stuck in queue
sad to say but we can either:

  1. revert the changes in the PR i created on workflows and make it as it used to be earlier
  2. keep it but the queuing is because we have 40+ parallel test jobs running at same time
    even if we do something like https://github.com/marketplace/actions/verify-safe-to-test-label it won't fix the queuing - the issue is too many parallel jobs, not how they're triggered

Things were run before (even with the queuing). The difference is that now github reports those CI jobs a skipped. Let's revert the change for now until we have a better understanding of how to do what you suggested.

@charan-003 could you create a PR that reverts it, please? It can't be automatically reverted anymore because of the second PR that modified things.

let me do that quickly

@TheHPXProject TheHPXProject deleted a comment from github-actions Bot Mar 22, 2026
Comment thread libs/core/schedulers/include/hpx/schedulers/thread_queue.hpp Fixed
Comment thread libs/core/schedulers/include/hpx/schedulers/thread_queue.hpp Fixed
@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Mar 23, 2026

@toxicteddy00077 Could you please take care of the clang-format issues reported?

Comment thread libs/core/schedulers/include/hpx/schedulers/thread_queue.hpp Dismissed
Comment thread libs/core/schedulers/include/hpx/schedulers/thread_queue.hpp Dismissed
@StellarBot
Copy link
Copy Markdown
Collaborator

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)(=)---

Info

PropertyBeforeAfter
HPX Datetime2026-03-09T14:08:29+00:002026-03-23T12:47:16+00:00
HPX Commit0eeca86b8383e0
Datetime2026-03-09T09:15:24.034803-05:002026-03-23T10:34:57.975114-05:00
Envfile
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch+++

Info

PropertyBeforeAfter
HPX Datetime2026-03-09T14:08:29+00:002026-03-23T12:47:16+00:00
HPX Commit0eeca86b8383e0
Datetime2026-03-09T09:17:15.638328-05:002026-03-23T10:36:36.098640-05:00
Envfile
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add=(=)---
Stream Benchmark - Scale(=)-----
Stream Benchmark - Triad(=)----
Stream Benchmark - Copy(=)+++---

Info

PropertyBeforeAfter
HPX Datetime2026-03-09T18:50:37+00:002026-03-23T12:47:16+00:00
HPX Commitba89f5db8383e0
Datetime2026-03-09T17:49:10.837937-05:002026-03-23T10:37:09.849591-05:00
Envfile
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

Comment thread bench_results/pr2_batch_atomic_results.txt Outdated
Signed-off-by: Aryamaan Singh <aryamaan.singh05@gmail.com>
@toxicteddy00077 toxicteddy00077 force-pushed the fea/batch-atomic-cleanup branch from 8f194de to 933145a Compare March 24, 2026 03:04
@toxicteddy00077 toxicteddy00077 requested a review from hkaiser March 24, 2026 03:04
@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Mar 24, 2026

retest lsu

@hkaiser
Copy link
Copy Markdown
Contributor

hkaiser commented Mar 24, 2026

Could you please rebase onto master to fix the compilation errors reported? Sorry for the inconvenience.

@StellarBot
Copy link
Copy Markdown
Collaborator

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)(=)---

Info

PropertyBeforeAfter
HPX Datetime2026-03-09T14:08:29+00:002026-03-24T03:04:23+00:00
HPX Commit0eeca866a94eed
Envfile
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime2026-03-09T09:15:24.034803-05:002026-03-24T18:44:33.993386-05:00
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch+++

Info

PropertyBeforeAfter
HPX Datetime2026-03-09T14:08:29+00:002026-03-24T03:04:23+00:00
HPX Commit0eeca866a94eed
Envfile
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime2026-03-09T09:17:15.638328-05:002026-03-24T18:46:11.438824-05:00
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)(=)---
Stream Benchmark - Scale(=)-----
Stream Benchmark - Triad(=)----
Stream Benchmark - Copy(=)+++---

Info

PropertyBeforeAfter
HPX Datetime2026-03-09T18:50:37+00:002026-03-24T03:04:23+00:00
HPX Commitba89f5d6a94eed
Envfile
Compiler/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8/opt/apps/llvm/18.1.8/bin/clang++ 18.1.8
Datetime2026-03-09T17:49:10.837937-05:002026-03-24T18:46:45.584776-05:00
Clusternamerostamrostam
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

@hkaiser hkaiser merged commit 43d1ddd into TheHPXProject:master Mar 25, 2026
136 of 170 checks passed
vaibhavj2006 pushed a commit to vaibhavj2006/hpx that referenced this pull request Mar 30, 2026
…atomic-cleanup

Added delta local accumulation inside thread cleanup loop
Signed-off-by: Vaibhav Joshi <joshivaibhav553@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

thread_queue: batch the atomic decrement in cleanup_terminated_locked()

5 participants