Added Persistent Collectives to Develop by bienz2 · Pull Request #70 · mpi-advance/locality_aware

bienz2 · 2026-06-12T22:22:25Z

Merged persistent allreduce, allgather, alltoall, and alltoallv. Also merged the GPU-Aware and Copy-to-CPU methods for each. There are tests for the standard CPU implementations as well as GPU versions. All tests passed on Tuolumne.


TheMasterDirk

A couple of minor changes, and a thought on your duplicated communicators.

I did not check the algorithms themselves, as I trust that you've got them working with the tests you added.

TheMasterDirk · 2026-06-15T14:39:42Z

+#if defined(GPU_AWARE)
+    ALLTOALL_INIT_GPU_PAIRWISE,
+    ALLTOALL_INIT_GPU_NONBLOCKING,
+#if defined(MPI4)


Is this (MPI4) something we define?
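If `MPI4` turns out not to be defined by the build system, one option would be to derive it from the `MPI_VERSION` macro that `mpi.h` provides. The following is only a sketch under that assumption: the `__has_include` guard is there so the snippet compiles without an MPI installation, and `has_mpi4()` is a hypothetical helper, not something in this repository.

```cpp
// Sketch: derive MPI4 from the standard MPI_VERSION macro instead of
// relying on a build-system definition.
#if defined(__has_include)
#  if __has_include(<mpi.h>)
#    include <mpi.h>
#  endif
#endif

// Leave MPI4 alone if the build already defined it.
#if !defined(MPI4) && defined(MPI_VERSION)
#  if MPI_VERSION >= 4
#    define MPI4
#  endif
#endif

// Hypothetical helper: true only when MPI4-guarded code would be compiled in.
constexpr bool has_mpi4()
{
#if defined(MPI4)
    return true;
#else
    return false;
#endif
}
```

With something like this in a shared header, the `#if defined(MPI4)` guards in the enum would follow whichever MPI the code is compiled against.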

TheMasterDirk · 2026-06-15T14:46:36Z

+    int gpu_error;
+if (request->gpu_sendbuf)
+{
+#if defined(APU)


Is this (APU) added anywhere in the CMake code?

TheMasterDirk · 2026-06-15T14:58:49Z

+
+    if (local_S_request->n_msgs)
+    {
+        MPI_Waitall(local_S_request->n_msgs, local_S_request->requests, MPI_STATUSES_IGNORE);             MPI_Reduce_local(request->tmpbuf, request->recvbuf, request->count,


This file could use a formatting pass. It looks like the MPI_Reduce_local call ended up on the same line as MPI_Waitall instead of on its own line.

TheMasterDirk · 2026-06-15T15:04:29Z

+    bool gpu_aware = false;
+    bool copy_to_cpu = false;


Just a minor tweak to fix compile warnings in non-GPU mode.

Suggested change

-    bool gpu_aware = false;
-    bool copy_to_cpu = false;
+#if defined(GPU)
+    bool gpu_aware = false;
+    bool copy_to_cpu = false;
+#endif
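For context, here is a self-contained sketch of the warning this guard avoids. Only the `GPU` macro and the two flag names come from the diff; the function `select_path`, its parameters, and its return strings are made up for illustration. The idea is that if both flags are only read under `#if defined(GPU)`, a CPU-only build would otherwise see two locals that are set but never used.

```cpp
#include <string>

// Hypothetical stand-in for one of the *_init wrappers in this PR.
std::string select_path(bool buffer_on_gpu, bool mpi_is_gpu_aware)
{
#if defined(GPU)
    // Declared only where they are read, so a CPU-only build has
    // no unused locals to warn about.
    bool gpu_aware = false;
    bool copy_to_cpu = false;
    if (buffer_on_gpu)
    {
        gpu_aware = mpi_is_gpu_aware;
        copy_to_cpu = !mpi_is_gpu_aware;
    }
    if (gpu_aware) return "gpu_aware";
    if (copy_to_cpu) return "copy_to_cpu";
#else
    // Parameters are intentionally unused in the CPU-only build.
    (void)buffer_on_gpu;
    (void)mpi_is_gpu_aware;
#endif
    return "cpu";
}
```

An alternative that avoids the preprocessor block would be to mark the declarations `[[maybe_unused]]` (C++17), at the cost of keeping dead variables in CPU-only builds.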

TheMasterDirk · 2026-06-15T15:04:49Z

+    bool gpu_aware = false;
+    bool copy_to_cpu = false;


Suggested change

-    bool gpu_aware = false;
-    bool copy_to_cpu = false;
+#if defined(GPU)
+    bool gpu_aware = false;
+    bool copy_to_cpu = false;
+#endif

TheMasterDirk · 2026-06-15T15:05:09Z

+    bool gpu_aware = false;
+    bool copy_to_cpu = false;


Suggested change

-    bool gpu_aware = false;
-    bool copy_to_cpu = false;
+#if defined(GPU)
+    bool gpu_aware = false;
+    bool copy_to_cpu = false;
+#endif

TheMasterDirk · 2026-06-15T15:05:23Z

+    bool gpu_aware = false;
+    bool copy_to_cpu = false;


Suggested change

-    bool gpu_aware = false;
-    bool copy_to_cpu = false;
+#if defined(GPU)
+    bool gpu_aware = false;
+    bool copy_to_cpu = false;
+#endif

TheMasterDirk · 2026-06-15T15:15:19Z

-    if (request->cpu_sendbuf)
+    // Added with MPI_Comm_dup, so need to free
+    // TODO: now that we have cached communicators, 
+    // can we avoid this dup??


Is global_comm even needed in the request class? I don't see it used other than in the dup and the free. Unless you plan to use it in other algorithms, this global_comm can be removed.

If it is needed, then I think the dup of global_comm can be avoided if we turn MPIL_COMM::global_comm into the new CachedComm type and then use the same type for MPIL_Request::global_comm. Let me know if there's a different rationale for why it might need to be duped.
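To make the idea concrete, here is a rough sketch of sharing one reference-counted handle between the communicator and the request. It is not the project's CachedComm: `SharedComm`, `Comm`, `Request`, and `make_request` are hypothetical names, and `FakeComm` / `fake_comm_free` stand in for `MPI_Comm` / `MPI_Comm_free` so the snippet runs without MPI.

```cpp
#include <memory>

// Stand-ins so the sketch runs without MPI.
using FakeComm = int;                 // plays the role of MPI_Comm
inline int g_free_calls = 0;          // counts "MPI_Comm_free" calls
inline void fake_comm_free(FakeComm*) { ++g_free_calls; }

// Hypothetical reference-counted handle; NOT the project's CachedComm.
class SharedComm
{
  public:
    explicit SharedComm(FakeComm comm)
        : handle_(new FakeComm(comm), [](FakeComm* c) {
              fake_comm_free(c);  // runs once, when the last owner goes away
              delete c;
          })
    {
    }
    FakeComm get() const { return *handle_; }
    long owners() const { return handle_.use_count(); }

  private:
    std::shared_ptr<FakeComm> handle_;
};

struct Comm    { SharedComm global_comm; };  // stands in for the comm struct
struct Request { SharedComm global_comm; };  // stands in for the request class

// Copying the wrapper shares ownership: no dup at init, no free at destroy.
inline Request make_request(const Comm& comm)
{
    return Request{comm.global_comm};
}
```

In this sketch the request's destructor needs no explicit free; the underlying handle is released only after both the communicator and every request holding it are gone.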

TheMasterDirk · 2026-06-15T15:23:20Z

+        MPI_Comm_free(&(request->global_comm));
    }
-    if (request->cpu_recvbuf)
+    if (request->local_comm != MPI_COMM_NULL)


(separate comment for local_comm)

For local_comm, I think a similar approach could be used if you changed the type of local_comm in allreduce_init_dissemination_loc_core to CachedComm. Then you would not need to duplicate it and could just use the reference-counted MPI_Comm wrapper (CachedComm).

This should work as long as anything calling allreduce_init_dissemination_loc_core provides a communicator that we create. Do you expect this to always be the case?

If it's not, then we may need to create a smarter caching mechanism.
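As one possible shape for such a mechanism, here is a sketch of a cache keyed on the incoming handle, so a derived local communicator is created at most once while any request still holds it. Everything here is hypothetical: `LocalCommCache` and `get_or_create` are made-up names, `FakeComm` stands in for `MPI_Comm`, and the `parent + 1000` value stands in for the real split/dup.

```cpp
#include <memory>
#include <unordered_map>

using FakeComm = int;  // stand-in for MPI_Comm

// Hypothetical cache: derive a local communicator at most once per
// incoming handle while some request still holds it.
class LocalCommCache
{
  public:
    std::shared_ptr<FakeComm> get_or_create(FakeComm parent)
    {
        auto it = cache_.find(parent);
        if (it != cache_.end())
        {
            if (auto live = it->second.lock()) return live;  // reuse
        }
        // Stand-in for the expensive split/dup of `parent`.
        ++creations_;
        auto created = std::make_shared<FakeComm>(parent + 1000);
        cache_[parent] = created;  // weak entry: the cache keeps nothing alive
        return created;
    }
    int creations() const { return creations_; }

  private:
    std::unordered_map<FakeComm, std::weak_ptr<FakeComm>> cache_;
    int creations_ = 0;
};
```

With real MPI, keying on the handle value is fragile because a freed handle may be reused. MPI's attribute caching (`MPI_Comm_create_keyval`, `MPI_Comm_set_attr`, `MPI_Comm_get_attr`) attaches data to the communicator itself and may be a better fit for user-provided communicators.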

bienz2 and others added 18 commits June 11, 2026 15:04

Added persistent collectives

b8b5678

Merge branch 'develop' of https://github.com/bienz2/locality_aware in…

98288e4

…to develop

fixed bugs

043ac97

Fixing bugs

4d4f8ca

fixing bugs

0de5e68

adding persistent collective tests

5a93d4e

Adding persistent tests

b8edb82

fixing bugs

68f599d

fixing bugs'

8be8031

Passing all CPU tests

0dc3eb2

Adding gpu persistent collective tests

b5f40b8

fixed small bugs

c237b63

fixed c2c allgather_init bug

60c0c5c

Merge branch 'mpi-advance:develop' into develop

1764d2a

Fixing gpu warnings

1cf2fa3

Fixing gpu warnings

ebed48c

Fixing gpu warnings

9184369

Fixed GPU warnings

ea0c96b

bienz2 requested a review from TheMasterDirk June 12, 2026 22:22

TheMasterDirk requested changes Jun 15, 2026

bienz2 wants to merge 18 commits into mpi-advance:develop from bienz2:develop
