implement extensibility with NetworkBoundaryStrategy by kurtvolmar · Pull Request #1 · kurtvolmar/datafusion-distributed

kurtvolmar · 2026-02-11T18:45:00Z

This adds extensibility to the DistributedPhysicalOptimizerRule via a new trait called NetworkBoundaryStrategy.

NetworkBoundaryStrategy was developed as a minimal set of changes for a fork. It plugs into the plan annotation and plan distribution phases of DistributedPhysicalOptimizerRule::optimize. While it is likely not the final design to propose upstream for making DistributedPhysicalOptimizerRule extensible, it demonstrates key extensibility features that should be considered.

Ultimately, users of the library should have a way to express mutations on the ExecutionPlan tree while still plugging into the many niceties of datafusion-distributed. This would let users with specific use cases, constraints, and data systems customize the distributed physical plans they produce, while relying on the distributed workers, task estimation, network boundaries, metrics, etc. that datafusion-distributed provides.

kurtvolmar · 2026-02-11T18:46:21Z

+/// Strategy for placing network boundaries in a distributed execution plan.
+///
+/// When a network boundary is needed (e.g., after hash repartition or before coalesce),
+/// strategies are invoked in order. The first strategy to return an annotation with a boundary wins.
+///
+/// Strategies should return `None` to defer to the next strategy in the chain.
+/// Custom strategies can be registered to override default behavior.
+pub trait NetworkBoundaryStrategy: Debug + Send + Sync {
+    /// Annotates a plan node with network boundary metadata.
+    ///
+    /// Returns `Some(NetworkBoundaryAnnotation)` if this strategy detects a boundary is needed,
+    /// or `None` to defer to the next strategy.
+    ///
+    /// The annotation can optionally include an `output_tasks` hint to override DFD's
+    /// default task count calculation.
+    fn annotate_network_boundary(
+        &self,
+        plan: &dyn ExecutionPlan,
+    ) -> Option<NetworkBoundaryAnnotation>;
+
+    /// Apply this strategy to place a network boundary. Return `Ok(None)` to defer to next strategy.
+    fn apply_boundary(
+        &self,
+        context: &NetworkBoundaryContext<'_>,
+    ) -> Result<Option<Arc<dyn ExecutionPlan>>>;
+}


Here is the primary extensibility trait we have introduced to control annotation and plan mutation.
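To make the shape of the trait concrete, here is a minimal sketch of what a custom strategy might look like. It is not code from this PR. `ExecutionPlan`, `Result`, `NetworkBoundaryAnnotation`, and `NetworkBoundaryContext` are simplified stand-ins (their real definitions are not shown in this excerpt), and `PinnedScanExec`, `OtherExec`, `PinnedBoundaryExec`, and `PinnedScanStrategy` are hypothetical names invented for illustration.

```rust
use std::fmt::Debug;
use std::sync::Arc;

// --- Stand-ins (assumptions), not the real DataFusion / DFD types ---
type Result<T> = std::result::Result<T, String>;

/// Simplified stand-in for DataFusion's `ExecutionPlan`.
pub trait ExecutionPlan: Debug + Send + Sync {
    fn name(&self) -> &str;
}

/// Hypothetical shape; only the `output_tasks` hint is mentioned in the doc comments.
#[derive(Debug, Clone, PartialEq)]
pub struct NetworkBoundaryAnnotation {
    pub extension_name: String,
    pub output_tasks: Option<usize>,
}

/// Hypothetical shape; the real context fields are not shown in this excerpt.
pub struct NetworkBoundaryContext<'a> {
    pub plan: &'a Arc<dyn ExecutionPlan>,
}

/// Same two methods as the trait in the diff above.
pub trait NetworkBoundaryStrategy: Debug + Send + Sync {
    fn annotate_network_boundary(
        &self,
        plan: &dyn ExecutionPlan,
    ) -> Option<NetworkBoundaryAnnotation>;

    fn apply_boundary(
        &self,
        context: &NetworkBoundaryContext<'_>,
    ) -> Result<Option<Arc<dyn ExecutionPlan>>>;
}

// --- Hypothetical plan nodes used only for this sketch ---
#[derive(Debug)]
pub struct PinnedScanExec;
impl ExecutionPlan for PinnedScanExec {
    fn name(&self) -> &str { "PinnedScanExec" }
}

#[derive(Debug)]
pub struct OtherExec;
impl ExecutionPlan for OtherExec {
    fn name(&self) -> &str { "OtherExec" }
}

#[derive(Debug)]
pub struct PinnedBoundaryExec {
    pub input: Arc<dyn ExecutionPlan>,
}
impl ExecutionPlan for PinnedBoundaryExec {
    fn name(&self) -> &str { "PinnedBoundaryExec" }
}

/// Hypothetical strategy: claims `PinnedScanExec` nodes and pins their task count.
#[derive(Debug)]
pub struct PinnedScanStrategy {
    pub tasks: usize,
}

impl NetworkBoundaryStrategy for PinnedScanStrategy {
    fn annotate_network_boundary(
        &self,
        plan: &dyn ExecutionPlan,
    ) -> Option<NetworkBoundaryAnnotation> {
        if plan.name() != "PinnedScanExec" {
            return None; // defer to the next strategy
        }
        Some(NetworkBoundaryAnnotation {
            extension_name: "pinned_scan".to_string(),
            output_tasks: Some(self.tasks),
        })
    }

    fn apply_boundary(
        &self,
        context: &NetworkBoundaryContext<'_>,
    ) -> Result<Option<Arc<dyn ExecutionPlan>>> {
        if context.plan.name() != "PinnedScanExec" {
            return Ok(None); // defer to the next strategy
        }
        // Wrap the node in a custom boundary node.
        let wrapped: Arc<dyn ExecutionPlan> = Arc::new(PinnedBoundaryExec {
            input: Arc::clone(context.plan),
        });
        Ok(Some(wrapped))
    }
}
```

In both methods, returning `None` (or `Ok(None)`) is how a strategy opts out, so an implementation only needs to recognize the nodes it cares about.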

kurtvolmar · 2026-02-11T18:47:32Z

            Self::Shuffle => write!(f, "[NetworkBoundary] Shuffle"),
            Self::Coalesce => write!(f, "[NetworkBoundary] Coalesce"),
            Self::Broadcast => write!(f, "[NetworkBoundary] Broadcast"),
+            Self::Extension(name) => write!(f, "[NetworkBoundary] Extension({})", name), // @NetworkBoundaryStrategy: Extension variant


This adds a PlanOrNetworkBoundary::Extension variant that allows for custom annotations, which can then be used in the distribute_plan phase.
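For readers without the full file open, a rough sketch of how the variant and its `Display` arm fit together. This is an assumption-laden reduction: only the four variants visible in the hunk are included (the real enum may have others), and the `String` payload for `Extension` is a guess based on the `name` binding.

```rust
use std::fmt;

// Reduced sketch: only the variants visible in the hunk above.
// The `String` payload of `Extension` is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub enum PlanOrNetworkBoundary {
    Shuffle,
    Coalesce,
    Broadcast,
    Extension(String),
}

impl fmt::Display for PlanOrNetworkBoundary {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Shuffle => write!(f, "[NetworkBoundary] Shuffle"),
            Self::Coalesce => write!(f, "[NetworkBoundary] Coalesce"),
            Self::Broadcast => write!(f, "[NetworkBoundary] Broadcast"),
            Self::Extension(name) => write!(f, "[NetworkBoundary] Extension({})", name),
        }
    }
}
```

With this shape, an annotation created by a custom strategy shows up in plan displays with the extension's name.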

kurtvolmar · 2026-02-11T18:50:31Z

        }
    }

+    // @NetworkBoundaryStrategy: strategies last—overwrite annotation when a strategy matches


Since this extensibility was designed for a fork, we chose not to implement NetworkBoundaryStrategy for NetworkShuffleExec, NetworkCoalesceExec, and NetworkBroadcastExec, as doing so cleanly was getting very complicated.

kurtvolmar · 2026-02-11T18:52:17Z

        }
    }

+    // @NetworkBoundaryStrategy: strategies last—overwrite annotation when a strategy matches


This section calls NetworkBoundaryStrategy::annotate_network_boundary on each registered strategy and, if one matches, overwrites the mutable annotation. This allows a NetworkBoundaryStrategy to be selected instead of the default annotation.
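The "strategies last" ordering can be sketched roughly as follows. This is a simplified illustration, not the code in the PR: plan nodes are reduced to their names, `Annotation`, `Strategy`, `NameMatchStrategy`, `default_annotation`, and `annotate_node` are hypothetical, and the default mapping is only a placeholder for the built-in logic.

```rust
use std::fmt::Debug;

// Reduced stand-in for the annotation type (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum Annotation {
    Plan,
    Shuffle,
    Coalesce,
    Extension(String),
}

// Reduced stand-in for the annotation half of NetworkBoundaryStrategy.
pub trait Strategy: Debug {
    fn annotate(&self, node_name: &str) -> Option<Annotation>;
}

/// Hypothetical strategy that claims nodes with a given name.
#[derive(Debug)]
pub struct NameMatchStrategy {
    pub target: &'static str,
    pub label: &'static str,
}

impl Strategy for NameMatchStrategy {
    fn annotate(&self, node_name: &str) -> Option<Annotation> {
        if node_name == self.target {
            Some(Annotation::Extension(self.label.to_string()))
        } else {
            None // defer to the next strategy
        }
    }
}

// Placeholder for the built-in annotation logic (assumption).
fn default_annotation(node_name: &str) -> Annotation {
    match node_name {
        "RepartitionExec" => Annotation::Shuffle,
        "CoalescePartitionsExec" => Annotation::Coalesce,
        _ => Annotation::Plan,
    }
}

/// Compute the default annotation first, then let strategies overwrite it.
/// Strategies run in order and the first one returning `Some` wins.
pub fn annotate_node(node_name: &str, strategies: &[Box<dyn Strategy>]) -> Annotation {
    let mut annotation = default_annotation(node_name);
    if let Some(custom) = strategies.iter().find_map(|s| s.annotate(node_name)) {
        annotation = custom;
    }
    annotation
}
```

Because the defaults are computed first and strategies only overwrite on a match, a plan with no registered strategies is annotated exactly as before.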

kurtvolmar · 2026-02-11T18:53:26Z

+                max_child_task_count,
+                cfg,
+            )
+        }


We match on PlanOrNetworkBoundary::Extension to invoke the NetworkBoundaryStrategy::apply_boundary method.
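A rough sketch of that dispatch, under heavy simplification: plans are plain strings, the context carries only a plan and `max_child_task_count`, and `distribute_node`, `DeferringStrategy`, and `WrappingStrategy` are hypothetical. Returning an error when no strategy applies is an assumption made for this sketch, not necessarily what the PR does.

```rust
type Result<T> = std::result::Result<T, String>;

// Reduced stand-in; the other boundary variants are omitted here.
#[derive(Debug, Clone, PartialEq)]
pub enum PlanOrNetworkBoundary {
    Plan,
    Extension(String),
}

// Hypothetical context: plans are represented as strings for brevity.
pub struct NetworkBoundaryContext<'a> {
    pub plan: &'a str,
    pub max_child_task_count: usize,
}

pub trait NetworkBoundaryStrategy {
    fn apply_boundary(&self, context: &NetworkBoundaryContext<'_>) -> Result<Option<String>>;
}

/// Hypothetical strategy that always defers.
pub struct DeferringStrategy;
impl NetworkBoundaryStrategy for DeferringStrategy {
    fn apply_boundary(&self, _context: &NetworkBoundaryContext<'_>) -> Result<Option<String>> {
        Ok(None)
    }
}

/// Hypothetical strategy that wraps the plan in a custom boundary.
pub struct WrappingStrategy;
impl NetworkBoundaryStrategy for WrappingStrategy {
    fn apply_boundary(&self, context: &NetworkBoundaryContext<'_>) -> Result<Option<String>> {
        Ok(Some(format!(
            "CustomBoundaryExec(tasks={}) <- {}",
            context.max_child_task_count, context.plan
        )))
    }
}

/// For `Extension` annotations, ask each strategy in order to place the
/// boundary; the first `Ok(Some(_))` wins and errors propagate via `?`.
pub fn distribute_node(
    annotation: &PlanOrNetworkBoundary,
    context: &NetworkBoundaryContext<'_>,
    strategies: &[Box<dyn NetworkBoundaryStrategy>],
) -> Result<String> {
    match annotation {
        PlanOrNetworkBoundary::Extension(name) => {
            for strategy in strategies {
                if let Some(new_plan) = strategy.apply_boundary(context)? {
                    return Ok(new_plan);
                }
            }
            // Assumption: treat an unclaimed extension annotation as an error.
            Err(format!("no strategy applied a boundary for extension '{}'", name))
        }
        PlanOrNetworkBoundary::Plan => Ok(context.plan.to_string()),
    }
}
```

The `Result<Option<_>>` return type keeps "I do not handle this node" (`Ok(None)`) separate from "I tried and failed" (`Err`).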

implement extensibility with NetworkBoundaryStategy

a455d0a


kurtvolmar changed the title ~~implement extensibility with NetworkBoundaryStategy~~ implement extensibility with NetworkBoundaryStrategy Feb 11, 2026

gabrielkerr mentioned this pull request Feb 17, 2026

Need for Customizable Plan Annotation and Network Boundary Logic in DFD datafusion-contrib/datafusion-distributed#347

Open

kurtvolmar wants to merge 1 commit into main from network-boundary-strategy
