Add metrics to FFI_ExecutionPlan#22136
Conversation
|
Thank you for opening this pull request! Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch). Details |
There was a problem hiding this comment.
Pull request overview
This PR extends datafusion-ffi so ExecutionPlan::metrics() can be accessed across the FFI boundary, preserving execution metrics for foreign plans (e.g., for EXPLAIN ANALYZE and DisplayableExecutionPlan::with_metrics(...)) by marshalling metrics as a snapshot.
Changes:
- Add FFI-stable mirrors of
MetricsSet,Metric, and allMetricValuevariants (plus related metric types) with bidirectional conversions. - Add a
metricsfunction pointer toFFI_ExecutionPlanand implementForeignExecutionPlan::metrics()via that callback. - Expose
RatioMetricsinternals needed for marshalling via new accessors; addchronoas a directdatafusion-ffidependency for timestamp conversion.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| datafusion/physical-expr-common/src/metrics/value.rs | Adds RatioMetrics accessors needed to snapshot/serialize merge strategy and display options. |
| datafusion/ffi/src/metrics.rs | Introduces FFI-stable metric types and snapshot conversions, plus round-trip tests. |
| datafusion/ffi/src/lib.rs | Exposes the new metrics module publicly. |
| datafusion/ffi/src/execution_plan.rs | Adds FFI_ExecutionPlan.metrics callback and wires it into ForeignExecutionPlan::metrics(); adds an integration-style test. |
| datafusion/ffi/Cargo.toml | Adds direct chrono dependency for timestamp marshalling. |
| Cargo.lock | Records the new chrono dependency for datafusion-ffi. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Snapshot the plan's execution metrics. Returns `None` when the | ||
| /// underlying [`ExecutionPlan::metrics`] returned `None`. | ||
| pub metrics: unsafe extern "C" fn(plan: &Self) -> FFI_Option<FFI_MetricsSet>, | ||
|
|
| fn timestamp_from_ffi(nanos: FFI_Option<i64>) -> Timestamp { | ||
| let ts = Timestamp::new(); | ||
| if let Some(n) = nanos.into_option() { | ||
| ts.set(DateTime::<Utc>::from_timestamp_nanos(n)); | ||
| } | ||
| ts | ||
| } |
There was a problem hiding this comment.
Will we lose something if we round trip and the original is not UTC?
There was a problem hiding this comment.
That shouldn't be possible. Timestamp is composed of Arc<Mutex<Option<DateTime<Utc>>>>, and is documented as only storing UTC-nanoseconds.
Not sure why it shows up on the FFI enum but not the original
Which issue does this PR close?
ExecutionPlan::metrics()across the FFI boundary #22135Rationale for this change
FFI_ExecutionPlanexposes most of theExecutionPlantrait but does notexpose
metrics(). As a result,ForeignExecutionPlan::metrics()fallsthrough to the trait default (
None), so anything downstream of an FFIboundary loses metrics. The most visible breakage is
EXPLAIN ANALYZE,which renders empty metric blocks for foreign plans; anything calling
DisplayableExecutionPlan::with_metrics(...)on a plan tree containingforeign nodes is similarly affected.
This PR makes foreign plans behave the same as local plans for metric
reporting. Metrics are passed as a snapshot, and all atomic-backed
counters/gauges/timers are read into plain integer fields at marshal time.
Correct because none of the in-tree consumers (
AnalyzeExec,DisplayableExecutionPlan) poll metricsduring streaming.
What changes are included in this PR?
datafusion/ffi/src/metrics.rswith FFI-stable mirrors ofMetricsSet,Metric,MetricValue(all 16 variants),Label,MetricType,MetricCategory,PruningMetrics,RatioMetrics, andRatioMergeStrategy, plus bidirectionalFromconversions.MetricValue::Custom { value: Arc<dyn CustomMetricValue> }is marshalledas
(name, Display output, as_usize()). On the consumer side it isreconstructed as a small
FfiCustomMetricValueshim that preservesDisplayandas_usize().aggregatebecomes a no-op (snapshots arenot mergeable) and
as_anyonly downcasts to the shim — this is thedocumented compromise.
FFI_ExecutionPlangains a newmetricsfunction pointer (appendedafter
repartitioned).ForeignExecutionPlan::metrics()is implementedto call through it.
RatioMetrics:merge_strategy()anddisplay_raw_values()— needed to marshal these otherwise-private fields.chronoadded as a direct dependency ofdatafusion-ffi(used forTimestamp↔ unix-nanos conversion).Are these changes tested?
Yes. New tests, all passing:
datafusion/ffi/src/metrics.rsround-trip everyMetricValuevariant individually, plus a fullMetric(value + labels + partition + type + category) and a
MetricsSet.test_ffi_execution_plan_metrics_round_tripindatafusion/ffi/src/execution_plan.rsexercises the full FFI path:builds an
ExecutionPlanwith aMetricsSet, wraps it inFFI_ExecutionPlan, retrieves metrics viaForeignExecutionPlan::metrics()through
mock_foreign_marker_id, and asserts the aggregated value matches.EmptyExectest helper extended withwith_metrics(MetricsSet).Existing test suites still pass:
cargo test -p datafusion-ffi --all-featuresandcargo test -p datafusion-ffi --features integration-tests.Are there any user-facing changes?
Yes — this PR adds public API and makes a binary-incompatible change to
FFI_ExecutionPlan. Please add theapi changelabel.datafusion_ffi::metrics:FFI_MetricsSet,FFI_Metric,FFI_MetricValue,FFI_Label,FFI_MetricType,FFI_MetricCategory,FFI_PruningMetrics,FFI_RatioMetrics,FFI_RatioMergeStrategy, andFfiCustomMetricValue.FFI_ExecutionPlan: a newmetricsfunction pointerfield is appended. Producers and consumers must be rebuilt together, as
is already enforced by the major-version check via
datafusion_ffi::version().RatioMetrics:merge_strategy()anddisplay_raw_values(). Non-breaking additions.MetricValue::Customacross FFI is lossy by design: the underlyingdyn CustomMetricValueis not preserved; only itsDisplayoutput andas_usize()snapshot survive. Documented onFfiCustomMetricValue.