From f3952863eb69048ada33cd6e69a427cb2fa440bb Mon Sep 17 00:00:00 2001
From: Andrew Lamb
Date: Thu, 14 May 2026 14:40:18 -0400
Subject: [PATCH 1/3] Add SQL as a category in breaking API change policy

---
 docs/source/contributor-guide/api-health.md | 29 ++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/docs/source/contributor-guide/api-health.md b/docs/source/contributor-guide/api-health.md
index f950c7cc0b365..192dcec140fa2 100644
--- a/docs/source/contributor-guide/api-health.md
+++ b/docs/source/contributor-guide/api-health.md
@@ -25,9 +25,10 @@ changes to avoid issues for downstream users.

 ## Breaking API Changes

+
 ### What is the public API and what is a breaking API change?

-In general, an item is part of the public API if it appears on the [docs.rs page].
+In general, an item is part of the public Rust API if it appears on the [docs.rs page].

 Breaking public API changes are those that _require_ users to change their code
 for it to compile and execute, and are listed as "Major Changes" in the [SemVer
@@ -43,6 +44,18 @@ Examples of non-breaking changes include:
 - Marking a function as deprecated (`#[deprecated]`)
 - Adding a new function to a `trait` with a default implementation

+### What is the public SQL API and what is a breaking API change?
+
+DataFusion is used extensively as a SQL engine by downstream applications with
+real users, and changes to the SQL semantics (the results returned for a given
+query) are a form of breaking change. Even if no Rust API signature changes,
+altering the result of an existing SQL construct can silently break downstream
+users whose applications, dashboards, or tests depend on the previous behavior.
+
+We therefore apply the same caution to SQL semantics changes as we do to
+breaking Rust API changes: the benefit of the change must be weighed against
+the cost of breaking downstream users.
+
 ### When to make breaking API changes?

 When possible, we prefer to avoid making breaking API changes. One common way to
@@ -54,15 +67,18 @@ change with the cost (impact on downstream users). It is often frustrating for
 downstream users to change their applications, and it is even more so if they do
 not gain improved capabilities.

-Examples of good reasons for making a breaking API change include:
+Examples of good reasons for making a breaking API or SQL change include:

 - The change allows new use cases that were not possible before
 - The change significantly enables improved performance
+- The previous behavior is clearly wrong (e.g. it produces incorrect results)

-Examples of potentially weak reasons for making breaking API changes include:
+Examples of potentially weak reasons for making breaking API or SQL changes include:

 - The change is an internal refactor to make DataFusion more consistent
 - The change is to remove an API that is not widely used but has not been marked as deprecated
+- The change makes DataFusion slightly more compatible with another database
+  (for example, PostgreSQL or DuckDB)

 ### What to do when making breaking API changes?

@@ -71,6 +87,13 @@ When making breaking public API changes, please:
 1. Add the `api-change` label to the PR so we can highlight the changes in the release notes.
 2. Consider adding documentation to the version-specific [Upgrade Guide] if the required changes are non-trivial.

+For breaking SQL changes, please:
+
+1. Clearly describe the previous and new behavior in the PR description,
+   including a table of example queries and their results where appropriate.
+   Not only will this make it easier to review, it will make it easier for downstream
+   users to discover impacted semantics.
+
 [docs.rs page]: https://docs.rs/datafusion/latest/datafusion/index.html
 [semver compatibility section of the cargo book]: https://doc.rust-lang.org/cargo/reference/semver.html#change-categories

From c48aa1bac05c7a1bd2203b847aad3e91eca823a9 Mon Sep 17 00:00:00 2001
From: Andrew Lamb
Date: Thu, 14 May 2026 14:44:07 -0400
Subject: [PATCH 2/3] cleanup

---
 docs/source/contributor-guide/api-health.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/docs/source/contributor-guide/api-health.md b/docs/source/contributor-guide/api-health.md
index 192dcec140fa2..37dabba3538cf 100644
--- a/docs/source/contributor-guide/api-health.md
+++ b/docs/source/contributor-guide/api-health.md
@@ -25,8 +25,7 @@ changes to avoid issues for downstream users.

 ## Breaking API Changes

-
-### What is the public API and what is a breaking API change?
+### What is the public Rust API and what is a breaking API change?

 In general, an item is part of the public Rust API if it appears on the [docs.rs page].

@@ -44,11 +43,11 @@ Examples of non-breaking changes include:
 - Marking a function as deprecated (`#[deprecated]`)
 - Adding a new function to a `trait` with a default implementation

-### What is the public SQL API and what is a breaking API change?
+### What is the public SQL API and what is a breaking SQL change?

 DataFusion is used extensively as a SQL engine by downstream applications with
 real users, and changes to the SQL semantics (the results returned for a given
-query) are a form of breaking change. Even if no Rust API signature changes,
+query) are a form of breaking change. Even if no Rust API signatures change,
 altering the result of an existing SQL construct can silently break downstream
 users whose applications, dashboards, or tests depend on the previous behavior.

From cb692d46e83aae421ebb8c70a5d324771bf38334 Mon Sep 17 00:00:00 2001
From: Andrew Lamb
Date: Thu, 14 May 2026 14:58:55 -0400
Subject: [PATCH 3/3] clean

---
 docs/source/contributor-guide/api-health.md | 56 ++++++++++-----------
 1 file changed, 26 insertions(+), 30 deletions(-)

diff --git a/docs/source/contributor-guide/api-health.md b/docs/source/contributor-guide/api-health.md
index 37dabba3538cf..a20bc284cf362 100644
--- a/docs/source/contributor-guide/api-health.md
+++ b/docs/source/contributor-guide/api-health.md
@@ -27,11 +27,11 @@ changes to avoid issues for downstream users.

 ### What is the public Rust API and what is a breaking API change?

-In general, an item is part of the public Rust API if it appears on the [docs.rs page].
+An item is part of the public Rust API if it appears on the [docs.rs page].

-Breaking public API changes are those that _require_ users to change their code
-for it to compile and execute, and are listed as "Major Changes" in the [SemVer
-Compatibility Section of the Cargo Book]. Common examples of breaking changes include:
+Breaking changes _require_ users to modify their code for it to compile and
+run, and are listed as "Major Changes" in the [SemVer Compatibility Section of
+the Cargo Book]. Common examples include:

 - Adding new required parameters to a function (`foo(a: i32, b: i32)` -> `foo(a: i32, b: i32, c: i32)`)
 - Removing a `pub` function
@@ -45,15 +45,13 @@ Examples of non-breaking changes include:

 ### What is the public SQL API and what is a breaking SQL change?

-DataFusion is used extensively as a SQL engine by downstream applications with
-real users, and changes to the SQL semantics (the results returned for a given
-query) are a form of breaking change. Even if no Rust API signatures change,
-altering the result of an existing SQL construct can silently break downstream
-users whose applications, dashboards, or tests depend on the previous behavior.
+DataFusion is also used as a SQL engine, so changes to SQL semantics (the
+results returned for a given query) are a form of breaking change. Even with
+no Rust API change, altering the behavior of an existing SQL construct can
+silently break downstream applications, dashboards, and tests.

-We therefore apply the same caution to SQL semantics changes as we do to
-breaking Rust API changes: the benefit of the change must be weighed against
-the cost of breaking downstream users.
+We apply the same caution to SQL semantics changes as to Rust API changes:
+the benefit must be weighed against the cost of breaking downstream users.

 ### When to make breaking API changes?

@@ -66,32 +64,30 @@ change with the cost (impact on downstream users). It is often frustrating for
 downstream users to change their applications, and it is even more so if they do
 not gain improved capabilities.

-Examples of good reasons for making a breaking API or SQL change include:
+Examples of good reasons for a breaking API or SQL change:

-- The change allows new use cases that were not possible before
-- The change significantly enables improved performance
-- The previous behavior is clearly wrong (e.g. it produces incorrect results)
+- It enables new use cases that were not possible before
+- It significantly improves performance
+- The previous behavior is clearly wrong (e.g. produces incorrect results)

-Examples of potentially weak reasons for making breaking API or SQL changes include:
+Examples of potentially weak reasons:

-- The change is an internal refactor to make DataFusion more consistent
-- The change is to remove an API that is not widely used but has not been marked as deprecated
-- The change makes DataFusion slightly more compatible with another database
-  (for example, PostgreSQL or DuckDB)
+- An internal refactor to make DataFusion more consistent
+- Removing an API that is not widely used but has not been marked as deprecated
+- Slightly improving compatibility with another database (for example,
+  PostgreSQL or DuckDB)

 ### What to do when making breaking API changes?

-When making breaking public API changes, please:
+When making breaking Rust API changes, please:

-1. Add the `api-change` label to the PR so we can highlight the changes in the release notes.
-2. Consider adding documentation to the version-specific [Upgrade Guide] if the required changes are non-trivial.
+1. Add the `api-change` label so the change is highlighted in the release notes.
+2. Document non-trivial changes in the version-specific [Upgrade Guide].

-For breaking SQL changes, please:
-
-1. Clearly describe the previous and new behavior in the PR description,
-   including a table of example queries and their results where appropriate.
-   Not only will this make it easier to review, it will make it easier for downstream
-   users to discover impacted semantics.
+For breaking SQL changes, also describe the previous and new behavior in the PR
+description, ideally including example queries and results where appropriate.
+This makes review easier and helps downstream users discover the affected
+semantics.

 [docs.rs page]: https://docs.rs/datafusion/latest/datafusion/index.html
 [semver compatibility section of the cargo book]: https://doc.rust-lang.org/cargo/reference/semver.html#change-categories