
docs: refresh Spark version support, OS coverage, and version-pinned examples [WIP]#4244

Open
andygrove wants to merge 9 commits into apache:main from andygrove:docs-0.16

Conversation

@andygrove
Member

Which issue does this PR close?

Closes #.

Rationale for this change

Several user-facing docs are out of date ahead of the 0.16 release:

  • The Spark version compatibility, installation, and Gluten comparison pages still describe Spark 4.1 as experimental even though it now runs in full CI alongside Spark SQL tests.
  • The installation page lists Apple macOS (Intel and Apple Silicon) under supported operating systems without noting that the published Maven jars ship Linux native binaries only; Intel macOS is also no longer exercised by CI.
  • Several recently-added math expressions are still marked [ ] in the contributor-guide tracking page.
  • spark.comet.scan.enabled is documented as a user-facing scan toggle, but in practice it exists to let Comet's own tests selectively disable native scans.
  • The Iceberg and Kubernetes guides hard-code older Comet versions (0.14.0 and 0.7.0) in the example commands and YAML, and the Iceberg example also redundantly sets spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions, which CometPlugin already registers.

What changes are included in this PR?

  • Promote Spark 4.1 to fully supported in compatibility/spark-versions.md, installation.md, and about/gluten_comparison.md. The experimental table now lists only Spark 4.2.0-preview4. Spark 4.1.1 is shown as running on JDK 17/21.
  • Restructure the Supported Operating Systems section in installation.md into a table that distinguishes published Maven jars (Linux amd64/arm64) from build-from-source coverage. Drop the Intel macOS row.
  • Flip seven math_funcs entries (acosh, asinh, atanh, cbrt, degrees, pi, radians) from [ ] to [x] in docs/source/contributor-guide/spark_expressions_support.md to reflect commits a4f0229, 1b4b26f, e5351f4, and 356dd94.
  • Move spark.comet.scan.enabled from CATEGORY_SCAN to CATEGORY_TESTING in CometConf.scala and rewrite the doc string to make clear it is intended for Comet's own test suites. Remove the corresponding paragraph from datasources.md.
  • Replace the hard-coded 0.14.0 references in iceberg.md and 0.7.0 references in kubernetes.md with the $COMET_VERSION placeholder used elsewhere in the docs. Drop the redundant spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions conf from both Iceberg spark-shell examples.
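The placeholder style described in the last bullet can be sketched in plain shell; the Maven coordinate mirrors the ones in the docs, and the 0.16.0 value is only an assumed example release, not an official pin:

```shell
# Sketch only: pin the Comet release once in COMET_VERSION, then build the
# --packages coordinate from it instead of hard-coding a version per example.
COMET_VERSION=0.16.0   # assumed example value for illustration
PACKAGES="org.apache.datafusion:comet-spark-spark3.5_2.12:${COMET_VERSION}"
echo "$PACKAGES"
# → org.apache.datafusion:comet-spark-spark3.5_2.12:0.16.0
```

This way a release bump touches one variable rather than every example command in the Iceberg and Kubernetes guides.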

How are these changes tested?

These are documentation-only changes apart from the CometConf doc-string rewrite and category move; the CONFIG_TABLE[testing] section in configs.md is regenerated from CometConf by GenerateDocs.scala on merge.

…examples

Update user-facing docs ahead of the 0.16 release:

- Promote Spark 4.1 from experimental to fully supported across the
  installation page, compatibility guide, and Gluten comparison; keep
  4.2 listed as experimental. Spark 4.1.1 now runs in CI under both JDK
  17 and 21.
- Restructure the Supported Operating Systems section in the
  installation guide to make clear that published Maven jars cover
  Linux only and that macOS users must build from source. Drop the
  Intel macOS claim since Apple Silicon is the only macOS variant
  exercised in CI.
- Flip seven recently-added math expressions to supported in the
  contributor-guide tracking page: acosh, asinh, atanh, cbrt, degrees,
  pi, radians.
- Move spark.comet.scan.enabled to the testing category and rewrite
  its description to reflect that it is intended for Comet's own test
  suites only. Remove the corresponding mention from the data sources
  page.
- Replace hard-coded Comet versions in the Iceberg and Kubernetes
  guides with the $COMET_VERSION placeholder used elsewhere, and drop
  the redundant spark.sql.extensions=...CometSparkSessionExtensions
  conf from the Iceberg examples (CometPlugin registers it
  automatically).
@andygrove changed the title from "docs: refresh Spark version support, OS coverage, and version-pinned examples" to "docs: refresh Spark version support, OS coverage, and version-pinned examples [WIP]" on May 6, 2026
andygrove added 6 commits May 6, 2026 11:19
…t scans

PR apache#4011 added non-AQE DPP and PR apache#4112 added AQE DPP with broadcast
reuse for native Parquet scans, but the docs still claimed AQE DPP was
unsupported. Update the scans compatibility page, the contributor-guide
roadmap, and the Iceberg guide accordingly. AQE DPP for Iceberg native
scans remains future work, tracked at apache#3510.
# Conflicts:
#	docs/source/user-guide/latest/iceberg.md
#	docs/source/user-guide/latest/kubernetes.md
@andygrove marked this pull request as ready for review on May 7, 2026
Comment on lines +97 to +99
```scala
.category(CATEGORY_TESTING)
.doc("Whether to enable native scans. Intended for use in Comet's own test suites to " +
  "selectively disable native scans; not intended for production use.")
```
Member Author

this is unrelated to 0.16 / spark 4 changes, but we don't want users using this config
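For context, a hedged sketch of the kind of invocation the new doc string has in mind; the config key comes from the snippet above, while the spark-shell launch itself is illustrative and not taken from this PR:

```shell
# Illustrative only: a test run that disables Comet's native scans via the
# testing-category config. Not intended for production use.
$SPARK_HOME/bin/spark-shell \
  --conf spark.plugins=org.apache.spark.CometPlugin \
  --conf spark.comet.scan.enabled=false
```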

Comment on lines +206 to +208
```shell
# corresponding Spark release, so the scala-2.13 profile is not used here.
./mvnw "-Dmaven.repo.local=${LOCAL_REPO}" -P spark-4.0 -DskipTests install
./mvnw "-Dmaven.repo.local=${LOCAL_REPO}" -P spark-4.1 -DskipTests install
```
Member Author

Ship some new jars

(truncated diff hunk from a JSON benchmark-results file)
Contributor

It's much nicer.

```diff
 $SPARK_HOME/bin/spark-shell \
-  --packages org.apache.datafusion:comet-spark-spark4.1_2.13:0.14.0,org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.8.1,org.apache.iceberg:iceberg-core:1.8.1 \
+  --packages org.apache.datafusion:comet-spark-spark3.5_2.12:$COMET_VERSION,org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.8.1,org.apache.iceberg:iceberg-core:1.8.1 \
```
Contributor

3.5 also supports 2.13

Contributor

@comphead left a comment:

thanks @andygrove
