Skip RowFilter and page pruning for fully matched row groups #21637
xudong963 wants to merge 15 commits
Conversation
Force-pushed 54a4166 to 5da11ea
Force-pushed f0e02e9 to d6c3879
Force-pushed 3f2401e to 67a0526
@adriangb and I were talking about this PR last night. I am checking it out
alamb
left a comment
I think this feature is really nice and I like where it is heading @xudong963
I suspect that with a few more PRs where we can encapsulate the choice of predicate evaluation strategy, we'll be all set to do more dynamic predicate evaluation
Self::CurrentMemoryUsage(_) => 13,
Self::Count { .. } => 14,
Self::Count { name, .. } => match name.as_ref() {
    "page_index_pages_skipped_by_fully_matched" => 8,
this may be worth a comment to explain why it is special casing `page_index_pages_skipped_by_fully_matched`
Added a comment explaining why this Count is ordered with the Parquet page-index pruning metrics in EXPLAIN output.
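The special casing in the review hunk above can be sketched as a standalone example. Only the metric name `page_index_pages_skipped_by_fully_matched` and the ranks 8/13/14 come from the diff; the enum variants and helper functions below are hypothetical simplifications, not DataFusion types. The point is that the rank function pins this one named counter next to the page-index pruning metrics so they group together in EXPLAIN output.

```rust
// Hypothetical simplification of the metric ordering discussed above.

#[derive(Debug)]
enum Metric {
    /// Stand-in for the Parquet page-index pruning metrics (rank 8).
    PageIndexRowsPruned,
    CurrentMemoryUsage(usize),
    Count { name: String },
}

fn rank(m: &Metric) -> u8 {
    match m {
        Metric::PageIndexRowsPruned => 8,
        Metric::CurrentMemoryUsage(_) => 13,
        // Special case: order this counter with the page-index pruning
        // metrics so it appears next to them in EXPLAIN ANALYZE output,
        // instead of at the end with the generic counters.
        Metric::Count { name } if name == "page_index_pages_skipped_by_fully_matched" => 8,
        Metric::Count { .. } => 14,
    }
}

fn sort_metrics(mut metrics: Vec<Metric>) -> Vec<Metric> {
    // Stable sort by display rank; equal ranks keep their original order.
    metrics.sort_by_key(rank);
    metrics
}
```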
// Build the first RowFilter eagerly; it will be reused for the first
Sounds good.
I think @adriangb was also talking recently about restructuring the Parquet opener so it could more dynamically decide how to evaluate predicates (in this case, for example, it decides not to evaluate a predicate at all). He was also thinking we could dynamically choose between pushing a predicate down into the scan or not.
No action required for this PR, I am just commenting here that we seem to be trending in this direction
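A minimal sketch of that direction, with entirely made-up names (`PredicateStrategy` and `choose_strategy` are not DataFusion APIs): encapsulating the per-row-group choice between skipping evaluation entirely, pushing a RowFilter into the scan, and filtering after the scan.

```rust
// Hypothetical sketch: a single place that decides the predicate
// evaluation strategy for a row group, instead of scattering the
// decision across the opener, the page filter, and the decoder.

#[derive(Debug, PartialEq)]
enum PredicateStrategy {
    /// Statistics prove every row matches: skip RowFilter and page pruning.
    SkipEvaluation,
    /// Push the predicate into the scan as a RowFilter.
    PushdownRowFilter,
    /// Evaluate the predicate after the scan (e.g. when the filter
    /// columns are expensive to decode during the scan).
    FilterAfterScan,
}

/// Pick a strategy from (hypothetical) per-row-group inputs.
fn choose_strategy(fully_matched: bool, filter_is_selective: bool) -> PredicateStrategy {
    if fully_matched {
        // This PR's case: the predicate is provably redundant.
        PredicateStrategy::SkipEvaluation
    } else if filter_is_selective {
        PredicateStrategy::PushdownRowFilter
    } else {
        PredicateStrategy::FilterAfterScan
    }
}
```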
run benchmarks

🤖 Benchmark running (GKE): comparing datafusion/issue-19028-benchmark (67a0526) to 937dfda (merge-base) using tpcds

🤖 Benchmark running (GKE): comparing datafusion/issue-19028-benchmark (67a0526) to 937dfda (merge-base) using clickbench_partitioned

🤖 Benchmark running (GKE): comparing datafusion/issue-19028-benchmark (67a0526) to 937dfda (merge-base) using tpch
🤖 Benchmark completed (GKE): tpch, base (merge-base) vs branch

🤖 Benchmark completed (GKE): tpcds, base (merge-base) vs branch

🤖 Benchmark completed (GKE): clickbench_partitioned, base (merge-base) vs branch
run benchmark clickbench_partitioned

I want to make sure the slowdowns on clickbench_partitioned in #21637 (comment) are not reproducible
...

🤖 Benchmark running (GKE): comparing datafusion/issue-19028-benchmark (67a0526) to 937dfda (merge-base) using clickbench_partitioned
🤖 Benchmark completed (GKE): clickbench_partitioned, base (merge-base) vs branch
🤔 the benchmarks look slower -- maybe we can profile some of those queries and find space to get the performance back

run benchmark clickbench_partitioned

🤖 Benchmark running (GKE): comparing datafusion/issue-19028-benchmark (d0b4c30) to 937dfda (merge-base) using clickbench_partitioned

🤖 Benchmark completed (GKE): clickbench_partitioned, base (merge-base) vs branch
@alamb Good finding to avoid the PR introducing a regression! I profiled the repeated slow ClickBench partitioned queries (…). The issue was that (…). I fixed this by removing the per-file (…). Now the benchmark is good: #21637 (comment)
Which issue does this PR close?
Rationale for this change
When DataFusion evaluates a Parquet scan with filter pushdown, it uses row group statistics to determine which row groups contain matching rows. The `RowGroupAccessPlanFilter` already tracks which row groups are "fully matched", i.e. where statistics prove that all rows satisfy the predicate (via `is_fully_matched`). However, this information was not propagated downstream. Even for fully matched row groups, the `RowFilter` was still evaluated and page-level pruning was still performed.
This is especially costly when filter columns are expensive to decode (e.g., large strings) or when predicates are complex. Common real-world examples include time-range filters where entire row groups fall within the range, or `WHERE status != 'DELETED'` on data with no deleted rows.

What changes are included in this PR?
DataFusion changes (this PR)
- `row_group_filter.rs`: `RowGroupAccessPlanFilter::build()` now returns `(ParquetAccessPlan, Vec<usize>)`: the access plan plus the indices of fully matched row groups.
- `page_filter.rs`: `prune_plan_with_page_index()` accepts a `fully_matched_row_groups` parameter and skips page-level pruning for those row groups.
- `opener.rs`: wires fully matched row groups through the pipeline, passing them to page pruning and to the `ParquetPushDecoderBuilder` via `with_fully_matched_row_groups()`.

Arrow-rs dependency (apache/arrow-rs#9694)
The new `ArrowReaderBuilder::with_fully_matched_row_groups()` API in arrow-rs allows skipping `RowFilter` evaluation during Parquet decoding for the specified row groups. This PR uses a `[patch.crates-io]` entry pointing to the arrow-rs fork branch until that PR is merged and released.

Benchmark
Includes a criterion benchmark (`parquet_fully_matched_filter`) using `ParquetPushDecoder` directly, the same code path DataFusion's async opener uses. Dataset: 20 row groups × 50K rows, with a 1KB string payload column and the predicate `x < 200` (all row groups fully matched).

Are these changes tested?
`datafusion-datasource-parquet` tests pass (16 failures are pre-existing, caused by the missing `parquet-testing` submodule)

Are there any user-facing changes?
No user-facing API changes. This is a transparent performance optimization — queries that previously worked will now be faster when row group statistics prove all rows match the predicate.
Note: This PR depends on apache/arrow-rs#9694. The `[patch.crates-io]` entry in `Cargo.toml` will be removed once that arrow-rs change is released; all of the logic is on the DataFusion side now.
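The overall mechanism can be modeled in a few lines. Everything below is a hypothetical simplification (these types and functions do not exist in DataFusion or arrow-rs): row group min/max statistics classify each group as pruned, fully matched, or still needing the filter, mirroring the shape of the `(ParquetAccessPlan, Vec<usize>)` pair returned by `RowGroupAccessPlanFilter::build()` in this PR, with `rows_needing_filter` standing in for what `with_fully_matched_row_groups()` achieves during decoding.

```rust
// Minimal self-contained model of what this PR wires together.

/// Per-row-group min/max statistics for a single column (hypothetical).
struct RowGroupStats { min: i64, max: i64 }

/// Classify row groups against the predicate `col < threshold`:
/// returns (row group indices to scan, indices of fully matched groups).
fn classify(stats: &[RowGroupStats], threshold: i64) -> (Vec<usize>, Vec<usize>) {
    let mut to_scan = Vec::new();
    let mut fully_matched = Vec::new();
    for (i, s) in stats.iter().enumerate() {
        if s.min >= threshold {
            continue; // no row can match: prune the row group entirely
        }
        to_scan.push(i);
        if s.max < threshold {
            // Every row provably matches: RowFilter evaluation and
            // page-level pruning are redundant for this group.
            fully_matched.push(i);
        }
    }
    (to_scan, fully_matched)
}

/// Decode step: only run the row filter where statistics could not
/// prove a full match.
fn rows_needing_filter(to_scan: &[usize], fully_matched: &[usize]) -> Vec<usize> {
    to_scan.iter().copied().filter(|i| !fully_matched.contains(i)).collect()
}
```

With the benchmark's predicate shape (`x < 200`), a group whose max is below the threshold is decoded without ever invoking the filter, which is where the speedup comes from.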