Skip RowFilter and page pruning for fully matched row groups #21637
xudong963 wants to merge 15 commits
Conversation
Force-pushed 54a4166 to 5da11ea
Force-pushed f0e02e9 to d6c3879
Force-pushed 3f2401e to 67a0526
@adriangb and I were talking about this PR last night. I am checking it out
alamb
left a comment
I think this feature is really nice and I like where it is heading @xudong963
I suspect that with a few more PRs where we can encapsulate the choice of predicate evaluation strategy, we'll be all set to do more dynamic predicate evaluation
Self::CurrentMemoryUsage(_) => 13,
Self::Count { .. } => 14,
Self::Count { name, .. } => match name.as_ref() {
    "page_index_pages_skipped_by_fully_matched" => 8,
this may be worth a comment to explain why it is special casing `page_index_pages_skipped_by_fully_matched`
Added a comment explaining why this Count is ordered with the Parquet page-index pruning metrics in EXPLAIN output.
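The special casing in the review hunk above can be sketched as a standalone example. Only the metric name `page_index_pages_skipped_by_fully_matched` and the ranks 8/13/14 come from the diff; the enum variants and helper functions below are hypothetical simplifications, not DataFusion types. The point is that the rank function pins this one named counter next to the page-index pruning metrics so they group together in EXPLAIN output.

```rust
// Hypothetical simplification of the metric ordering discussed above.

#[derive(Debug)]
enum Metric {
    /// Stand-in for the Parquet page-index pruning metrics (rank 8).
    PageIndexRowsPruned,
    CurrentMemoryUsage(usize),
    Count { name: String },
}

fn rank(m: &Metric) -> u8 {
    match m {
        Metric::PageIndexRowsPruned => 8,
        Metric::CurrentMemoryUsage(_) => 13,
        // Special case: order this counter with the page-index pruning
        // metrics so it appears next to them in EXPLAIN ANALYZE output,
        // instead of at the end with the generic counters.
        Metric::Count { name } if name == "page_index_pages_skipped_by_fully_matched" => 8,
        Metric::Count { .. } => 14,
    }
}

fn sort_metrics(mut metrics: Vec<Metric>) -> Vec<Metric> {
    // Stable sort by display rank; equal ranks keep their original order.
    metrics.sort_by_key(rank);
    metrics
}
```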
// Build the first RowFilter eagerly; it will be reused for the first
Sounds good.
I think @adriangb was also talking recently about restructuring the Parquet opener so it could more dynamically decide how to evaluate predicates (in this case, for example, it decides not to evaluate a predicate at all). He was also thinking we could dynamically choose between pushing a predicate down into the scan or not.
No action required for this PR, I am just commenting here that we seem to be trending in this direction
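A minimal sketch of that direction, with entirely made-up names (`PredicateStrategy` and `choose_strategy` are not DataFusion APIs): encapsulating the per-row-group choice between skipping evaluation entirely, pushing a RowFilter into the scan, and filtering after the scan.

```rust
// Hypothetical sketch: a single place that decides the predicate
// evaluation strategy for a row group, instead of scattering the
// decision across the opener, the page filter, and the decoder.

#[derive(Debug, PartialEq)]
enum PredicateStrategy {
    /// Statistics prove every row matches: skip RowFilter and page pruning.
    SkipEvaluation,
    /// Push the predicate into the scan as a RowFilter.
    PushdownRowFilter,
    /// Evaluate the predicate after the scan (e.g. when the filter
    /// columns are expensive to decode during the scan).
    FilterAfterScan,
}

/// Pick a strategy from (hypothetical) per-row-group inputs.
fn choose_strategy(fully_matched: bool, filter_is_selective: bool) -> PredicateStrategy {
    if fully_matched {
        // This PR's case: the predicate is provably redundant.
        PredicateStrategy::SkipEvaluation
    } else if filter_is_selective {
        PredicateStrategy::PushdownRowFilter
    } else {
        PredicateStrategy::FilterAfterScan
    }
}
```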
run benchmarks

🤖 Benchmark running (GKE): comparing datafusion/issue-19028-benchmark (67a0526) to 937dfda (merge-base) using tpcds

🤖 Benchmark running (GKE): comparing datafusion/issue-19028-benchmark (67a0526) to 937dfda (merge-base) using clickbench_partitioned

🤖 Benchmark running (GKE): comparing datafusion/issue-19028-benchmark (67a0526) to 937dfda (merge-base) using tpch
🤖 Benchmark completed (GKE): tpch, base (merge-base) vs branch

🤖 Benchmark completed (GKE): tpcds, base (merge-base) vs branch

🤖 Benchmark completed (GKE): clickbench_partitioned, base (merge-base) vs branch
run benchmark clickbench_partitioned

I want to make sure the slowdowns on clickbench_partitioned in #21637 (comment) are not reproducible
...

🤖 Benchmark running (GKE): comparing datafusion/issue-19028-benchmark (67a0526) to 937dfda (merge-base) using clickbench_partitioned
🤖 Benchmark completed (GKE): clickbench_partitioned, base (merge-base) vs branch
🤔 the benchmarks look slower -- maybe we can profile some of those queries and find space to get the performance back

run benchmark clickbench_partitioned

🤖 Benchmark running (GKE): comparing datafusion/issue-19028-benchmark (d0b4c30) to 937dfda (merge-base) using clickbench_partitioned

🤖 Benchmark completed (GKE): clickbench_partitioned, base (merge-base) vs branch
@alamb Good finding to avoid the PR introducing a regression! I profiled the repeated slow ClickBench partitioned queries (…). The issue was that (…). I fixed this by removing the per-file (…). Now the benchmark is good: #21637 (comment)
Which issue does this PR close?
Rationale for this change
When DataFusion evaluates a Parquet scan with filter pushdown, it uses row group statistics to determine which row groups contain matching rows. The `RowGroupAccessPlanFilter` already tracks which row groups are "fully matched", i.e. where statistics prove that all rows satisfy the predicate (via `is_fully_matched`). However, this information was not propagated downstream. Even for fully matched row groups, the `RowFilter` was still evaluated and page-level pruning was still performed.
This is especially costly when filter columns are expensive to decode (e.g., large strings) or when predicates are complex. Common real-world examples include time-range filters where entire row groups fall within the range, or `WHERE status != 'DELETED'` on data with no deleted rows.

What changes are included in this PR?
DataFusion changes (this PR)
- `row_group_filter.rs`: `RowGroupAccessPlanFilter::build()` now returns `(ParquetAccessPlan, Vec<usize>)`: the access plan plus the indices of fully matched row groups.
- `page_filter.rs`: `prune_plan_with_page_index()` accepts a `fully_matched_row_groups` parameter and skips page-level pruning for those row groups.
- `opener.rs`: wires fully matched row groups through the pipeline, passing them to page pruning and to the `ParquetPushDecoderBuilder` via `with_fully_matched_row_groups()`.

Arrow-rs dependency (apache/arrow-rs#9694)
The new `ArrowReaderBuilder::with_fully_matched_row_groups()` API in arrow-rs allows skipping `RowFilter` evaluation during Parquet decoding for the specified row groups. This PR uses a `[patch.crates-io]` entry pointing to the arrow-rs fork branch until that PR is merged and released.

Benchmark
Includes a criterion benchmark (`parquet_fully_matched_filter`) using `ParquetPushDecoder` directly, the same code path DataFusion's async opener uses. Dataset: 20 row groups × 50K rows, with a 1KB string payload column and the predicate `x < 200` (all row groups fully matched).

Are these changes tested?
`datafusion-datasource-parquet` tests pass (16 failures are pre-existing, caused by the missing `parquet-testing` submodule)

Are there any user-facing changes?
No user-facing API changes. This is a transparent performance optimization — queries that previously worked will now be faster when row group statistics prove all rows match the predicate.
Note: This PR depends on apache/arrow-rs#9694. The `[patch.crates-io]` entry in `Cargo.toml` will be removed once that arrow-rs change is released; all of the logic is on the DataFusion side now.
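The overall mechanism can be modeled in a few lines. Everything below is a hypothetical simplification (these types and functions do not exist in DataFusion or arrow-rs): row group min/max statistics classify each group as pruned, fully matched, or still needing the filter, mirroring the shape of the `(ParquetAccessPlan, Vec<usize>)` pair returned by `RowGroupAccessPlanFilter::build()` in this PR, with `rows_needing_filter` standing in for what `with_fully_matched_row_groups()` achieves during decoding.

```rust
// Minimal self-contained model of what this PR wires together.

/// Per-row-group min/max statistics for a single column (hypothetical).
struct RowGroupStats { min: i64, max: i64 }

/// Classify row groups against the predicate `col < threshold`:
/// returns (row group indices to scan, indices of fully matched groups).
fn classify(stats: &[RowGroupStats], threshold: i64) -> (Vec<usize>, Vec<usize>) {
    let mut to_scan = Vec::new();
    let mut fully_matched = Vec::new();
    for (i, s) in stats.iter().enumerate() {
        if s.min >= threshold {
            continue; // no row can match: prune the row group entirely
        }
        to_scan.push(i);
        if s.max < threshold {
            // Every row provably matches: RowFilter evaluation and
            // page-level pruning are redundant for this group.
            fully_matched.push(i);
        }
    }
    (to_scan, fully_matched)
}

/// Decode step: only run the row filter where statistics could not
/// prove a full match.
fn rows_needing_filter(to_scan: &[usize], fully_matched: &[usize]) -> Vec<usize> {
    to_scan.iter().copied().filter(|i| !fully_matched.contains(i)).collect()
}
```

With the benchmark's predicate shape (`x < 200`), a group whose max is below the threshold is decoded without ever invoking the filter, which is where the speedup comes from.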