feat: globally reorder files and row groups by statistics for TopK queries by zhuqi-lucas · Pull Request #21956 · apache/datafusion

zhuqi-lucas · 2026-04-30T09:23:53Z

Which issue does this PR close?

Closes Sort pushdown: reorder row groups by statistics within each file #21317
Partial Closes feat: global file reorder in shared work queue for TopK optimization #21733
Partial fix for [Epic] Dynamic row group pruning using TopK threshold during parquet scan #21399
Split from feat: statistics-driven TopK optimization for parquet (file reorder + RG reorder + threshold init + cumulative prune) #21580 (per @alamb's request to break into smaller PRs)

Rationale for this change

TopK queries (ORDER BY col LIMIT K) on parquet files with multiple out-of-order row groups are suboptimal — the dynamic filter threshold converges slowly because row groups are read in arbitrary order. By reordering row groups so the "best" ones (containing optimal values) are read first, the threshold tightens quickly and subsequent row groups are pruned at runtime.

What changes are included in this PR?

Row group reorder by statistics:

PreparedAccessPlan::reorder_by_statistics(): sorts row groups by min values (ASC) using parquet column statistics. Direction (DESC) is handled by existing reverse() applied after reorder. The two compose correctly for both sorted and unsorted data.
AccessPlanOptimizer trait: extensible interface for row group access plan optimizations applied after pruning.

DynamicFilter sort metadata:

DynamicFilterPhysicalExpr gains sort_options and fetch fields, set by SortExec::create_filter(). This lets the parquet reader determine reorder direction for any TopK query (not just sort-pushdown path).
Fix: SortExec::with_fetch now sets fetch before calling create_filter() so the DynamicFilter gets the correct K value.

File-level reorder in shared work queue:

FileSource::reorder_files() trait method + parquet implementation: reorders files in the shared work queue by statistics so multi-file TopK reads the most promising files first across all partitions.

Are these changes tested?

SLT tests: Tests H (mixed RGs), J (scrambled non-overlapping), K (overlapping), L (multi-key ORDER BY), P (multi-file reorder)
All existing sort_pushdown SLTs pass
98 parquet lib unit tests pass
Clippy clean, rustdoc clean

Are there any user-facing changes?

No API changes. TopK queries on parquet with multiple row groups will automatically benefit from better row group ordering. This is a performance optimization only — query results are unchanged.

Copilot

Pull request overview

This PR improves TopK (ORDER BY ... LIMIT K) performance for Parquet scans by using statistics to reorder work (files and row groups) so the dynamic filter threshold converges faster and more data can be pruned during execution.

Changes:

Propagate TopK sort metadata (sort_options, fetch) via DynamicFilterPhysicalExpr and fix SortExec::with_fetch ordering so the dynamic filter sees the correct K.
Add file-level reordering hook (FileSource::reorder_files) and implement statistics-based file reordering for Parquet.
Add row-group access plan optimization plumbing and statistics-based row-group reordering (composable with reverse scanning), plus SLT coverage.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
datafusion/sqllogictest/test_files/sort_pushdown.slt	Adds SLT coverage for row-group/file reordering behavior across multiple TopK scenarios
datafusion/physical-plan/src/sorts/sort.rs	Passes sort options + fetch into dynamic filter and adjusts fetch/filter initialization order
datafusion/physical-expr/src/expressions/dynamic_filters.rs	Extends `DynamicFilterPhysicalExpr` with optional sort metadata and fetch limit
datafusion/datasource/src/file_stream/work_source.rs	Calls `FileSource::reorder_files` before queueing shared work
datafusion/datasource/src/file.rs	Introduces `FileSource::reorder_files` default extension point
datafusion/datasource-parquet/src/source.rs	Implements Parquet file reordering using column statistics; wires fallback sort info
datafusion/datasource-parquet/src/opener.rs	Applies row-group access plan optimizers (reorder + reverse) based on TopK metadata
datafusion/datasource-parquet/src/mod.rs	Registers new access plan optimizer module
datafusion/datasource-parquet/src/access_plan_optimizer.rs	Adds `AccessPlanOptimizer` trait plus `ReverseRowGroups` / `ReorderByStatistics` implementations
datafusion/datasource-parquet/src/access_plan.rs	Adds `PreparedAccessPlan::reorder_by_statistics` (min-stat based RG ordering)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

zhuqi-lucas · 2026-04-30T09:54:22Z

run benchmark sort_pushdown_inexact

adriangbot · 2026-04-30T09:57:49Z

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4351456416-1947-jscm6 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing feat/rg-reorder-by-statistics (235c4e1) to 0144570 (merge-base) diff using: sort_pushdown_inexact
Results will be posted here when complete

File an issue against this benchmark runner

adriangbot · 2026-04-30T10:12:44Z

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Details

Comparing HEAD and feat_rg-reorder-by-statistics
--------------------
Benchmark sort_pushdown_inexact.json
--------------------
┏━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query ┃                           HEAD ┃  feat_rg-reorder-by-statistics ┃        Change ┃
┡━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ Q1    │    7.22 / 8.08 ±0.90 / 9.82 ms │   6.71 / 7.47 ±1.27 / 10.00 ms │ +1.08x faster │
│ Q2    │    6.79 / 6.88 ±0.08 / 7.01 ms │    6.87 / 7.08 ±0.23 / 7.48 ms │     no change │
│ Q3    │ 21.76 / 22.52 ±0.64 / 23.37 ms │ 21.99 / 22.88 ±0.57 / 23.66 ms │     no change │
│ Q4    │ 20.82 / 21.86 ±0.84 / 22.82 ms │ 20.73 / 21.74 ±0.77 / 22.71 ms │     no change │
└───────┴────────────────────────────────┴────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Benchmark Summary                            ┃         ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ Total Time (HEAD)                            │ 59.34ms │
│ Total Time (feat_rg-reorder-by-statistics)   │ 59.17ms │
│ Average Time (HEAD)                          │ 14.83ms │
│ Average Time (feat_rg-reorder-by-statistics) │ 14.79ms │
│ Queries Faster                               │       1 │
│ Queries Slower                               │       0 │
│ Queries with No Change                       │       3 │
│ Queries with Failure                         │       0 │
└──────────────────────────────────────────────┴─────────┘

Resource Usage

sort_pushdown_inexact — base (merge-base)

Metric	Value
Wall time	5.0s
Peak memory	4.6 GiB
Avg memory	4.6 GiB
CPU user	2.5s
CPU sys	0.4s
Peak spill	0 B

sort_pushdown_inexact — branch

Metric	Value
Wall time	5.0s
Peak memory	4.6 GiB
Avg memory	4.6 GiB
CPU user	2.6s
CPU sys	0.3s
Peak spill	0 B

File an issue against this benchmark runner

zhuqi-lucas · 2026-04-30T10:15:42Z

The benchmark results are expected — RG reorder alone doesn't skip any row groups, it only changes the read order so that TopK's dynamic filter threshold converges faster.

The significant speedup (2-3x on sort_pushdown_inexact) comes from stats init + cumulative RG prune which will be in the follow-up PR. Those optimizations depend on RG reorder as a foundation:

RG reorder: put best RGs first (this PR)
Stats init: initialize TopK threshold from RG statistics before reading → prune RGs upfront (next PR)
Cumulative prune: after reorder, truncate remaining RGs once enough rows are collected (next PR)

Without reorder, cumulative prune might truncate the wrong RGs. Reorder ensures the best RGs come first, making truncation safe and effective.

alamb · 2026-05-04T18:00:18Z

Sorry @zhuqi-lucas - I am really struggling these days tor review all these large PRs. Last year it was very rare to see a more than 1000 line PR and now we get multiple such PRs a day.

I will try and find the time to review this more carefuly

alamb · 2026-05-04T18:00:46Z

run benchmark clickbench_partitioned

adriangbot · 2026-05-04T18:04:18Z

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4373271959-2014-w8h9b 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing feat/rg-reorder-by-statistics (f7d9156) to 0144570 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete

File an issue against this benchmark runner

adriangbot · 2026-05-04T18:19:51Z

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Details

Comparing HEAD and feat_rg-reorder-by-statistics
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃         feat_rg-reorder-by-statistics ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.23 / 4.82 ±7.04 / 18.90 ms │          1.21 / 4.82 ±7.05 / 18.92 ms │     no change │
│ QQuery 1  │        12.47 / 13.00 ±0.29 / 13.32 ms │        12.72 / 12.98 ±0.18 / 13.19 ms │     no change │
│ QQuery 2  │        38.63 / 39.11 ±0.35 / 39.47 ms │        37.97 / 38.31 ±0.34 / 38.80 ms │     no change │
│ QQuery 3  │        33.07 / 33.52 ±0.51 / 34.49 ms │        33.41 / 33.64 ±0.22 / 33.98 ms │     no change │
│ QQuery 4  │     242.23 / 249.99 ±9.73 / 267.45 ms │     271.06 / 274.49 ±3.09 / 278.63 ms │  1.10x slower │
│ QQuery 5  │     284.47 / 296.56 ±9.28 / 306.71 ms │     299.65 / 308.25 ±4.93 / 314.88 ms │     no change │
│ QQuery 6  │           6.96 / 7.23 ±0.18 / 7.46 ms │           6.65 / 7.27 ±0.56 / 8.05 ms │     no change │
│ QQuery 7  │        14.21 / 14.72 ±0.31 / 15.02 ms │        14.41 / 16.07 ±3.18 / 22.43 ms │  1.09x slower │
│ QQuery 8  │     356.10 / 358.94 ±2.33 / 363.04 ms │     328.55 / 340.34 ±8.92 / 354.19 ms │ +1.05x faster │
│ QQuery 9  │     487.77 / 497.42 ±8.53 / 511.30 ms │     439.88 / 453.05 ±9.21 / 467.33 ms │ +1.10x faster │
│ QQuery 10 │        76.69 / 80.03 ±5.15 / 90.25 ms │        74.55 / 77.13 ±3.18 / 83.38 ms │     no change │
│ QQuery 11 │        84.54 / 86.98 ±1.52 / 88.97 ms │        86.04 / 87.25 ±1.24 / 89.04 ms │     no change │
│ QQuery 12 │     282.89 / 284.46 ±1.23 / 285.94 ms │    275.05 / 289.89 ±11.97 / 305.74 ms │     no change │
│ QQuery 13 │     396.27 / 404.06 ±5.67 / 410.90 ms │     403.21 / 412.12 ±9.24 / 429.06 ms │     no change │
│ QQuery 14 │     282.35 / 286.07 ±3.02 / 291.18 ms │     287.05 / 296.09 ±7.05 / 306.18 ms │     no change │
│ QQuery 15 │     279.47 / 287.60 ±5.65 / 294.51 ms │    294.22 / 320.57 ±14.31 / 336.64 ms │  1.11x slower │
│ QQuery 16 │    638.58 / 658.97 ±22.69 / 690.33 ms │     667.64 / 674.35 ±6.76 / 683.20 ms │     no change │
│ QQuery 17 │    622.84 / 637.11 ±12.84 / 654.17 ms │    621.26 / 638.40 ±15.17 / 660.66 ms │     no change │
│ QQuery 18 │ 1241.87 / 1316.27 ±41.71 / 1358.85 ms │ 1258.81 / 1307.11 ±34.18 / 1348.22 ms │     no change │
│ QQuery 19 │        28.46 / 31.56 ±4.31 / 40.10 ms │        28.59 / 28.80 ±0.19 / 29.06 ms │ +1.10x faster │
│ QQuery 20 │     529.29 / 537.43 ±5.02 / 544.19 ms │     523.97 / 532.45 ±8.68 / 549.23 ms │     no change │
│ QQuery 21 │     607.66 / 614.19 ±5.53 / 621.01 ms │     605.27 / 612.57 ±8.27 / 625.13 ms │     no change │
│ QQuery 22 │ 1070.59 / 1085.27 ±13.00 / 1107.09 ms │ 1072.45 / 1090.41 ±10.78 / 1104.44 ms │     no change │
│ QQuery 23 │ 3314.00 / 3437.61 ±78.34 / 3523.07 ms │ 3387.58 / 3448.97 ±39.32 / 3497.47 ms │     no change │
│ QQuery 24 │        42.20 / 44.81 ±3.57 / 51.79 ms │        51.45 / 55.62 ±5.18 / 65.81 ms │  1.24x slower │
│ QQuery 25 │     113.70 / 114.49 ±0.66 / 115.32 ms │     128.16 / 128.61 ±0.37 / 129.21 ms │  1.12x slower │
│ QQuery 26 │        42.35 / 45.65 ±6.31 / 58.25 ms │        52.63 / 55.33 ±3.39 / 61.79 ms │  1.21x slower │
│ QQuery 27 │     674.46 / 685.83 ±7.29 / 695.20 ms │     688.07 / 695.41 ±4.37 / 701.69 ms │     no change │
│ QQuery 28 │ 3020.28 / 3048.63 ±23.33 / 3084.84 ms │ 3010.28 / 3074.39 ±33.31 / 3105.94 ms │     no change │
│ QQuery 29 │        43.07 / 43.94 ±1.04 / 45.91 ms │       43.87 / 51.79 ±10.11 / 69.72 ms │  1.18x slower │
│ QQuery 30 │     307.14 / 315.88 ±6.92 / 323.79 ms │     334.93 / 343.24 ±8.01 / 356.83 ms │  1.09x slower │
│ QQuery 31 │     308.72 / 314.42 ±4.56 / 321.89 ms │    314.12 / 323.87 ±12.27 / 348.06 ms │     no change │
│ QQuery 32 │ 1018.11 / 1070.76 ±36.10 / 1118.10 ms │  1005.36 / 1016.45 ±7.84 / 1027.16 ms │ +1.05x faster │
│ QQuery 33 │ 1496.48 / 1556.75 ±49.74 / 1616.42 ms │ 1436.75 / 1491.87 ±40.63 / 1545.17 ms │     no change │
│ QQuery 34 │ 1453.29 / 1513.04 ±45.83 / 1572.31 ms │ 1451.91 / 1498.81 ±46.92 / 1588.77 ms │     no change │
│ QQuery 35 │    295.44 / 321.06 ±19.17 / 353.20 ms │    328.30 / 358.07 ±40.25 / 436.97 ms │  1.12x slower │
│ QQuery 36 │        64.29 / 70.64 ±7.87 / 86.00 ms │        66.27 / 67.78 ±1.06 / 69.29 ms │     no change │
│ QQuery 37 │        35.20 / 37.71 ±2.70 / 41.59 ms │        34.99 / 37.66 ±2.90 / 41.75 ms │     no change │
│ QQuery 38 │        39.83 / 44.27 ±4.33 / 50.79 ms │        39.84 / 44.31 ±5.22 / 54.12 ms │     no change │
│ QQuery 39 │    124.07 / 143.17 ±12.43 / 157.24 ms │     136.46 / 143.86 ±6.15 / 153.13 ms │     no change │
│ QQuery 40 │        15.94 / 17.45 ±2.47 / 22.36 ms │        14.40 / 17.46 ±4.12 / 25.40 ms │     no change │
│ QQuery 41 │        14.73 / 15.13 ±0.27 / 15.55 ms │        13.61 / 15.07 ±2.48 / 20.01 ms │     no change │
│ QQuery 42 │        14.25 / 14.48 ±0.20 / 14.83 ms │        13.15 / 15.82 ±4.95 / 25.71 ms │  1.09x slower │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 20680.99ms │
│ Total Time (feat_rg-reorder-by-statistics)   │ 20740.75ms │
│ Average Time (HEAD)                          │   480.95ms │
│ Average Time (feat_rg-reorder-by-statistics) │   482.34ms │
│ Queries Faster                               │          4 │
│ Queries Slower                               │         10 │
│ Queries with No Change                       │         29 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

Metric	Value
Wall time	105.0s
Peak memory	30.8 GiB
Avg memory	23.5 GiB
CPU user	1091.4s
CPU sys	68.7s
Peak spill	0 B

clickbench_partitioned — branch

Metric	Value
Wall time	105.0s
Peak memory	29.3 GiB
Avg memory	22.9 GiB
CPU user	1090.4s
CPU sys	70.5s
Peak spill	0 B

File an issue against this benchmark runner

zhuqi-lucas · 2026-05-05T08:30:37Z

Sorry @zhuqi-lucas - I am really struggling these days tor review all these large PRs. Last year it was very rare to see a more than 1000 line PR and now we get multiple such PRs a day.

I will try and find the time to review this more carefuly

Thanks @alamb, no rush at all, I really appreciate you taking the time.

alamb · 2026-05-06T20:19:22Z

the benchmark results look mixed -- I'll see if that reproduces on a second run

alamb · 2026-05-06T20:19:31Z

run benchmark clickbench_partitioned

adriangbot · 2026-05-06T20:22:50Z

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4391783062-2042-6k9px 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing feat/rg-reorder-by-statistics (f7d9156) to 0144570 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete

File an issue against this benchmark runner

adriangbot · 2026-05-06T20:38:15Z

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Details

Comparing HEAD and feat_rg-reorder-by-statistics
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃         feat_rg-reorder-by-statistics ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.20 / 4.70 ±6.87 / 18.44 ms │          1.22 / 4.79 ±6.97 / 18.74 ms │     no change │
│ QQuery 1  │        12.75 / 13.17 ±0.22 / 13.34 ms │        12.40 / 12.90 ±0.28 / 13.23 ms │     no change │
│ QQuery 2  │        36.81 / 37.18 ±0.33 / 37.61 ms │        37.22 / 37.42 ±0.25 / 37.88 ms │     no change │
│ QQuery 3  │        31.71 / 32.19 ±0.54 / 33.22 ms │        31.94 / 32.44 ±0.39 / 32.96 ms │     no change │
│ QQuery 4  │     247.78 / 250.01 ±2.06 / 253.50 ms │     253.69 / 257.03 ±1.73 / 258.56 ms │     no change │
│ QQuery 5  │     287.78 / 289.52 ±1.03 / 290.85 ms │     289.42 / 292.64 ±2.07 / 295.08 ms │     no change │
│ QQuery 6  │           7.29 / 7.81 ±0.67 / 9.10 ms │           6.90 / 7.16 ±0.18 / 7.46 ms │ +1.09x faster │
│ QQuery 7  │        14.28 / 14.36 ±0.07 / 14.49 ms │        13.95 / 14.08 ±0.09 / 14.16 ms │     no change │
│ QQuery 8  │     336.38 / 340.04 ±2.13 / 342.05 ms │     341.09 / 343.49 ±1.90 / 345.33 ms │     no change │
│ QQuery 9  │     461.94 / 467.43 ±5.31 / 477.00 ms │     467.44 / 472.74 ±3.67 / 477.75 ms │     no change │
│ QQuery 10 │        74.48 / 76.10 ±1.88 / 79.71 ms │        75.52 / 77.16 ±2.19 / 81.23 ms │     no change │
│ QQuery 11 │        85.73 / 86.07 ±0.26 / 86.42 ms │        85.99 / 87.54 ±1.12 / 89.28 ms │     no change │
│ QQuery 12 │     280.98 / 285.44 ±3.52 / 290.79 ms │     282.81 / 287.61 ±5.21 / 296.21 ms │     no change │
│ QQuery 13 │     404.18 / 411.13 ±9.26 / 429.29 ms │     405.10 / 414.80 ±8.21 / 425.37 ms │     no change │
│ QQuery 14 │     292.37 / 297.84 ±3.26 / 301.90 ms │     293.62 / 295.19 ±1.64 / 297.82 ms │     no change │
│ QQuery 15 │     291.47 / 294.53 ±2.30 / 297.44 ms │     291.66 / 300.90 ±7.54 / 312.14 ms │     no change │
│ QQuery 16 │     627.68 / 641.18 ±9.98 / 655.69 ms │     639.85 / 641.51 ±1.67 / 644.29 ms │     no change │
│ QQuery 17 │     638.42 / 647.21 ±7.23 / 655.68 ms │     640.67 / 645.68 ±3.67 / 651.56 ms │     no change │
│ QQuery 18 │ 1290.83 / 1314.60 ±14.71 / 1329.99 ms │ 1297.43 / 1309.34 ±10.13 / 1325.00 ms │     no change │
│ QQuery 19 │        29.23 / 32.41 ±5.77 / 43.94 ms │        29.21 / 34.33 ±5.83 / 44.03 ms │  1.06x slower │
│ QQuery 20 │     522.82 / 528.86 ±6.61 / 541.51 ms │     523.02 / 528.33 ±3.26 / 532.18 ms │     no change │
│ QQuery 21 │     603.90 / 608.73 ±2.96 / 612.83 ms │     603.88 / 609.67 ±6.41 / 621.84 ms │     no change │
│ QQuery 22 │  1081.15 / 1087.43 ±6.25 / 1096.41 ms │  1068.50 / 1080.17 ±6.12 / 1086.37 ms │     no change │
│ QQuery 23 │ 3384.13 / 3408.97 ±14.66 / 3428.08 ms │ 3395.45 / 3408.41 ±11.08 / 3423.79 ms │     no change │
│ QQuery 24 │        42.03 / 42.47 ±0.33 / 42.83 ms │       51.50 / 61.01 ±13.32 / 86.16 ms │  1.44x slower │
│ QQuery 25 │     114.26 / 119.46 ±5.17 / 128.73 ms │     127.97 / 132.42 ±4.01 / 139.39 ms │  1.11x slower │
│ QQuery 26 │        42.83 / 44.42 ±1.62 / 47.38 ms │        52.32 / 53.26 ±1.06 / 55.01 ms │  1.20x slower │
│ QQuery 27 │     679.50 / 681.37 ±1.81 / 684.19 ms │     669.76 / 682.80 ±8.33 / 693.59 ms │     no change │
│ QQuery 28 │ 3012.02 / 3049.65 ±22.48 / 3077.02 ms │ 3027.95 / 3046.49 ±13.96 / 3063.12 ms │     no change │
│ QQuery 29 │       42.80 / 49.16 ±11.04 / 71.17 ms │        42.94 / 45.42 ±3.96 / 53.26 ms │ +1.08x faster │
│ QQuery 30 │     313.88 / 318.51 ±3.26 / 323.73 ms │     316.46 / 319.91 ±4.37 / 328.18 ms │     no change │
│ QQuery 31 │     307.95 / 317.28 ±6.52 / 325.61 ms │     307.66 / 313.96 ±3.83 / 318.14 ms │     no change │
│ QQuery 32 │ 1030.36 / 1047.67 ±22.70 / 1092.49 ms │ 1040.24 / 1052.69 ±10.45 / 1069.35 ms │     no change │
│ QQuery 33 │ 1451.84 / 1477.87 ±15.35 / 1498.49 ms │ 1469.74 / 1503.70 ±41.90 / 1584.18 ms │     no change │
│ QQuery 34 │ 1482.73 / 1500.06 ±11.41 / 1518.36 ms │ 1476.76 / 1492.89 ±15.81 / 1516.11 ms │     no change │
│ QQuery 35 │    310.09 / 330.24 ±12.15 / 344.45 ms │     301.82 / 307.84 ±4.94 / 315.16 ms │ +1.07x faster │
│ QQuery 36 │        64.00 / 65.09 ±0.85 / 66.22 ms │        64.17 / 66.87 ±1.97 / 68.78 ms │     no change │
│ QQuery 37 │        35.91 / 41.29 ±5.84 / 52.66 ms │        36.03 / 42.85 ±5.55 / 50.91 ms │     no change │
│ QQuery 38 │        43.19 / 45.62 ±2.60 / 50.61 ms │        41.06 / 44.60 ±3.39 / 50.81 ms │     no change │
│ QQuery 39 │     131.17 / 140.16 ±6.52 / 150.63 ms │     126.02 / 138.27 ±9.57 / 151.72 ms │     no change │
│ QQuery 40 │        15.14 / 18.55 ±3.41 / 23.30 ms │        14.85 / 15.28 ±0.53 / 16.29 ms │ +1.21x faster │
│ QQuery 41 │        14.55 / 16.04 ±2.52 / 21.05 ms │        14.25 / 16.53 ±2.79 / 21.83 ms │     no change │
│ QQuery 42 │        13.94 / 14.63 ±1.03 / 16.68 ms │        13.86 / 15.94 ±3.84 / 23.61 ms │  1.09x slower │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 20496.45ms │
│ Total Time (feat_rg-reorder-by-statistics)   │ 20548.05ms │
│ Average Time (HEAD)                          │   476.66ms │
│ Average Time (feat_rg-reorder-by-statistics) │   477.86ms │
│ Queries Faster                               │          4 │
│ Queries Slower                               │          5 │
│ Queries with No Change                       │         34 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

Metric	Value
Wall time	105.0s
Peak memory	29.0 GiB
Avg memory	23.1 GiB
CPU user	1079.7s
CPU sys	67.2s
Peak spill	0 B

clickbench_partitioned — branch

Metric	Value
Wall time	105.0s
Peak memory	30.0 GiB
Avg memory	23.2 GiB
CPU user	1082.3s
CPU sys	68.1s
Peak spill	0 B

File an issue against this benchmark runner

zhuqi-lucas · 2026-05-07T07:21:14Z

the benchmark results look mixed -- I'll see if that reproduces on a second run

Thanks @alamb , i update the PR to skip reorder when overlap is heavy, and let me trigger again to see if it's the reason.

zhuqi-lucas · 2026-05-07T07:21:23Z

run benchmark clickbench_partitioned

adriangbot · 2026-05-07T07:24:19Z

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4395019893-2044-wkdsl 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing feat/rg-reorder-by-statistics (c8fd321) to 0c38ebb (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete

File an issue against this benchmark runner

adriangbot · 2026-05-07T07:26:22Z

Benchmark for this request failed.

Last 20 lines of output:

Click to expand

   Compiling async-compression v0.4.42
   Compiling mimalloc v0.1.50
   Compiling parquet v58.2.0
   Compiling datafusion-common v53.1.0 (/workspace/datafusion-branch/datafusion/common)
   Compiling datafusion-expr-common v53.1.0 (/workspace/datafusion-branch/datafusion/expr-common)
   Compiling datafusion-physical-expr-common v53.1.0 (/workspace/datafusion-branch/datafusion/physical-expr-common)
   Compiling datafusion-functions-aggregate-common v53.1.0 (/workspace/datafusion-branch/datafusion/functions-aggregate-common)
   Compiling datafusion-functions-window-common v53.1.0 (/workspace/datafusion-branch/datafusion/functions-window-common)
   Compiling datafusion-expr v53.1.0 (/workspace/datafusion-branch/datafusion/expr)
   Compiling datafusion-physical-expr v53.1.0 (/workspace/datafusion-branch/datafusion/physical-expr)
   Compiling datafusion-execution v53.1.0 (/workspace/datafusion-branch/datafusion/execution)
error[E0063]: missing fields `fetch` and `sort_options` in initializer of `DynamicFilterPhysicalExpr`
   --> datafusion/physical-expr/src/expressions/dynamic_filters.rs:457:9
    |
457 |         Self {
    |         ^^^^ missing `fetch` and `sort_options`

For more information about this error, try `rustc --explain E0063`.
error: could not compile `datafusion-physical-expr` (lib) due to 1 previous error
warning: build failed, waiting for other jobs to finish...

File an issue against this benchmark runner

zhuqi-lucas · 2026-05-07T07:56:51Z

run benchmark clickbench_partitioned

adriangbot · 2026-05-07T08:00:06Z

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4395242557-2045-x5dxd 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing feat/rg-reorder-by-statistics (4fac37b) to 3b634aa (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete

File an issue against this benchmark runner

adriangbot · 2026-05-07T08:21:01Z

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Details

Comparing HEAD and feat_rg-reorder-by-statistics
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃         feat_rg-reorder-by-statistics ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.17 / 4.64 ±6.78 / 18.19 ms │          1.18 / 4.64 ±6.77 / 18.18 ms │     no change │
│ QQuery 1  │        12.74 / 12.91 ±0.12 / 13.07 ms │        13.03 / 13.28 ±0.14 / 13.46 ms │     no change │
│ QQuery 2  │        35.86 / 36.47 ±0.48 / 37.25 ms │        36.47 / 36.68 ±0.19 / 37.02 ms │     no change │
│ QQuery 3  │        30.84 / 31.38 ±0.64 / 32.65 ms │        30.91 / 31.05 ±0.08 / 31.14 ms │     no change │
│ QQuery 4  │     232.70 / 236.25 ±3.06 / 241.83 ms │     233.22 / 234.32 ±0.79 / 235.30 ms │     no change │
│ QQuery 5  │     279.72 / 281.78 ±1.81 / 284.54 ms │     276.91 / 280.80 ±3.00 / 285.58 ms │     no change │
│ QQuery 6  │           6.77 / 7.03 ±0.19 / 7.28 ms │           6.67 / 7.32 ±0.40 / 7.93 ms │     no change │
│ QQuery 7  │        13.90 / 13.95 ±0.04 / 14.01 ms │        14.09 / 14.27 ±0.12 / 14.45 ms │     no change │
│ QQuery 8  │     318.01 / 323.88 ±3.52 / 327.40 ms │     315.04 / 318.04 ±2.01 / 320.95 ms │     no change │
│ QQuery 9  │     455.54 / 459.86 ±3.17 / 463.50 ms │     446.87 / 455.95 ±4.81 / 461.11 ms │     no change │
│ QQuery 10 │        70.11 / 71.04 ±0.81 / 72.49 ms │        69.85 / 70.59 ±0.75 / 71.72 ms │     no change │
│ QQuery 11 │        79.31 / 81.02 ±1.24 / 82.69 ms │        81.09 / 81.49 ±0.30 / 81.82 ms │     no change │
│ QQuery 12 │     271.17 / 277.33 ±5.06 / 286.31 ms │     271.78 / 276.31 ±3.98 / 282.18 ms │     no change │
│ QQuery 13 │     385.11 / 389.27 ±4.84 / 398.56 ms │    379.76 / 391.53 ±11.83 / 413.39 ms │     no change │
│ QQuery 14 │     277.49 / 281.96 ±5.86 / 293.26 ms │     279.02 / 281.91 ±3.46 / 288.60 ms │     no change │
│ QQuery 15 │     277.09 / 283.44 ±7.77 / 298.74 ms │    279.24 / 292.29 ±16.55 / 324.16 ms │     no change │
│ QQuery 16 │     602.36 / 608.37 ±5.41 / 618.30 ms │     598.97 / 609.14 ±6.33 / 616.06 ms │     no change │
│ QQuery 17 │     597.12 / 606.79 ±5.90 / 613.61 ms │     605.86 / 612.02 ±4.38 / 617.78 ms │     no change │
│ QQuery 18 │ 1176.93 / 1203.07 ±17.94 / 1228.67 ms │ 1192.41 / 1212.50 ±16.52 / 1241.91 ms │     no change │
│ QQuery 19 │        27.97 / 29.61 ±2.84 / 35.30 ms │        28.61 / 33.03 ±5.34 / 40.93 ms │  1.12x slower │
│ QQuery 20 │    517.04 / 534.38 ±20.06 / 572.51 ms │     517.60 / 521.11 ±2.95 / 525.47 ms │     no change │
│ QQuery 21 │     593.63 / 600.06 ±6.83 / 613.09 ms │     592.76 / 596.20 ±2.79 / 600.88 ms │     no change │
│ QQuery 22 │  1054.85 / 1065.40 ±6.04 / 1071.06 ms │  1057.93 / 1061.61 ±3.43 / 1067.11 ms │     no change │
│ QQuery 23 │ 3179.68 / 3199.56 ±19.83 / 3234.14 ms │ 3179.00 / 3198.97 ±17.74 / 3231.12 ms │     no change │
│ QQuery 24 │        41.46 / 45.28 ±6.36 / 57.95 ms │       50.98 / 59.23 ±12.31 / 82.98 ms │  1.31x slower │
│ QQuery 25 │     111.80 / 116.28 ±6.84 / 129.76 ms │     125.66 / 127.45 ±1.84 / 130.99 ms │  1.10x slower │
│ QQuery 26 │        42.68 / 44.11 ±1.24 / 46.17 ms │        51.71 / 54.22 ±2.30 / 57.85 ms │  1.23x slower │
│ QQuery 27 │     667.87 / 673.95 ±3.71 / 678.18 ms │     667.56 / 677.21 ±5.68 / 683.81 ms │     no change │
│ QQuery 28 │ 3016.59 / 3042.43 ±17.92 / 3062.27 ms │ 3012.13 / 3034.15 ±20.18 / 3058.70 ms │     no change │
│ QQuery 29 │        41.77 / 47.71 ±6.74 / 56.79 ms │        42.23 / 42.63 ±0.38 / 43.35 ms │ +1.12x faster │
│ QQuery 30 │     298.30 / 305.97 ±4.71 / 313.11 ms │     305.36 / 307.69 ±1.86 / 310.87 ms │     no change │
│ QQuery 31 │     292.07 / 297.00 ±3.46 / 302.75 ms │     292.94 / 303.92 ±5.58 / 308.42 ms │     no change │
│ QQuery 32 │    924.04 / 939.96 ±14.42 / 962.45 ms │     924.15 / 933.31 ±4.61 / 936.34 ms │     no change │
│ QQuery 33 │ 1427.23 / 1447.61 ±14.59 / 1461.84 ms │ 1442.44 / 1456.91 ±13.11 / 1472.40 ms │     no change │
│ QQuery 34 │ 1441.71 / 1465.05 ±20.53 / 1499.54 ms │  1443.15 / 1457.73 ±9.33 / 1470.00 ms │     no change │
│ QQuery 35 │     289.98 / 293.98 ±4.21 / 300.35 ms │     285.80 / 298.74 ±9.90 / 313.48 ms │     no change │
│ QQuery 36 │        62.03 / 69.22 ±5.26 / 76.69 ms │        62.26 / 65.20 ±2.70 / 69.95 ms │ +1.06x faster │
│ QQuery 37 │        34.98 / 38.29 ±4.73 / 47.66 ms │        36.23 / 41.25 ±6.23 / 50.43 ms │  1.08x slower │
│ QQuery 38 │        41.57 / 43.01 ±1.07 / 44.34 ms │        41.45 / 46.92 ±4.19 / 54.05 ms │  1.09x slower │
│ QQuery 39 │     123.13 / 134.85 ±6.70 / 140.97 ms │     133.31 / 136.35 ±3.01 / 141.58 ms │     no change │
│ QQuery 40 │        14.40 / 15.87 ±1.48 / 18.32 ms │        14.49 / 16.58 ±2.14 / 19.67 ms │     no change │
│ QQuery 41 │        13.96 / 16.76 ±4.46 / 25.53 ms │        13.98 / 14.12 ±0.13 / 14.29 ms │ +1.19x faster │
│ QQuery 42 │        13.46 / 13.91 ±0.31 / 14.29 ms │        13.46 / 14.29 ±1.30 / 16.88 ms │     no change │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 19690.71ms │
│ Total Time (feat_rg-reorder-by-statistics)   │ 19723.00ms │
│ Average Time (HEAD)                          │   457.92ms │
│ Average Time (feat_rg-reorder-by-statistics) │   458.67ms │
│ Queries Faster                               │          3 │
│ Queries Slower                               │          6 │
│ Queries with No Change                       │         34 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

Metric	Value
Wall time	100.0s
Peak memory	30.3 GiB
Avg memory	23.3 GiB
CPU user	1034.8s
CPU sys	64.5s
Peak spill	0 B

clickbench_partitioned — branch

Metric	Value
Wall time	100.0s
Peak memory	30.8 GiB
Avg memory	23.2 GiB
CPU user	1037.7s
CPU sys	64.0s
Peak spill	0 B

File an issue against this benchmark runner

zhuqi-lucas · 2026-05-07T09:59:53Z

run benchmark clickbench_partitioned

adriangbot · 2026-05-07T10:03:09Z

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4396120969-2047-dtfjq 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing feat/rg-reorder-by-statistics (6987aa2) to 3b634aa (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete

File an issue against this benchmark runner

adriangbot · 2026-05-07T10:20:01Z

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Details

Comparing HEAD and feat_rg-reorder-by-statistics
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃         feat_rg-reorder-by-statistics ┃       Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.20 / 4.54 ±6.64 / 17.82 ms │          1.18 / 4.53 ±6.64 / 17.80 ms │    no change │
│ QQuery 1  │        12.65 / 12.95 ±0.21 / 13.29 ms │        12.69 / 12.80 ±0.08 / 12.94 ms │    no change │
│ QQuery 2  │        35.28 / 35.76 ±0.40 / 36.38 ms │        36.42 / 36.85 ±0.24 / 37.13 ms │    no change │
│ QQuery 3  │        30.06 / 31.22 ±0.67 / 32.11 ms │        30.29 / 30.98 ±0.64 / 32.10 ms │    no change │
│ QQuery 4  │     226.08 / 231.19 ±4.52 / 238.42 ms │     224.28 / 230.47 ±3.90 / 235.85 ms │    no change │
│ QQuery 5  │     274.93 / 277.88 ±2.84 / 282.48 ms │     271.91 / 274.81 ±2.46 / 278.80 ms │    no change │
│ QQuery 6  │           6.70 / 7.19 ±0.38 / 7.88 ms │           6.96 / 7.19 ±0.20 / 7.45 ms │    no change │
│ QQuery 7  │        14.65 / 14.73 ±0.05 / 14.80 ms │        13.89 / 14.00 ±0.07 / 14.07 ms │    no change │
│ QQuery 8  │     310.14 / 311.16 ±1.04 / 312.56 ms │     309.27 / 311.98 ±2.40 / 316.25 ms │    no change │
│ QQuery 9  │     439.85 / 445.48 ±3.05 / 448.27 ms │     441.40 / 447.12 ±3.32 / 450.62 ms │    no change │
│ QQuery 10 │        69.32 / 70.75 ±1.75 / 74.14 ms │        67.98 / 68.91 ±0.74 / 70.01 ms │    no change │
│ QQuery 11 │        79.46 / 81.95 ±2.52 / 86.49 ms │        79.14 / 81.80 ±3.87 / 89.44 ms │    no change │
│ QQuery 12 │     266.87 / 272.45 ±4.43 / 278.74 ms │     266.41 / 271.18 ±3.75 / 277.01 ms │    no change │
│ QQuery 13 │     379.90 / 383.16 ±3.69 / 389.79 ms │     375.59 / 387.25 ±8.17 / 395.91 ms │    no change │
│ QQuery 14 │     277.36 / 279.44 ±2.48 / 284.23 ms │     275.12 / 277.48 ±3.51 / 284.43 ms │    no change │
│ QQuery 15 │     274.41 / 282.55 ±7.97 / 296.75 ms │     270.13 / 275.62 ±2.99 / 278.56 ms │    no change │
│ QQuery 16 │     596.55 / 602.55 ±4.30 / 606.63 ms │     594.32 / 600.81 ±6.79 / 610.42 ms │    no change │
│ QQuery 17 │     592.31 / 601.73 ±6.66 / 610.27 ms │     596.44 / 601.46 ±4.97 / 610.21 ms │    no change │
│ QQuery 18 │ 1178.91 / 1199.76 ±12.20 / 1212.30 ms │ 1169.33 / 1197.90 ±22.74 / 1236.30 ms │    no change │
│ QQuery 19 │        27.95 / 30.09 ±3.53 / 37.11 ms │        27.82 / 30.38 ±3.23 / 36.60 ms │    no change │
│ QQuery 20 │     512.92 / 519.73 ±8.68 / 536.84 ms │     512.18 / 521.16 ±4.74 / 525.35 ms │    no change │
│ QQuery 21 │     588.58 / 592.61 ±2.29 / 595.25 ms │     589.30 / 592.77 ±3.88 / 599.85 ms │    no change │
│ QQuery 22 │  1040.85 / 1056.30 ±8.94 / 1066.12 ms │ 1041.61 / 1063.66 ±20.53 / 1102.01 ms │    no change │
│ QQuery 23 │ 3118.55 / 3157.66 ±24.96 / 3194.14 ms │ 3144.45 / 3156.48 ±10.48 / 3169.82 ms │    no change │
│ QQuery 24 │        42.08 / 42.40 ±0.52 / 43.43 ms │        41.35 / 41.85 ±0.45 / 42.62 ms │    no change │
│ QQuery 25 │     110.58 / 111.78 ±0.76 / 112.74 ms │     109.72 / 113.20 ±3.10 / 117.57 ms │    no change │
│ QQuery 26 │        42.96 / 43.92 ±0.83 / 45.38 ms │        42.11 / 42.86 ±0.73 / 44.27 ms │    no change │
│ QQuery 27 │     667.69 / 670.60 ±2.81 / 675.36 ms │     670.00 / 680.00 ±6.94 / 688.73 ms │    no change │
│ QQuery 28 │ 2998.13 / 3026.91 ±19.25 / 3056.91 ms │ 2991.41 / 3011.03 ±10.55 / 3023.18 ms │    no change │
│ QQuery 29 │        41.38 / 45.32 ±4.68 / 51.80 ms │       41.51 / 51.16 ±18.67 / 88.48 ms │ 1.13x slower │
│ QQuery 30 │     297.64 / 303.09 ±5.28 / 310.03 ms │     298.33 / 301.00 ±3.74 / 308.39 ms │    no change │
│ QQuery 31 │     285.72 / 291.25 ±5.58 / 301.58 ms │     284.19 / 294.46 ±6.25 / 300.59 ms │    no change │
│ QQuery 32 │    900.75 / 920.82 ±20.37 / 959.83 ms │    902.56 / 918.05 ±12.10 / 937.23 ms │    no change │
│ QQuery 33 │  1419.13 / 1423.97 ±4.76 / 1433.02 ms │ 1424.74 / 1463.19 ±42.15 / 1540.75 ms │    no change │
│ QQuery 34 │ 1418.53 / 1441.87 ±12.92 / 1456.64 ms │ 1422.49 / 1435.21 ±12.63 / 1454.89 ms │    no change │
│ QQuery 35 │     285.31 / 289.58 ±4.05 / 296.74 ms │    278.13 / 303.63 ±41.76 / 386.76 ms │    no change │
│ QQuery 36 │        61.88 / 65.00 ±3.48 / 71.56 ms │        61.72 / 64.21 ±2.61 / 69.08 ms │    no change │
│ QQuery 37 │        34.74 / 36.41 ±2.20 / 40.71 ms │        34.61 / 34.87 ±0.17 / 35.12 ms │    no change │
│ QQuery 38 │        40.21 / 43.86 ±2.86 / 47.05 ms │        43.50 / 49.88 ±6.79 / 58.78 ms │ 1.14x slower │
│ QQuery 39 │     124.07 / 130.60 ±4.69 / 137.02 ms │     122.69 / 130.08 ±5.30 / 138.70 ms │    no change │
│ QQuery 40 │        14.07 / 14.29 ±0.27 / 14.80 ms │        13.63 / 16.02 ±3.44 / 22.85 ms │ 1.12x slower │
│ QQuery 41 │        13.53 / 13.65 ±0.12 / 13.82 ms │        13.30 / 13.74 ±0.32 / 14.26 ms │    no change │
│ QQuery 42 │       13.04 / 18.95 ±11.22 / 41.38 ms │        13.24 / 18.26 ±9.74 / 37.75 ms │    no change │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 19437.10ms │
│ Total Time (feat_rg-reorder-by-statistics)   │ 19480.29ms │
│ Average Time (HEAD)                          │   452.03ms │
│ Average Time (feat_rg-reorder-by-statistics) │   453.03ms │
│ Queries Faster                               │          0 │
│ Queries Slower                               │          3 │
│ Queries with No Change                       │         40 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

Metric	Value
Wall time	100.0s
Peak memory	30.7 GiB
Avg memory	23.2 GiB
CPU user	1025.5s
CPU sys	60.0s
Peak spill	0 B

clickbench_partitioned — branch

Metric	Value
Wall time	100.0s
Peak memory	31.7 GiB
Avg memory	23.3 GiB
CPU user	1025.6s
CPU sys	60.4s
Peak spill	0 B

File an issue against this benchmark runner

zhuqi-lucas · 2026-05-07T12:08:58Z

The benchmark is ok now.

adriangb

Main concerns:

I don't think we should be (IMO) abusing dynamic filters to pass sort information.
I don't think the traits introduced are helpful.

Can you explain why we need to use dynamic filters like this?

adriangb · 2026-05-14T23:00:04Z

+            match arrow::compute::sort_to_indices(&stat_mins, Some(sort_options), None) {
+                Ok(indices) => indices,
+                Err(e) => {
+                    debug!("Skipping RG reorder: sort failed: {e}");


Is this expected to happen normally? If not maybe this should be an info or a warn. Maybe we even add a debug assertion so it fails in our CI?

adriangb · 2026-05-14T23:00:15Z

+        let stat_mins = match converter.row_group_mins(rg_metadata.iter().copied()) {
+            Ok(vals) => vals,
+            Err(e) => {
+                debug!("Skipping RG reorder: cannot get min values: {e}");


Same note as https://github.com/apache/datafusion/pull/21956/changes#r3244789303.

If this is expected a docstring explaining why would be great. If unexpected... then we should at least error in CI.

adriangb · 2026-05-14T23:02:19Z

+        file_metadata: &ParquetMetaData,
+        _arrow_schema: &Schema,
+    ) -> Result<PreparedAccessPlan> {
+        plan.reverse(file_metadata)


I like the idea but is it really worth having a trait just to hide 1 line behind it?

adriangb · 2026-05-14T23:03:24Z

+        // 1. reorder_by_statistics: sort RGs by min values (ASC) to align
+        //    with the file's declared output ordering. This fixes out-of-order


This comment doesn't make sense to me. A file either declares an output ordering or not. If it does the row groups should already be sorted by it. Should this say the query's desired output ordering?

adriangb · 2026-05-14T23:04:30Z

+        } else if let Some(predicate) = &prepared.predicate
+            && let Some(df) = find_dynamic_filter(predicate)
+            && let Some(sort_options) = df.sort_options()
+            && !sort_options.is_empty()
+        {
+            // Build a sort order from DynamicFilter for non-sort-pushdown TopK.
+            // Quick bail: check if the sort column exists in file schema.


Why is this needed? It seems kinda hacky / a side channel when we already have sort pushdown APIs. What is "non-sort-pushdown TopK"?

adriangb · 2026-05-14T23:05:26Z

+        let reverse_optimizer: Option<
+            Box<dyn crate::access_plan_optimizer::AccessPlanOptimizer>,
+        > = if is_descending {
+            Some(Box::new(crate::access_plan_optimizer::ReverseRowGroups))
+        } else {
+            None
+        };


I'm not convinced by this abstraction. The implementations are trivial, it's more code overall and it also obscures the code execution.

adriangb · 2026-05-14T23:05:57Z

+        &self,
+        plan: PreparedAccessPlan,
+        file_metadata: &ParquetMetaData,
+        _arrow_schema: &Schema,


It's also smelly that we have 2 implementations and the arguments / signatures already don't match.

adriangb · 2026-05-14T23:06:39Z

 }

+/// Find a `DynamicFilterPhysicalExpr` in the expression tree.
+fn find_dynamic_filter(


Is it safe to just blindly recurse like this? What if the expression is something like NOT (dynamic_filter) or whatever?

adriangb · 2026-05-14T23:07:49Z

+    ///
+    /// Sort options indicate the sort direction for each child expression,
+    /// enabling downstream consumers (e.g., parquet readers) to reorder
+    /// row groups by statistics for TopK queries.


Again not a fan of this way of passing information around.

adriangb · 2026-05-14T23:18:03Z

I wonder if this belongs on try_pushdown_sort rather than on DynamicFilterPhysicalExpr. The PushdownSort rule already visits SortExec → DataSourceExec (TopK or not) and forwards the required ordering to the source via try_pushdown_sort(order, eq_properties). Today parquet's implementation only accepts when the request matches the file's declared natural ordering (or its reverse). The case this PR is targeting — request doesn't match declared ordering (or the table declares none), but the source can still reorder row groups/files by min/max stats — fits the existing Inexact contract: keep the SortExec, but feed it data in a better order. The rule already preserves sort_exec.fetch() on the rebuilt SortExec, so TopK keeps working.

Concretely: in ParquetSource::try_pushdown_sort, after the existing Exact/reverse-Inexact checks fall through, instead of returning Unsupported, check whether the requested sort column exists in the file schema (the same check the opener fallback is already doing) and return Inexact with sort_order_for_reorder set. The opener already prefers sort_order_for_reorder over the dynamic filter — that whole branch in opener.rs (the find_dynamic_filter / find_column_in_expr fallback) can go away, along with the sort_options/fetch fields on DynamicFilterPhysicalExpr and the reordering of with_fetch to call create_filter after setting fetch.

A few reasons I think this factoring is cleaner:

DynamicFilterPhysicalExpr's job is "live threshold for runtime pruning." Adding sort-direction and K fields to it conflates that with "how should the source schedule its reads." try_pushdown_sort is the channel that already means the latter.
The reorder is useful even without a LIMIT — less sorting work and better locality help any ORDER BY. Routing through the dynamic filter ties the optimization to TopK; routing through sort pushdown makes it apply to any SortExec → DataSource shape.
The opener never actually reads df.fetch() — only sort_options and the child column — so the only piece of metadata being moved is one the sort-pushdown call already passes as its order argument.
Fewer places to maintain: no proto-roundtrip carve-out, no find_dynamic_filter tree walk under AND/wrappers (which is the kind of thing that quietly breaks the next time someone wraps the predicate), and no ordering coupling between with_fetch and create_filter.

Happy to be wrong here — is there a case I'm missing where the sort metadata can reach the parquet source through the dynamic filter but not through try_pushdown_sort?

zhuqi-lucas · 2026-05-15T03:32:17Z

Thanks @adriangb, you're right. Routing through try_pushdown_sort matches what that channel is for, decouples the reorder from TopK (so it helps any ORDER BY on a sorted source, not just LIMIT-K), and lets the sort_options/fetch fields on DynamicFilterPhysicalExpr go away — along with the find_dynamic_filter walk under AND/wrappers and the with_fetch/create_filter ordering coupling, which were the parts I was least happy about anyway.

Can't think of a case where the sort metadata reaches parquet through the dynamic filter but not through try_pushdown_sort — same SortExec → DataSource shape, dynamic filter just happens to be what SortExec attaches to its predicate.

Let me refactor along the lines you sketched.

When a parquet file has multiple row groups with out-of-order or overlapping statistics, TopK queries benefit from reading "best" row groups first so the dynamic filter threshold tightens quickly. This PR adds: 1. `reorder_by_statistics`: sorts row groups by min values (ASC) based on parquet column statistics. Direction (DESC) is handled by the existing `reverse()` applied after reorder. The two steps compose: - Sorted data: reorder is a no-op, reverse gives perfect DESC order - Unsorted data: reorder fixes the order, reverse flips for DESC 2. `AccessPlanOptimizer` trait: extensible interface for row group access plan optimizations (reorder, reverse) applied after pruning. 3. `DynamicFilterPhysicalExpr.sort_options/fetch`: SortExec now passes sort direction and fetch limit to the dynamic filter, enabling the parquet reader to determine reorder direction for any TopK query. 4. `FileSource::reorder_files`: file-level reordering in the shared work queue so multi-file TopK reads the most promising files first. 5. Fix `SortExec::with_fetch` ordering: fetch must be set before `create_filter()` so the DynamicFilter gets the correct K value.

Adds an overlap guard to `PreparedAccessPlan::reorder_by_statistics`: when sorted-by-min adjacent row-group `[min, max]` ranges overlap above 50%, reordering cannot enable RG-level pruning (every "later" RG still has values that could appear in TopK results) so the reorder cost — CPU sort, lost IO sequential locality, and parallel scheduling pessimization across workers all pulling "best" RGs first — dominates, producing a net regression. This addresses the ClickBench `hits_partitioned` regressions (Q24, Q25, Q26: 1.11x-1.44x slower) where `EventTime` and `SearchPhrase` are uniformly distributed across row groups via the user-id hash partitioning, so RG min/max ranges fully overlap and reorder cannot help. Sorted / non-overlapping data continues to take the reorder path — verified by `sort_pushdown.slt` Test H still passing. The overlap helper treats null mins/maxes (RGs without statistics) conservatively as overlaps, so missing stats discourage rather than silently disable the guard. Adds 7 unit tests covering: fully disjoint, disjoint after reorder, fully overlapping, partial overlap, null max in previous, null min in next, and the n<2 edge case.

Upstream `proto: serialize and dedupe dynamic filters v2 (apache#21807)` added a new `DynamicFilterPhysicalExpr::from_parts` constructor that builds the struct via a `Self { ... }` literal. After this PR's rebase brings that commit in, the new initializer is missing the `sort_options` and `fetch` fields this PR adds, breaking the build (E0063). Default both to `None` in `from_parts` — the proto wire format does not yet carry these fields, so reconstruction cannot recover them. Callers that need sort metadata still go through `new_with_sort_options`. Proto round-trip tests (162 in `datafusion-proto`) continue to pass.

Mirrors the row-group-level overlap guard at the file level. When adjacent file `[min, max]` ranges (in sorted-by-min order) overlap above 50%, file-level reorder cannot enable file pruning and the reorder cost (CPU sort + lost IO sequential locality + parallel work-stealing pessimization) dominates. This addresses the remaining ClickBench `hits_partitioned` regressions (Q24, Q25, Q26 still 1.10x-1.31x slower after the RG-level guard landed). `hits_partitioned` is partitioned by user-id hash, so each file covers the full range of `EventTime` and `SearchPhrase` — file ranges fully overlap and reorder cannot help. Files lacking statistics are conservatively counted as overlaps so missing stats discourage rather than silently disable the guard. Adds 6 unit tests (disjoint sorted, disjoint after reorder, fully overlapping, partial below threshold, missing stats, n<2).

@adriangb

Per @adriangb's review on PR apache#21956, the sort-direction + fetch metadata that drives parquet row-group reorder belongs on the sort-pushdown channel that already exists, not bolted onto `DynamicFilterPhysicalExpr` whose job is runtime threshold pruning. Concretely: - `ParquetSource::try_pushdown_sort` grows a third branch. After the existing Exact and reverse-Inexact checks fall through, if the requested sort column is a plain `Column` present in the file schema, the source now returns `Inexact` with `sort_order_for_reorder` set (and `reverse_row_groups` for DESC), instead of `Unsupported`. The reorder benefit now applies to any `ORDER BY` on a sorted source, not just TopK. - `parquet/opener.rs` drops the `find_dynamic_filter` / `find_column_in_expr` walk under AND/wrappers and the `df.sort_options()` extraction. Both reorder direction and reverse-pass activation come from the source config alone. `is_descending` collapses to just `prepared.reverse_row_groups`. - `ParquetSource::extract_topk_sort_info` (used by `reorder_files` in the shared work queue) reads `sort_order_for_reorder` directly, no dynamic-filter fallback. - `DynamicFilterPhysicalExpr` loses its `sort_options` field, `fetch` field, `new_with_sort_options` constructor, and both getters. The proto round-trip carve-out (`from_parts` defaulting these to None) goes away. - `SortExec::create_filter` reverts to `DynamicFilterPhysicalExpr::new`. `SortExec::with_fetch` reverts to building the filter from `self.create_filter()` before assigning `new_sort.fetch = fetch` — the ordering coupling introduced earlier is no longer needed because `create_filter` no longer reads `fetch`. Tests: - Two existing pushdown_sort tests (`test_no_pushdown_for_non_reverse_sort`, `test_no_prefix_match_longer_than_source`) renamed and re-snapshotted to reflect the new `Inexact` outcome — pushdown now succeeds even when neither natural nor reversed ordering matches, as long as the sort column is in the file schema. The surrounding `SortExec` still stays in place; only the source picks up the reorder hint. - Four new unit tests in `source.rs` for the new branch: - column in schema + ASC -> Inexact, sort_order_for_reorder set, reverse_row_groups not set - column in schema + DESC -> Inexact, both set - non-Column sort expression (e.g. `a + 1`) -> Unsupported - column not in file schema -> Unsupported Test results: datasource-parquet (116 + 4 passed), physical-plan sorts (60 passed), core_integration physical_optimizer (454 passed). clippy --all-targets -D warnings clean. cargo fmt clean.

…ORDER BY ... DESC LIMIT` shapes After routing the stats-based RG reorder through `try_pushdown_sort` (commit before this), the new `Inexact` branch trips whenever the requested sort column is in the file schema — not only when the dynamic filter path could derive sort direction. As a result, more `ORDER BY col DESC LIMIT N` queries on parquet sources without a declared natural ordering now emit `reverse_row_groups=true` on the `DataSourceExec` line. This is the behavioural change Adrian's review intended: the optimisation applies to any `ORDER BY` shape that has a column-in-schema basis for the reorder, not just TopK queries with a dynamic filter. The plan results are unchanged; only the `reverse_row_groups` decorator appears in more EXPLAIN outputs. Regenerated with `cargo test --test sqllogictests -- ... --complete`, then verified the suites pass without --complete.

Copilot AI review requested due to automatic review settings April 30, 2026 09:23

github-actions Bot added physical-expr Changes to the physical-expr crates sqllogictest SQL Logic Tests (.slt) datasource Changes to the datasource crate physical-plan Changes to the physical-plan crate labels Apr 30, 2026

Copilot started reviewing on behalf of zhuqi-lucas April 30, 2026 09:24 View session

zhuqi-lucas force-pushed the feat/rg-reorder-by-statistics branch from 864a3d3 to 6e56cae Compare April 30, 2026 09:28

Copilot AI reviewed Apr 30, 2026

View reviewed changes

zhuqi-lucas force-pushed the feat/rg-reorder-by-statistics branch from 6e56cae to f0f4058 Compare April 30, 2026 09:34

zhuqi-lucas marked this pull request as draft April 30, 2026 09:36

zhuqi-lucas force-pushed the feat/rg-reorder-by-statistics branch 3 times, most recently from 587297d to 235c4e1 Compare April 30, 2026 09:46

zhuqi-lucas changed the title ~~feat: reorder row groups by statistics for TopK queries~~ feat: reorder files and row groups by statistics for TopK queries Apr 30, 2026

zhuqi-lucas changed the title ~~feat: reorder files and row groups by statistics for TopK queries~~ feat: globally reorder files and row groups by statistics for TopK queries Apr 30, 2026

zhuqi-lucas marked this pull request as ready for review April 30, 2026 09:47

zhuqi-lucas mentioned this pull request Apr 30, 2026

feat: statistics-driven TopK optimization for parquet (file reorder + RG reorder + threshold init + cumulative prune) #21580

Open

zhuqi-lucas force-pushed the feat/rg-reorder-by-statistics branch from 235c4e1 to 0c9c3d8 Compare April 30, 2026 10:02

github-actions Bot added the core Core DataFusion crate label Apr 30, 2026

zhuqi-lucas force-pushed the feat/rg-reorder-by-statistics branch from 0c9c3d8 to f7d9156 Compare April 30, 2026 10:13

github-actions Bot added the auto detected api change Auto detected API change label May 7, 2026

zhuqi-lucas force-pushed the feat/rg-reorder-by-statistics branch from c8fd321 to ab09242 Compare May 7, 2026 07:55

github-actions Bot removed the auto detected api change Auto detected API change label May 7, 2026

adriangb reviewed May 14, 2026

View reviewed changes

zhuqi-lucas added 5 commits May 15, 2026 12:08

zhuqi-lucas force-pushed the feat/rg-reorder-by-statistics branch from 6987aa2 to 45d775a Compare May 15, 2026 04:17

github-actions Bot removed physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate labels May 15, 2026

		// 1. reorder_by_statistics: sort RGs by min values (ASC) to align
		// with the file's declared output ordering. This fixes out-of-order

Conversation

zhuqi-lucas commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

zhuqi-lucas commented Apr 30, 2026

Uh oh!

adriangbot commented Apr 30, 2026

Uh oh!

adriangbot commented Apr 30, 2026

Uh oh!

zhuqi-lucas commented Apr 30, 2026

Uh oh!

alamb commented May 4, 2026

Uh oh!

alamb commented May 4, 2026

Uh oh!

adriangbot commented May 4, 2026

Uh oh!

adriangbot commented May 4, 2026

Uh oh!

zhuqi-lucas commented May 5, 2026

Uh oh!

alamb commented May 6, 2026

Uh oh!

alamb commented May 6, 2026

Uh oh!

adriangbot commented May 6, 2026

Uh oh!

adriangbot commented May 6, 2026

Uh oh!

zhuqi-lucas commented May 7, 2026

Uh oh!

zhuqi-lucas commented May 7, 2026

Uh oh!

adriangbot commented May 7, 2026

Uh oh!

adriangbot commented May 7, 2026

Uh oh!

zhuqi-lucas commented May 7, 2026

Uh oh!

adriangbot commented May 7, 2026

Uh oh!

adriangbot commented May 7, 2026

Uh oh!

zhuqi-lucas commented May 7, 2026

Uh oh!

adriangbot commented May 7, 2026

Uh oh!

adriangbot commented May 7, 2026

Uh oh!

zhuqi-lucas commented May 7, 2026

Uh oh!

adriangb left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zhuqi-lucas commented Apr 30, 2026 •

edited

Loading