Commit 4e09360
feat: reorder row groups by grouping key statistics

Extends the row group reordering infrastructure (from sort pushdown)
to also reorder by GROUP BY key statistics. When an AggregateExec
sits above a ParquetSource, the new ReorderByGroupKeys optimizer rule
pushes grouping key expressions down so row groups with similar
group key values are read together.

Two levels of reordering:
- Files within partitions are sorted by grouping key min statistics
- Row groups within each file are reordered by grouping key statistics
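
The two levels above can be sketched as follows. This is a minimal, self-contained illustration and not the code from this commit: `KeyStats`, `RowGroup`, `FileEntry`, and `reorder_by_group_key` are hypothetical names, the grouping key is assumed to be a single `i64` column, and row groups are assumed to be ordered by their min statistic (the commit message does not say which statistic is used at that level). Entries without statistics are placed last.

```rust
/// Hypothetical min/max statistics for one grouping key column.
#[derive(Debug, Clone, PartialEq)]
pub struct KeyStats {
    pub min: i64,
    pub max: i64,
}

/// Hypothetical row group descriptor; `index` is its original position in the file.
#[derive(Debug, Clone, PartialEq)]
pub struct RowGroup {
    pub index: usize,
    pub stats: Option<KeyStats>,
}

/// Hypothetical file descriptor within one partition.
#[derive(Debug, Clone, PartialEq)]
pub struct FileEntry {
    pub path: String,
    pub row_groups: Vec<RowGroup>,
}

impl FileEntry {
    /// Smallest grouping key min across the row groups that carry statistics.
    pub fn key_min(&self) -> Option<i64> {
        self.row_groups
            .iter()
            .filter_map(|rg| rg.stats.as_ref().map(|s| s.min))
            .min()
    }
}

/// Sort key that orders by min value and puts entries without statistics last.
fn order_key(min: Option<i64>) -> (bool, i64) {
    match min {
        Some(v) => (false, v),
        None => (true, 0),
    }
}

/// Applies both levels of reordering in place.
pub fn reorder_by_group_key(files: &mut [FileEntry]) {
    // Level 1: files within the partition, by grouping key min statistic.
    files.sort_by_key(|f| order_key(f.key_min()));
    // Level 2: row groups within each file (assumed here: also by min).
    for file in files.iter_mut() {
        file.row_groups
            .sort_by_key(|rg| order_key(rg.stats.as_ref().map(|s| s.min)));
    }
}
```

Both sorts are stable, so files or row groups with equal min values keep their original relative order.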

Benefits:
- Reduces the active cardinality of aggregation hash tables (the number of distinct groups touched at any one time)
- Improves CPU cache locality during hash table lookups

Adds try_pushdown_groupby_order() to ExecutionPlan, DataSource, and
FileSource traits, with ParquetSource implementation that reuses the
existing reorder_by_statistics infrastructure.
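
As a rough illustration of the trait-level shape described above, the sketch below shows one way an optional pushdown hook could look. It is an assumption-laden toy, not the signature added by this commit: the diff content is not available here, so `FileSourceSketch`, `GroupByPushdown`, `CsvLikeSource`, and `ParquetLikeSource` are hypothetical stand-ins. The default "unsupported" implementation and the string-typed keys are also assumptions.

```rust
/// Hypothetical outcome of a pushdown attempt.
#[derive(Debug, PartialEq)]
pub enum GroupByPushdown {
    /// The source cannot use the grouping keys.
    Unsupported,
    /// The source will reorder its reads by these keys.
    Reordered { keys: Vec<String> },
}

/// Hypothetical stand-in for a file source trait.
pub trait FileSourceSketch {
    /// Assumed default: sources that do not override this opt out.
    fn try_pushdown_groupby_order(&mut self, _keys: &[String]) -> GroupByPushdown {
        GroupByPushdown::Unsupported
    }
}

/// A source with no statistics; it relies on the default implementation.
pub struct CsvLikeSource;

impl FileSourceSketch for CsvLikeSource {}

/// A source that has min/max statistics for some columns.
pub struct ParquetLikeSource {
    pub columns_with_stats: Vec<String>,
    /// Keys that a statistics-based reordering step would later consume.
    pub reorder_keys: Vec<String>,
}

impl FileSourceSketch for ParquetLikeSource {
    fn try_pushdown_groupby_order(&mut self, keys: &[String]) -> GroupByPushdown {
        // Keep only the grouping keys for which statistics exist.
        let usable: Vec<String> = keys
            .iter()
            .filter(|k| self.columns_with_stats.contains(*k))
            .cloned()
            .collect();
        if usable.is_empty() {
            return GroupByPushdown::Unsupported;
        }
        // Record the keys so the existing reordering path can pick them up.
        self.reorder_keys = usable.clone();
        GroupByPushdown::Reordered { keys: usable }
    }
}
```

In this sketch an optimizer rule would call the hook on the source beneath an aggregation and leave the plan unchanged whenever it returns `Unsupported`.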

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 6c143f7 commit 4e09360
2 files changed
Lines changed: 49 additions & 10 deletions
First file (name and diff content not preserved): original lines 979-985 removed; new lines 979-1023 and 1025 added.
Lines changed: 3 additions & 3 deletions
Second file (name and diff content not preserved): lines 46-48 replaced.