
perf: skip_stats backport #15

Closed
radustoenescu wants to merge 2 commits into hstack:main from radustoenescu:skip-stats-rebased

Conversation

@radustoenescu
Collaborator

@radustoenescu radustoenescu commented Apr 9, 2026

Summary

  • perf: backport skip_stats behavior from delta-kernel 0.20 to skip stats_parsed checkpoint deserialization for wide-table snapshot speedups (with documented trade-off that limit-based file pruning becomes a no-op when stats are absent)
  • fix: harden read_adds_size against overflow/invalid values by removing panic paths and clamping negative sums before usize cast
  • cleanup: remove unused import(s)
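The trade-off in the first bullet can be illustrated with a minimal sketch. It assumes limit-based pruning works by accumulating per-file row counts until the limit is covered; `FileEntry`, `num_records`, and `prune_by_limit` are hypothetical names for illustration, not the delta-rs or delta-kernel API.

```rust
/// Hypothetical stand-in for an add-file entry. `num_records` is `None`
/// when stats were not deserialized (the skip_stats case).
#[derive(Debug, Clone, PartialEq)]
struct FileEntry {
    path: String,
    num_records: Option<u64>,
}

/// Keep files until their known row counts cover `limit`.
/// A file without a row count cannot advance the counter, so when every
/// file lacks stats the loop never stops early and nothing is pruned.
fn prune_by_limit(files: &[FileEntry], limit: u64) -> Vec<FileEntry> {
    let mut kept = Vec::new();
    let mut rows_seen: u64 = 0;
    for file in files {
        if rows_seen >= limit {
            break; // enough rows are already guaranteed by earlier files
        }
        kept.push(file.clone());
        if let Some(n) = file.num_records {
            rows_seen = rows_seen.saturating_add(n);
        }
    }
    kept
}
```

Under these assumptions, three files of 10 rows each and a limit of 15 prune down to two files when stats are present, while the same three files with `num_records: None` are all kept.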

aditanase and others added 2 commits April 9, 2026 15:34
Signed-off-by: Adrian Tanase <atanase@adobe.com>
…size

- Remove unused StructArray import
- Replace unwrap() with ok().flatten() on sum_array_checked to avoid
  panic on arithmetic overflow
- Clamp negative i64 to 0 before casting to usize
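The hardening described in this commit message can be sketched with the standard library alone. This is an illustration of the `ok().flatten()` and clamp steps, not the PR's actual code: `checked_sum` is a hypothetical stand-in assumed to mirror the shape of arrow's `sum_array_checked` (an error on overflow, `Ok(None)` when there is nothing to sum), and `total_size` is a hypothetical name.

```rust
/// Hypothetical stand-in for a checked sum over a nullable Int64 column:
/// `Err` on arithmetic overflow, `Ok(None)` if there are no non-null values.
fn checked_sum(values: &[Option<i64>]) -> Result<Option<i64>, String> {
    let mut total: Option<i64> = None;
    for v in values.iter().flatten() {
        let next = total
            .unwrap_or(0)
            .checked_add(*v)
            .ok_or_else(|| "arithmetic overflow".to_string())?;
        total = Some(next);
    }
    Ok(total)
}

/// Panic-free total: overflow and "no values" both collapse to 0, and a
/// negative sum is clamped to 0 before the cast to `usize`.
fn total_size(values: &[Option<i64>]) -> usize {
    checked_sum(values)
        .ok()       // Result<Option<i64>, _> -> Option<Option<i64>>
        .flatten()  // -> Option<i64>
        .map(|sum| sum.max(0) as usize)
        .unwrap_or(0)
}
```

One consequence of this design is that an overflowing sum is reported as 0 rather than as an error, so callers cannot distinguish "no data" from "too much data".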
@github-actions

github-actions Bot commented Apr 9, 2026

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@radustoenescu radustoenescu changed the title [HSTACK] skip_stats backport + correctness/perf fixes perf: skip_stats backport and correctness/perf improvements Apr 9, 2026
@radustoenescu radustoenescu force-pushed the skip-stats-rebased branch 3 times, most recently from ed354ef to b636501 Compare April 21, 2026 07:18
@radustoenescu radustoenescu changed the title perf: skip_stats backport and correctness/perf improvements perf: skip_stats backport Apr 21, 2026
      predicate: Option<PredicateRef>,
  ) -> SendableRBStream {
-     let scan = match self.scan_builder().with_predicate(predicate).build() {
+     let scan = match self.scan_builder().with_predicate(predicate).with_skip_stats(true).build() {
Collaborator


Wondering if we should bite the bullet and add a config option on the TableConfig so we can change this at runtime. Fine to have it default to false, but maybe some queries end up benefiting from stats.
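A runtime switch along these lines could look like the sketch below. Everything here is an assumption for illustration: `ScanSettings`, `from_options`, and the `hstack.skip_stats` key are hypothetical and do not reflect the existing TableConfig API. The default is false, as suggested above.

```rust
use std::collections::HashMap;

/// Hypothetical scan-level settings; not part of delta-rs.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
struct ScanSettings {
    /// When true, stats deserialization is skipped. Defaults to false.
    skip_stats: bool,
}

impl ScanSettings {
    /// Read the flag from string-valued table options.
    /// The key name "hstack.skip_stats" is an assumption for this sketch.
    /// Missing or unrecognized values fall back to the default (false).
    fn from_options(options: &HashMap<String, String>) -> Self {
        let skip_stats = options
            .get("hstack.skip_stats")
            .map(|v| v.trim().eq_ignore_ascii_case("true"))
            .unwrap_or(false);
        Self { skip_stats }
    }
}
```

The call site could then pass `settings.skip_stats` to `with_skip_stats` instead of the hard-coded `true`, so queries that benefit from stats can opt back in.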

  fn read_adds_size(array: &dyn ProvidesColumnByName) -> usize {
      if let Some(size) = ex::extract_and_cast_opt::<Int64Array>(array, "size") {
-         sum_array_checked::<arrow::array::types::Int64Type, _>(size).unwrap().unwrap_or_default() as usize
+         sum_array_checked::<arrow::array::types::Int64Type, _>(size)
Collaborator


Please find the originating commit and try a fixup on it, then force-push once this PR gets in. That will make the rebase process simpler.
