Skip to content

fix: preserve Inexact precision in Statistics#22146

Open
timsaucer wants to merge 2 commits into
apache:mainfrom
timsaucer:fix/stats-with-fetch-inexact-zero
Open

fix: preserve Inexact precision in Statistics#22146
timsaucer wants to merge 2 commits into
apache:mainfrom
timsaucer:fix/stats-with-fetch-inexact-zero

Conversation

@timsaucer
Copy link
Copy Markdown
Member

@timsaucer timsaucer commented May 13, 2026

Which issue does this PR close?

No issue reported

Rationale for this change

Statistics::with_fetch unconditionally returned Precision::Exact(0) when the input had nr <= skip, even when the input was Inexact(nr) — promoting an estimated upper bound into an exact zero. The exactness flag then misleads downstream consumers (notably AggregateStatistics via Count::value_from_stats) into trusting a derived "0" and folding the count subtree to a literal.

Concrete user-visible symptom reported on TPC-H Q22:

let df = ctx.sql(q22_sql).await?;
df.clone().show().await?;   // prints 7 rows (correct)
df.count().await?;          // returns 0 (wrong)

EXPLAIN for the count plan shows the outer count aggregate collapsed to ProjectionExec([lit(0)]) -> PlaceholderRowExec.

After PR #21240 left uncorrelated scalar subqueries in the filter rather than rewriting them to joins, FilterExec can't use interval analysis on ScalarSubqueryExpr, falls back to the 20% default selectivity, and produces a small Inexact row estimate. A LeftAnti join whose estimated semi-overlap covers the outer estimate then yields Inexact(0). That zero propagates through grouped aggregates whose estimate_num_rows returns the child stats unchanged when value == 0. The pre-existing with_fetch bug on a downstream SortExec finally promotes it to Exact(0), which AggregateStatistics trusts.

The root cause is the precision promotion in with_fetch. The PR fixes that; the surrounding plan-shape changes after #21240 just made it reachable.

What changes are included in this PR?

  • Statistics::with_fetch: when nr <= skip, preserve the exactness of the input via check_num_rows(Some(0), self.num_rows.is_exact().unwrap()) instead of always returning Exact(0).
  • datafusion-common: new unit test test_with_fetch_skip_all_rows_inexact pinning the new behaviour.
  • datafusion-physical-plan: update the existing test_row_number_statistics_for_global_limit expectation that encoded the old (incorrect) promotion to expect Inexact(0) now.
  • datafusion/sqllogictest/test_files/subquery.slt: SLT regression test reproducing the user-visible count(*) symptom over a query that contains a scalar subquery, not exists, and a group-by on a derived column, backed by parquet sources so the data sources report Exact statistics.

Are these changes tested?

Yes:

  • Unit test in datafusion-common for the precision-preserving behaviour.
  • Updated unit test in datafusion-physical-plan/limit.rs.
  • SLT regression test in subquery.slt that fails without the fix (count(*) returns 0 instead of 2) and passes with it.

Are there any user-facing changes?

Only the bug fix. No public API changes.

…skip

When `Statistics::with_fetch` encountered an input `Inexact(nr)` with
`nr <= skip`, it unconditionally returned `Precision::Exact(0)`,
promoting an estimated upper bound into an exact zero.

This silently caused `AggregateStatistics` to fold an outer
`count(*)` over a sub-tree whose row estimate happened to reach zero
into a literal `0` backed by `PlaceholderRowExec`. With apache#21240 leaving
uncorrelated scalar subqueries in the filter, `FilterExec` falls back
to the default 20% selectivity (the interval analyzer cannot reason
about `ScalarSubqueryExpr`) and `LeftAnti` cardinality estimation can
produce `Inexact(0)`, which `SortExec`/`with_fetch` then turned into
`Exact(0)` -- and `Count::value_from_stats` trusted it.

Preserve the exactness of the input estimate so estimated zeros stay
inexact and downstream consumers can no longer skip real execution.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added sqllogictest SQL Logic Tests (.slt) common Related to common crate physical-plan Changes to the physical-plan crate labels May 13, 2026
@timsaucer timsaucer changed the title fix: preserve Inexact precision in Statistics::with_fetch when nr <= skip fix: preserve Inexact precision in Statistics May 13, 2026
@timsaucer
Copy link
Copy Markdown
Member Author

@neilconway would you mind reviewing? It wasn't caused by #21240 but it was exposed by it.

Copy link
Copy Markdown
Contributor

@neilconway neilconway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me! Nice work on the writeup and the test cases.

Comment thread datafusion/common/src/stats.rs Outdated
@timsaucer timsaucer requested a review from mbutrovich May 13, 2026 16:29
Copy link
Copy Markdown
Member

@asolimando asolimando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, the promotion to Exact was indeed incorrect in this case, thanks for catching that and proposing a fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

common Related to common crate physical-plan Changes to the physical-plan crate sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants