Support DISTINCT ON with aggregation and windows #22169

Open
kumarUjjawal wants to merge 1 commit into apache:main from kumarUjjawal:feat/distinct_on

Conversation

@kumarUjjawal
Contributor

Which issue does this PR close?

Rationale for this change

DataFusion currently rejects DISTINCT ON queries when they are combined with GROUP BY, aggregate functions, or window functions.

PostgreSQL allows these queries. The planner already builds the aggregate and window plan before applying DISTINCT ON, but the old DISTINCT ON path only worked against the pre-aggregation input. That meant expressions that depended on aggregate or window output could not be planned.
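To make the ordering concrete, here is a minimal Python sketch of the semantics described above: aggregate first, then apply DISTINCT ON to the aggregate output. It is not DataFusion code. The `sales` table, its columns, and the helper names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical query being modelled (table and column names are assumptions):
#   SELECT DISTINCT ON (region) region, product, SUM(amount) AS total
#   FROM sales
#   GROUP BY region, product
#   ORDER BY region, SUM(amount) DESC;

def aggregate(rows):
    """GROUP BY region, product and compute SUM(amount)."""
    totals = defaultdict(int)
    for region, product, amount in rows:
        totals[(region, product)] += amount
    return [(region, product, total) for (region, product), total in totals.items()]

def distinct_on_region(agg_rows):
    """Keep the first row per region after ORDER BY region, total DESC."""
    ordered = sorted(agg_rows, key=lambda r: (r[0], -r[2]))
    seen = set()
    kept = []
    for row in ordered:
        if row[0] not in seen:
            seen.add(row[0])
            kept.append(row)
    return kept

sales = [
    ("east", "apple", 5),
    ("east", "pear", 7),
    ("east", "apple", 4),
    ("west", "pear", 3),
    ("west", "apple", 1),
]

# DISTINCT ON runs over the aggregated rows, not over the raw input.
result = distinct_on_region(aggregate(sales))
print(result)  # [('east', 'apple', 9), ('west', 'pear', 3)]
```

The ordering key depends on `total`, which only exists after aggregation. That is why a DISTINCT ON step that sees only the pre-aggregation input cannot evaluate it.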

What changes are included in this PR?

This PR updates DISTINCT ON planning so its expressions participate in the same aggregate and window rewrite pipeline as SELECT, HAVING, QUALIFY, and ORDER BY.

It also keeps hidden DISTINCT ON keys and ORDER BY tie-breakers in scope before the final projection, so valid PostgreSQL-style queries work even when those expressions are not in the select list.
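The idea of keeping hidden keys in scope can be sketched as follows. This is an illustrative Python model under stated assumptions, not the planner's implementation: the function `plan_distinct_on`, the row layout, and the column names are all hypothetical. The rows carry every column that DISTINCT ON and ORDER BY need, and the final projection drops the hidden ones only after deduplication.

```python
def plan_distinct_on(rows, distinct_keys, order_by, select_list):
    """Sort, keep the first row per DISTINCT ON key, then project.

    order_by is a list of (column, descending) pairs. Hidden columns
    (DISTINCT ON keys and tie-breakers) stay available until the last step.
    """
    ordered = list(rows)
    # Stable multi-pass sort: apply the least significant key first.
    for column, descending in reversed(order_by):
        ordered.sort(key=lambda r, c=column: r[c], reverse=descending)

    seen = set()
    kept = []
    for row in ordered:
        key = tuple(row[k] for k in distinct_keys)
        if key not in seen:
            seen.add(key)
            kept.append(row)

    # Final projection: only now are the hidden columns dropped.
    return [{c: row[c] for c in select_list} for row in kept]

# Hypothetical aggregate output; region and product are not selected.
agg_rows = [
    {"region": "east", "product": "apple", "total": 9, "orders": 4},
    {"region": "east", "product": "pear", "total": 7, "orders": 1},
    {"region": "west", "product": "pear", "total": 3, "orders": 5},
    {"region": "west", "product": "apple", "total": 3, "orders": 2},
]

result = plan_distinct_on(
    agg_rows,
    distinct_keys=["region"],
    order_by=[("region", False), ("total", True), ("product", False)],
    select_list=["total", "orders"],
)
print(result)  # [{'total': 9, 'orders': 4}, {'total': 3, 'orders': 2}]
```

In the `west` group both rows tie on `total`, so the hidden `product` tie-breaker decides which row survives. Projecting before deduplication would lose that information.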

The change also handles SELECT alias resolution for DISTINCT ON and ORDER BY in the PostgreSQL-compatible way: a bare alias can resolve to the select expression, while the same name inside a larger expression still resolves as an input column.
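The alias rule can be illustrated with a small Python sketch. It assumes a toy expression representation (a string for a bare identifier, a tuple for a compound expression), and `resolve_alias` is a hypothetical helper rather than a DataFusion function.

```python
def resolve_alias(expr, select_aliases):
    """Resolve a DISTINCT ON / ORDER BY expression against SELECT aliases.

    A bare identifier that matches an alias becomes the aliased select
    expression. Anything else is returned unchanged, so identifiers nested
    inside it keep referring to input columns.
    """
    if isinstance(expr, str) and expr in select_aliases:
        return select_aliases[expr]
    return expr

# Hypothetical query: SELECT -x AS x FROM t ORDER BY ...
select_aliases = {"x": ("neg", "x")}

# ORDER BY x      -> the select expression -x
bare = resolve_alias("x", select_aliases)

# ORDER BY x + 1  -> input column x plus 1, alias not applied
nested = resolve_alias(("add", "x", 1), select_aliases)

print(bare)    # ('neg', 'x')
print(nested)  # ('add', 'x', 1)
```

The sketch deliberately does not recurse into compound expressions. That lack of recursion is the whole rule: only a standalone name is eligible for alias substitution.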

Are these changes tested?

Yes

Are there any user-facing changes?

No public API change.

@github-actions bot added the labels sql (SQL Planner) and sqllogictest (SQL Logic Tests (.slt)) on May 14, 2026
Development

Successfully merging this pull request may close these issues.

DISTINCT ON on queries with aggregations?