Replace ANY/ALL CASE planning with array_has/min/max desugaring#22102
Replace ANY/ALL CASE planning with array_has/min/max desugaring#22102cetra3 wants to merge 2 commits into
Conversation
|
FYI @buraksenn @berkaysynnada and @Jefffrey Perhaps you can help review this PR as you helped review #21743 |
Jefffrey
left a comment
There was a problem hiding this comment.
So do we have a comprehensive view of how empty haystacks/null haystacks/haystacks containing nulls/null needles look with any/all and all supported operators with this PR?
I've lost track a bit of how the behaviour has evolved over the PRs:
So I want to ensure we have a clear understanding of the final behaviour we're agreeing on, since this PR is fixing the any = behaviour to what it previously was and hopefully aligning the other operators (and all) to similar behaviour it seems?
|
So I am basing this PR relative to the latest tagged release, which is For this one shape, this PR adjusts the behaviour back to But also, in this PR, I have tried to make simple rules about what expressions get desugared to. Essentially all non-nullable behaviour matches semantics like you'd expect, it's just weird edge cases around some expressions with null values that diverge. I'm open to expanding/adjusting this, as long as it doesn't impact the existing use case ( Here's some example desugaring:
Cardinality CheckThe cardinality check is there to deal with empty haystacks and ensuring we return a boolean true/false rather than null. If we desugared to Here's a table:
PostgreSQL DivergenceIt's when you start mixing in null values to needles and haystacks that things diverge from other systems. Each of these functions treat
|
8b46e1e to
1753037
Compare
Which issue does this PR close?
= ANY([literals])#22073Rationale for this change
This partially reverts the changes in PR #21743 but keeps the cardinality when desugaring to
array_minandarray_maxvalues.This aligns more with the outputs from the existing datafusion functions, rather than going down the path of having full on PostgreSQL null semantics.
What changes are included in this PR?
Adjusts how we desugar certain queries such as
> ANYetc.. rather than using a full chain, we use a simplified version that just checks the cardinality first and combines witharray_min/array_maxoperatorsI.e,
Desugars to:
Which get simplified to:
Are these changes tested?
Yes they are tested
Are there any user-facing changes?
Yes, there are some changes to the output of some queries.
However these changes were not shipped as part of
53.1.0, and are only onmain