Open
Labels
enhancement (New feature or request)
Description
Is your feature request related to a problem or challenge?
While working on #11250, I noticed that the current handling of operators is scattered and complex. This is the root cause of operator diagnostic messages being less user-friendly than those of scalar functions, and it may also create maintenance hurdles.
For example:
- For `SELECT 2.0 << 3.5;`, constant folding occurs during the optimization phase, and the type mismatch is caught in `evaluate_with_resolved_args`.
- But for `SELECT 1 + 'a';`, the error is caught during `projection_schema` in the LogicalPlan phase.
- And for `SELECT a << b FROM ...;`, the error is not surfaced until the physical execution phase, in `evaluate_expressions_to_arrays_with_metrics`.
Describe the solution you'd like
I propose that we gradually refactor operator handling by bringing it under the same framework as scalar functions.
My initial thinking on the roadmap is as follows:
- Start by defining signatures for simple binary operators. This allows us to reuse the existing function-based type coercion logic during LogicalPlan generation.
- Implement rewrite rules to gradually transform these operators into function-based calls.
- Extend this to more unary and binary operations.
- Finally, address operators like LIKE, which have unique syntax or optimization paths compared to standard binary operators.
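To make the first two steps concrete, here is a minimal, self-contained sketch of the idea. It is not DataFusion code: `DataType`, `Operator`, `Expr`, `accepted_types`, `function_name`, `resolve`, and `rewrite` are all hypothetical, simplified stand-ins, and the function names `add` and `shift_left` are made up for illustration. Each operator declares the operand types it accepts, one coercion routine resolves them at planning time, and a rewrite turns the operator node into a function-call node.

```rust
// Hypothetical, simplified stand-ins; none of these are DataFusion types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DataType { Int64, Float64, Utf8 }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operator { Plus, ShiftLeft }

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(DataType), // only the type matters for this sketch
    Column(&'static str, DataType),
    BinaryOp { left: Box<Expr>, op: Operator, right: Box<Expr> },
    ScalarFunction { name: &'static str, args: Vec<Expr>, return_type: DataType },
}

/// Step 1: a "signature" per operator, i.e. the operand types it accepts,
/// in order of preference. Both operands share one type in this toy model.
fn accepted_types(op: Operator) -> &'static [DataType] {
    match op {
        Operator::Plus => &[DataType::Int64, DataType::Float64],
        Operator::ShiftLeft => &[DataType::Int64],
    }
}

/// Illustrative function names the operators are rewritten to.
fn function_name(op: Operator) -> &'static str {
    match op {
        Operator::Plus => "add",
        Operator::ShiftLeft => "shift_left",
    }
}

/// Toy coercion rule: identical types, or widening Int64 to Float64.
fn can_coerce(from: DataType, to: DataType) -> bool {
    from == to || (from == DataType::Int64 && to == DataType::Float64)
}

/// The single type check: pick the first accepted type that both operands
/// coerce to, or report a function-style error.
fn resolve(op: Operator, lhs: DataType, rhs: DataType) -> Result<DataType, String> {
    accepted_types(op)
        .iter()
        .copied()
        .find(|&t| can_coerce(lhs, t) && can_coerce(rhs, t))
        .ok_or_else(|| {
            format!(
                "no function matches {}({:?}, {:?}); accepted operand types: {:?}",
                function_name(op), lhs, rhs, accepted_types(op)
            )
        })
}

fn type_of(expr: &Expr) -> Result<DataType, String> {
    match expr {
        Expr::Literal(t) | Expr::Column(_, t) => Ok(*t),
        Expr::ScalarFunction { return_type, .. } => Ok(*return_type),
        Expr::BinaryOp { left, op, right } => resolve(*op, type_of(left)?, type_of(right)?),
    }
}

/// Step 2: rewrite operator nodes into function-call nodes, bottom-up.
/// Arguments of existing function calls are left untouched, and no explicit
/// cast nodes are inserted, to keep the sketch short.
fn rewrite(expr: Expr) -> Result<Expr, String> {
    match expr {
        Expr::BinaryOp { left, op, right } => {
            let (left, right) = (rewrite(*left)?, rewrite(*right)?);
            let return_type = resolve(op, type_of(&left)?, type_of(&right)?)?;
            Ok(Expr::ScalarFunction {
                name: function_name(op),
                args: vec![left, right],
                return_type,
            })
        }
        other => Ok(other),
    }
}

/// Small helper for building operator nodes.
fn binary(left: Expr, op: Operator, right: Expr) -> Expr {
    Expr::BinaryOp { left: Box::new(left), op, right: Box::new(right) }
}
```

In this model, all three queries from the examples above would go through `resolve` during planning, so a literal `2.0 << 3.5`, a mismatched `1 + 'a'`, and a column expression `a << b` (assuming non-integer columns) would fail in the same place with the same style of message. How the real signatures, coercion rules, and rewrite passes should look is exactly what this proposal leaves open for discussion.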
Describe alternatives you've considered
No response
Additional context
What do the maintainers think about this direction? I would love to hear your thoughts on whether this unification aligns with the long-term vision of DataFusion, or suggestions on task decomposition and how to best phase this refactor.