-
Notifications
You must be signed in to change notification settings - Fork 1.9k
feat: optimize CASE WHEN for divide-by-zero protection pattern #19994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: optimize CASE WHEN for divide-by-zero protection pattern #19994
Conversation
267d8df to
334785f
Compare
|
run benchmark tpcds |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
Hi @andygrove , |
|
run benchmark tpcds |
|
Also FYI @pepijnve |
|
🤖 |
|
It might be useful to add a microbenchmark to https://github.com/apache/datafusion/blob/main/datafusion/physical-expr/benches/case_when.rs so we can compare before/after. I don't think any of the existing ones will cover this particular pattern. The TCP-DS queries that contain this pattern (based on |
|
🤖: Benchmark completed Details
|
- Adds a specialization for the common pattern: CASE WHEN y > 0 THEN x / y ELSE NULL END - Add EvalMethod::DivideByZeroProtection variant - Add pattern detection in find_best_eval_method() - Implement safe_divide using Arrow kernels - Handle CastExpr wrapping on divisor
334785f to
6479884
Compare
|
Microbenchmark result:
|
|
Hi @pepijnve , thanks for pointing out the need for a microbenchmark. I compared |
| } | ||
| } | ||
|
|
||
| fn benchmark_divide_by_zero_protection(c: &mut Criterion, batch_size: usize) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please pull this additional benchmark code code into its own PR we could then use our benchmarking scripts to compare performance of this PR with main?
Thank you 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| }, | ||
| ); | ||
|
|
||
| group.bench_function( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to spot the difference between this benchmark and the one just above it, they look identical to me. Yet the measurements seem to produce similar values. What causes that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @pepijnve ,
The key difference is in which column is used in the WHEN condition:
- First benchmark (
DivideByZeroProtection): checks divisor_col > 0, and since divisor_col is also the divisor in numerator / divisor_col, this matches the pattern and triggers the optimization. - Second benchmark (
ExpressionOrExpression): checks divisor_copy_col > 0, but the division uses divisor_col. Since the checked column doesn't match the divisor, the optimization is not triggered and it falls back to ExpressionOrExpression.
I'll also add a comment in the code to make this distinction clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah of course. That's really subtle. A comment for future readers is indeed a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A followup question I had, but I'm happy to measure that myself, is what's causing the performance gap? Might be interesting to see if there's something we could do to improve ExprOrExpr for this particular pattern.
|
Benchmark was added in #20076 |
## Which issue does this PR close? Related to #19994 - This PR extracts the benchmark code to allow performance comparison. ## Rationale for this change As pointed out by @alamb in #19994, this separates the microbenchmark code so that the benchmarking scripts can compare the optimization PR against main with the benchmark already in place. ## What changes are included in this PR? Adds a microbenchmark for the divide-by-zero protection pattern in `case_when.rs`: - Benchmarks with varying percentages of zeros (0%, 10%, 50%, 90%) - Compares `DivideByZeroProtection` pattern (where checked column matches divisor) vs `ExpressionOrExpression` fallback (where they don't match) ## Are these changes tested? benchmark code only. ## Are there any user-facing changes? No.
|
I merged up to get the new benchmarks so I could run them on our runner |
|
run benchmark case_when |
This comment was marked as outdated.
This comment was marked as outdated.
|
run benchmark case_when |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
Which issue does this PR close?
Rationale for this change
The
CaseExprimplementation is expensive. A common usage pattern (particularly in TPC-DS benchmarks) is to protect against divide-by-zero:This entire expression can be replaced with a simpler divide operation that returns NULL when the divisor is zero, avoiding the overhead of full CASE evaluation.
What changes are included in this PR?
Are these changes tested?
Yes, added two new tests:
Are there any user-facing changes?
No. This is an internal optimization that produces the same results but with better performance.