perf: reuse mask in truncate_list_nulls and avoid counting all true bits#22158
perf: reuse mask in truncate_list_nulls and avoid counting all true bits#22158rluvaton wants to merge 2 commits into
truncate_list_nulls and avoid counting all true bits#22158Conversation
|
run benchmark tpch tpcds |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing avoid-calc-number-of-set-bits (cc04caf) to 0c4ace8 (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing avoid-calc-number-of-set-bits (cc04caf) to 0c4ace8 (merge-base) diff using: tpcds File an issue against this benchmark runner |
| let null_and_non_empty = &!nulls.inner() & not_empty.values(); | ||
| let empty = arrow::compute::kernels::cmp::eq(&lengths, zero)?; | ||
| let valid_or_empty = empty.values() | nulls.inner(); | ||
| let valid_or_empty = BooleanArray::from(valid_or_empty); |
There was a problem hiding this comment.
I was thinking can we optimize even more and avoid allocating BooleanArray here?
There was a problem hiding this comment.
BooleanArray::has_false used below I believe has no performatic BooleanBuffer equivalent: .iter().any(|v| !v) checks bit for bit (as a future improvement, the non-null branch could be moved to BooleanBuffer). But that BooleanArray::from BooleanBuffer should be cheap anyway.
There was a problem hiding this comment.
IMO This is really premature optimization, BooleanArray allocation is super cheap
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpch — base (merge-base)
tpch — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpcds — base (merge-base)
tpcds — branch
File an issue against this benchmark runner |
Which issue does this PR close?
N/A
Rationale for this change
BooleanArraythat check if there is at least 1 set bitBooleanBufferbitwise operations and reuse maskWhat changes are included in this PR?
reused mask, and use helper to check if at least one false
Are these changes tested?
Existing tests
Are there any user-facing changes?
No
Cc @gstvg, @comphead