Compact more aggressively in TopK based upon memory usage#20381
Conversation
|
run benchmarks |
|
Hi @cetra3 , |
|
run benchmarks |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (d805436) to a737c27 (merge-base) diff using: clickbench_partitioned File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (d805436) to a737c27 (merge-base) diff using: tpcds File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (d805436) to a737c27 (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpcds — base (merge-base)
tpcds — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usageclickbench_partitioned — base (merge-base)
clickbench_partitioned — branch
File an issue against this benchmark runner |
|
Benchmark for this request hit the 7200s job deadline before finishing. Benchmarks requested: Kubernetes messageFile an issue against this benchmark runner |
|
@bharath-techie I had replied to a comment about simplifying this further:
If you're OK with this I will do the change |
|
@adriangb @Dandandan @alamb @kosiew sorry for the spam, this PR helps a lot in |
|
I've updated to |
|
run benchmarks |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (dff9381) to 8741a77 (merge-base) diff using: clickbench_partitioned File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (dff9381) to 8741a77 (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (dff9381) to 8741a77 (merge-base) diff using: tpcds File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpch — base (merge-base)
tpch — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usageclickbench_partitioned — base (merge-base)
clickbench_partitioned — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpcds — base (merge-base)
tpcds — branch
File an issue against this benchmark runner |
I don't see a broad runtime win in these benchmark suites; the aggregate timings are essentially neutral/mixed. |
|
run benchmarks env:
DATAFUSION_RUNTIME_MEMORY_LIMIT: '4Gi' |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (dff9381) to 8741a77 (merge-base) diff using: clickbench_partitioned File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (dff9381) to 8741a77 (merge-base) diff using: tpcds File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (dff9381) to 8741a77 (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpch — base (merge-base)
tpch — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpcds — base (merge-base)
tpcds — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usageclickbench_partitioned — base (merge-base)
clickbench_partitioned — branch
File an issue against this benchmark runner |
|
run benchmarks env:
DATAFUSION_RUNTIME_MEMORY_LIMIT: '4G' |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (1ab6aea) to 7f2f78d (merge-base) diff using: clickbench_partitioned File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (1ab6aea) to 7f2f78d (merge-base) diff using: tpcds File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing topk_memory_batch (1ab6aea) to 7f2f78d (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpch — base (merge-base)
tpch — branch
File an issue against this benchmark runner |
|
Benchmark for this request hit the 7200s job deadline before finishing. Benchmarks requested: Kubernetes messageFile an issue against this benchmark runner |
Which issue does this PR close?
Sort of addresses #19386
Rationale for this change
Compaction in TopK values currently uses some hard set heuristics to decide when to compact. Instead we can use the memory size of the batches as a bound.
What changes are included in this PR?
Adjusts TopK compaction to compact more aggressively, based upon memory size.
Are these changes tested?
Yes and a test has been added.
Are there any user-facing changes?
No
Benchmarks
I'm struggling (in this PR and other PRs...) to get some good reliable benchmarks through.
However with this one it seems like it's mostly the same speed as
mainor a little faster in some cases: