test: worker correctness tests + hand-crafted engine vs ArroyoSketch comparison#232

Merged
milindsrivastava1997 merged 6 commits into main from pr/E-correctness-tests-comparison
Apr 7, 2026

Conversation

Contributor

@zzylol zzylol commented Mar 25, 2026

Summary

  • Adds CapturingOutputSink test utility that captures precomputed outputs in-memory
  • Adds worker correctness unit tests covering Sum and KLL accumulator pipelines
  • Adds a comparison test asserting that the hand-crafted precompute engine produces the same output as the ArroyoSketch accumulators for Sum and KLL (the key parity test between the two implementations)
  • Adds test that loads aggregation config from a real streaming_config YAML
  • Expands design doc with sections on sliding window behaviour, accumulator lifecycle, load balancing trade-offs, and fault tolerance
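
For readers unfamiliar with the pattern, a capturing sink can be sketched roughly as below. This is a minimal illustration and not the code added in this PR: the `OutputSink` trait, the `WindowOutput` struct, and every field and method name are assumptions made for the example, and the real `CapturingOutputSink` may have a different shape.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for one precomputed window result.
#[derive(Debug, Clone, PartialEq)]
pub struct WindowOutput {
    pub key: String,
    pub window_start_ms: u64,
    pub window_end_ms: u64,
    pub value: f64,
}

/// Hypothetical sink trait; the engine's real trait may differ.
pub trait OutputSink: Send + Sync {
    fn emit(&self, output: WindowOutput);
}

/// Test-only sink that stores every emitted output in memory.
/// Clones share the same buffer, so a test can hand one clone to the
/// worker and keep another to inspect what was emitted.
#[derive(Clone, Default)]
pub struct CapturingSink {
    captured: Arc<Mutex<Vec<WindowOutput>>>,
}

impl CapturingSink {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns a copy of everything captured so far, in emission order.
    pub fn snapshot(&self) -> Vec<WindowOutput> {
        self.captured.lock().unwrap().clone()
    }
}

impl OutputSink for CapturingSink {
    fn emit(&self, output: WindowOutput) {
        self.captured.lock().unwrap().push(output);
    }
}
```

Sharing the buffer through `Arc<Mutex<..>>` is one reasonable choice here because worker code typically takes ownership of its sink, while the test still needs a handle to read the results afterwards.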

This is PR E of 6 stacked PRs splitting #162.

Stacking order:

  • PR A → main: Core engine (merge first)
  • PR B → PR A: E2E tests + sliding window fix
  • PR C → PR B: Multi-connector ingest + pane-based sliding window
  • PR D → PR C: Benchmarks + scalability tests
  • PR E (this) → PR D: Correctness tests + hand-crafted vs ArroyoSketch comparison
  • PR F → PR E: Refactoring + benchmarking binary

Test plan

  • cargo test -p query_engine_rust precompute passes all worker tests
  • Comparison test (test_precompute_engine_matches_arroyo_sum, test_precompute_engine_matches_arroyo_kll) passes
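
The shape of such a parity check for Sum might look like the sketch below. It is illustrative only: `ReferenceSum`, `engine_window_sums`, `reference_window_sums`, and `outputs_match` are hypothetical names, tumbling windows are assumed for simplicity, and neither function calls the real precompute engine or the real ArroyoSketch accumulators.

```rust
use std::collections::BTreeMap;

/// Hypothetical reference accumulator standing in for the ArroyoSketch side.
#[derive(Default)]
struct ReferenceSum {
    total: f64,
}

impl ReferenceSum {
    fn update(&mut self, value: f64) {
        self.total += value;
    }

    fn result(&self) -> f64 {
        self.total
    }
}

/// Hypothetical stand-in for the hand-crafted engine: a single pass that
/// adds each (timestamp_ms, value) event into its tumbling window.
fn engine_window_sums(events: &[(u64, f64)], window_ms: u64) -> BTreeMap<u64, f64> {
    let mut out = BTreeMap::new();
    for &(ts, value) in events {
        let window_start = ts - ts % window_ms;
        *out.entry(window_start).or_insert(0.0) += value;
    }
    out
}

/// Reference path: one accumulator per window, finalized at the end.
fn reference_window_sums(events: &[(u64, f64)], window_ms: u64) -> BTreeMap<u64, f64> {
    let mut accs: BTreeMap<u64, ReferenceSum> = BTreeMap::new();
    for &(ts, value) in events {
        accs.entry(ts / window_ms * window_ms).or_default().update(value);
    }
    accs.into_iter().map(|(start, acc)| (start, acc.result())).collect()
}

/// Two outputs match when they cover the same windows and every per-window
/// value agrees within `eps`.
fn outputs_match(a: &BTreeMap<u64, f64>, b: &BTreeMap<u64, f64>, eps: f64) -> bool {
    a.len() == b.len()
        && a.iter()
            .zip(b.iter())
            .all(|((ka, va), (kb, vb))| ka == kb && (va - vb).abs() <= eps)
}
```

Comparing with a tolerance rather than exact equality is a deliberate choice in this sketch, since two implementations may add floating-point values in a different order.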

🤖 Generated with Claude Code

@zzylol zzylol force-pushed the pr/D-benchmarks-scalability branch from 4c740c5 to a145bd7 Compare March 25, 2026 11:32
@zzylol zzylol force-pushed the pr/E-correctness-tests-comparison branch from 8c04bc5 to 47d279e Compare March 25, 2026 11:32
@zzylol zzylol force-pushed the pr/D-benchmarks-scalability branch from a145bd7 to 1648f1e Compare March 31, 2026 20:30
@zzylol zzylol changed the title test: worker correctness tests + hand-crafted engine vs ArrovoSketch comparison test: worker correctness tests + hand-crafted engine vs ArroyoSketch comparison Apr 1, 2026
Base automatically changed from pr/D-benchmarks-scalability to main April 6, 2026 20:22
@zzylol zzylol force-pushed the pr/E-correctness-tests-comparison branch from 60d90f6 to 443c7e0 Compare April 6, 2026 21:48
…comparison

- Add CapturingOutputSink test utility for capturing precomputed outputs in-memory
- Add worker correctness unit tests covering Sum and KLL accumulator pipelines
- Add comparison test asserting precompute engine matches ArroyoSketch results
- Add test that loads aggregation config from real streaming_config YAML
- Expand design doc with sliding window behaviour, accumulator lifecycle,
  load balancing trade-offs, and fault tolerance sections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@zzylol zzylol force-pushed the pr/E-correctness-tests-comparison branch from 443c7e0 to 31906ef Compare April 6, 2026 21:49
zz_y and others added 5 commits April 6, 2026 16:51
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Revert broken streaming_engine conditional in precomputed_output.rs
  (variable doesn't exist in scope)
- Replace .inner.sketch.get_n() with .inner.count() in KLL comparison
  test (KllSketch no longer has a .sketch field)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- create_precompute_from_bytes is only called from the arroyo
  deserialization path, so use deserialize_from_bytes_arroyo
  for MultipleSumAccumulator
- Inline short assert_eq for KLL count comparison

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…uild

The sketchlib-rust repo (ProjectASAP/sketchlib-rust) renamed its
package from "sketchlib-rust" to "asap_sketchlib" on main. Without
a pinned rev, Cargo fails to find the package by name. Pin to the
same rev (4404274) used by asap-query-engine/Cargo.toml where the
package is still named "sketchlib-rust".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@milindsrivastava1997 milindsrivastava1997 merged commit a3894c1 into main Apr 7, 2026
15 checks passed
@milindsrivastava1997 milindsrivastava1997 deleted the pr/E-correctness-tests-comparison branch April 7, 2026 01:06
2 participants