feat: multi-connector ingest (Prometheus + VictoriaMetrics) + pane-based sliding window #230

Merged
milindsrivastava1997 merged 20 commits into main from pr/C-multi-connector-pane-sliding-window on Apr 6, 2026

@zzylol commented Mar 25, 2026

Summary

  • Adds multi-connector ingest support: Prometheus remote write and VictoriaMetrics remote write can both feed the precompute engine simultaneously
  • Replaces naive sliding window with pane-based incremental computation — each pane is computed once and reused across overlapping windows
  • Adds design doc section on late data handling for closed windows
  • Clarifies single-machine multi-threaded architecture in design doc

This is PR C of 6 stacked PRs splitting #162

Stacking order:

  • PR A → main: Core engine (merge first)
  • PR B → PR A: E2E tests + sliding window fix (merge second)
  • PR C (this) → PR B: Multi-connector ingest + pane-based sliding window
  • PR D → PR C: Benchmarks + scalability tests
  • PR E → PR D: Correctness tests + comparison test
  • PR F → PR E: Refactoring + benchmarking binary

Test plan

  • Engine accepts Prometheus remote write traffic on configured port
  • Engine accepts VictoriaMetrics remote write traffic simultaneously
  • Pane-based sliding window produces correct aggregation results

🤖 Generated with Claude Code

@zzylol force-pushed the pr/B-e2e-tests-sliding-window-fix branch from bf87a83 to a49117e on March 25, 2026 11:32
@zzylol force-pushed the pr/C-multi-connector-pane-sliding-window branch from 07e69dc to 439dcf2 on March 25, 2026 11:32
zzylol and others added 6 commits March 31, 2026 14:32
Implements a single-node multi-threaded precompute engine as a new module
and binary target within QueryEngineRust. The engine ingests Prometheus
remote write samples, buffers them per-series with out-of-order handling,
detects closed tumbling/sliding windows via event-time watermarks, feeds
samples into accumulator wrappers for all existing sketch types, and emits
PrecomputedOutput directly to the store.

New modules: config, series_buffer, window_manager, accumulator_factory,
series_router, worker, output_sink, and the PrecomputeEngine orchestrator.
The binary supports embedded store + query HTTP server for single-process
deployment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
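
The watermark-based window close described in this commit message can be sketched as follows. This is only an illustration under stated assumptions: the `Watermark` type, its methods, and the rule "watermark = largest event time seen minus allowed lateness" are hypothetical and are not taken from the engine's `window_manager` module.

```rust
/// Hypothetical event-time watermark: the largest timestamp seen so far,
/// minus an allowed-lateness margin (an assumption, not the engine's code).
struct Watermark {
    max_event_ts_ms: Option<i64>,
    allowed_lateness_ms: i64,
}

impl Watermark {
    fn new(allowed_lateness_ms: i64) -> Self {
        Self { max_event_ts_ms: None, allowed_lateness_ms }
    }

    /// Record a sample timestamp; out-of-order samples never move the watermark back.
    fn observe(&mut self, ts_ms: i64) {
        self.max_event_ts_ms = Some(self.max_event_ts_ms.map_or(ts_ms, |m| m.max(ts_ms)));
    }

    /// Current watermark, or None before any sample has been observed.
    fn current(&self) -> Option<i64> {
        self.max_event_ts_ms.map(|m| m - self.allowed_lateness_ms)
    }

    /// A window [start, start + size) is treated as closed once the watermark
    /// has reached its end.
    fn is_closed(&self, window_start_ms: i64, window_size_ms: i64) -> bool {
        match self.current() {
            Some(wm) => window_start_ms + window_size_ms <= wm,
            None => false,
        }
    }
}
```

In this sketch a worker would call `observe` for every buffered sample and then test each open window with `is_closed` before emitting its output.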
Handle top-level aggregation types (DatasketchesKLL, Sum, Min, Max,
etc.) directly in the factory match, fixing the fallback to Sum that
broke quantile queries. Also preserve the K parameter in
KllAccumulatorUpdater::reset() instead of hardcoding 200.

Add test_e2e_precompute binary that validates the full ingest ->
precompute -> store -> query pipeline end-to-end.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
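
A minimal sketch of the two fixes named in this commit message: matching every aggregation type explicitly instead of falling back to Sum, and keeping the sketch's K parameter across a reset. The `Accumulator` enum, `make_accumulator`, and the plain `Vec<f64>` standing in for a KLL sketch are all hypothetical; the real factory and `KllAccumulatorUpdater` may be shaped quite differently.

```rust
/// Hypothetical accumulator variants. The Kll variant stores raw values as a
/// placeholder; it is NOT a real KLL sketch.
#[derive(Debug, PartialEq)]
enum Accumulator {
    Sum(f64),
    Min(f64),
    Max(f64),
    Kll { k: u16, values: Vec<f64> },
}

/// Match each aggregation type explicitly and return an error for anything
/// unknown, rather than silently falling back to Sum.
fn make_accumulator(kind: &str, k: u16) -> Result<Accumulator, String> {
    match kind {
        "Sum" => Ok(Accumulator::Sum(0.0)),
        "Min" => Ok(Accumulator::Min(f64::INFINITY)),
        "Max" => Ok(Accumulator::Max(f64::NEG_INFINITY)),
        "DatasketchesKLL" => Ok(Accumulator::Kll { k, values: Vec::new() }),
        other => Err(format!("unsupported aggregation type: {other}")),
    }
}

impl Accumulator {
    /// Reset to the empty state while keeping the configured K,
    /// instead of rebuilding the sketch with a hardcoded value.
    fn reset(&mut self) {
        let fresh = match &*self {
            Accumulator::Sum(_) => Accumulator::Sum(0.0),
            Accumulator::Min(_) => Accumulator::Min(f64::INFINITY),
            Accumulator::Max(_) => Accumulator::Max(f64::NEG_INFINITY),
            Accumulator::Kll { k, .. } => Accumulator::Kll { k: *k, values: Vec::new() },
        };
        *self = fresh;
    }
}
```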
Sends 10 series × 100 samples in a single HTTP request to the raw-mode
engine and verifies all 1000 samples land in the store. Prints client
RTT and per-series e2e_latency_us at debug level.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sends 1000 requests × 10000 samples (50 distinct series) to the
raw-mode engine and polls until all samples are stored. Reports both
send throughput and e2e throughput including drain time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…dows

Previously each sample was assigned to only one window via
window_start_for(), which is incorrect for sliding windows where
window_size > slide_interval. Added window_starts_containing() that
returns all window starts whose range covers the timestamp, and use
it in the worker aggregation loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
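
To make the fix concrete, here is a hedged sketch of what a `window_starts_containing()` helper could compute. The free-function signature and the assumption that windows start at multiples of the slide interval and cover `[start, start + window_size)` are illustrative; the actual method lives on the window manager and may differ.

```rust
/// Sketch: every window start whose range [start, start + window_size_ms)
/// covers `ts_ms`, assuming starts are aligned to multiples of `slide_ms`.
fn window_starts_containing(ts_ms: i64, window_size_ms: i64, slide_ms: i64) -> Vec<i64> {
    assert!(window_size_ms > 0 && slide_ms > 0);
    // Latest aligned start at or before the timestamp
    // (rem_euclid keeps this correct for negative timestamps).
    let mut start = ts_ms - ts_ms.rem_euclid(slide_ms);
    let mut starts = Vec::new();
    // Walk backwards one slide at a time while the window still covers ts_ms.
    while start + window_size_ms > ts_ms {
        starts.push(start);
        start -= slide_ms;
    }
    starts.reverse(); // oldest window first
    starts
}
```

With a 30 s window sliding every 10 s, a sample at 25 s falls into the windows starting at 0 s, 10 s, and 20 s. With `window_size_ms == slide_ms` (tumbling windows) the function returns exactly one start.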
@zzylol force-pushed the pr/B-e2e-tests-sliding-window-fix branch from a49117e to 685655e on March 31, 2026 20:09
zzylol and others added 12 commits March 31, 2026 15:11
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements a single-node multi-threaded precompute engine as a new module
and binary target within QueryEngineRust. The engine ingests Prometheus
remote write samples, buffers them per-series with out-of-order handling,
detects closed tumbling/sliding windows via event-time watermarks, feeds
samples into accumulator wrappers for all existing sketch types, and emits
PrecomputedOutput directly to the store.

New modules: config, series_buffer, window_manager, accumulator_factory,
series_router, worker, output_sink, and the PrecomputeEngine orchestrator.
The binary supports embedded store + query HTTP server for single-process
deployment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Handle top-level aggregation types (DatasketchesKLL, Sum, Min, Max,
etc.) directly in the factory match, fixing the fallback to Sum that
broke quantile queries. Also preserve the K parameter in
KllAccumulatorUpdater::reset() instead of hardcoding 200.

Add test_e2e_precompute binary that validates the full ingest ->
precompute -> store -> query pipeline end-to-end.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n doc

Fix ghost accumulator bug where late samples passing the allowed_lateness
check could create orphaned accumulators for already-closed windows. Add
configurable LateDataPolicy (Drop/ForwardToStore) to control behavior.
Drop prevents ghost windows; ForwardToStore emits mini-accumulators for
query-time merge. Also add precompute engine design document.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
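
The ghost-accumulator fix can be illustrated with a small sketch. Only the `LateDataPolicy` name and its `Drop`/`ForwardToStore` variants come from the commit message; the `Worker` struct, the `Outcome` enum, the tumbling-window simplification, and the `closed_before_ms` bookkeeping are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LateDataPolicy {
    Drop,
    ForwardToStore,
}

/// Hypothetical result of routing one sample.
#[derive(Debug, PartialEq)]
enum Outcome {
    Accumulated,
    Dropped,
    /// A one-sample "mini-accumulator" to be merged at query time.
    Forwarded { window_start_ms: i64, value: f64 },
}

/// Hypothetical worker over tumbling windows with a running sum per window.
struct Worker {
    window_size_ms: i64,
    policy: LateDataPolicy,
    closed_before_ms: i64, // windows starting before this are already closed
    open_windows: BTreeMap<i64, f64>,
}

impl Worker {
    fn new(window_size_ms: i64, policy: LateDataPolicy) -> Self {
        Self { window_size_ms, policy, closed_before_ms: i64::MIN, open_windows: BTreeMap::new() }
    }

    fn on_sample(&mut self, ts_ms: i64, value: f64) -> Outcome {
        let window_start_ms = ts_ms - ts_ms.rem_euclid(self.window_size_ms);
        if window_start_ms < self.closed_before_ms {
            // Never re-create an accumulator for a closed window.
            return match self.policy {
                LateDataPolicy::Drop => Outcome::Dropped,
                LateDataPolicy::ForwardToStore => Outcome::Forwarded { window_start_ms, value },
            };
        }
        *self.open_windows.entry(window_start_ms).or_insert(0.0) += value;
        Outcome::Accumulated
    }

    /// Close and emit every window that starts before `boundary_ms`.
    fn close_before(&mut self, boundary_ms: i64) -> Vec<(i64, f64)> {
        let still_open = self.open_windows.split_off(&boundary_ms);
        let closed = std::mem::replace(&mut self.open_windows, still_open);
        self.closed_before_ms = self.closed_before_ms.max(boundary_ms);
        closed.into_iter().collect()
    }
}
```

The point of the sketch is that the closed-window check runs before any map insertion, so a late sample can be dropped or forwarded but can never leave an orphaned entry in `open_windows`.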
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract format-agnostic routing logic into route_decoded_samples() helper
and register a VictoriaMetrics remote write endpoint at /api/v1/import
alongside the existing Prometheus endpoint at /api/v1/write.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
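
One way to picture the format-agnostic routing: each endpoint decodes its own wire format into a shared sample type, and a single helper distributes those samples to workers. In the sketch below only the two endpoint paths and the `route_decoded_samples` name come from the commit message; `DecodedSample`, `Connector`, `connector_for_path`, `worker_for`, and the hash-based worker choice are assumptions, and the actual decoding of either wire format is left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical connector-independent sample produced by either decoder.
#[derive(Debug, Clone, PartialEq)]
struct DecodedSample {
    series_key: String,
    timestamp_ms: i64,
    value: f64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Connector {
    Prometheus,
    VictoriaMetrics,
}

/// Map an ingest path to the connector whose decoder should handle the body.
fn connector_for_path(path: &str) -> Option<Connector> {
    match path {
        "/api/v1/write" => Some(Connector::Prometheus),
        "/api/v1/import" => Some(Connector::VictoriaMetrics),
        _ => None,
    }
}

/// Pick a worker by hashing the series key, so one series always lands on
/// the same worker (an assumed routing rule).
fn worker_for(series_key: &str, num_workers: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    series_key.hash(&mut hasher);
    (hasher.finish() % num_workers as u64) as usize
}

/// Shared by both endpoints: group already-decoded samples into per-worker batches.
fn route_decoded_samples(samples: Vec<DecodedSample>, num_workers: usize) -> Vec<Vec<DecodedSample>> {
    assert!(num_workers > 0);
    let mut batches = vec![Vec::new(); num_workers];
    for sample in samples {
        let worker = worker_for(&sample.series_key, num_workers);
        batches[worker].push(sample);
    }
    batches
}
```

Because the helper only sees `DecodedSample` values, adding a further connector in this design would mean writing one more decoder and one more path mapping, with no change to the routing code.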
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Route each sample to exactly 1 pane (sub-window of size slide_interval)
instead of W overlapping window accumulators. When a window closes, merge
its constituent panes via AggregateCore::merge_with(). This reduces
per-sample accumulator updates from W to 1 for sliding windows.

- Add snapshot_accumulator() to AccumulatorUpdater trait (9 implementations)
- Add pane_start_for(), panes_for_window(), slide_interval_ms() to WindowManager
- Replace active_windows HashMap with active_panes BTreeMap in worker
- Rewrite sample routing and window close logic with pane merging
- Extract PrecomputeEngine to engine.rs, ingest handlers to ingest_handler.rs
- Update design doc with pane-based architecture

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
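
The pane scheme described in this commit can be sketched as below. The names `pane_start_for`, `panes_for_window`, `merge_with`, and `active_panes` follow the commit message, but their signatures here are guesses; `SumAcc` and `PaneWorker` are hypothetical stand-ins, and the sketch assumes the window size is an exact multiple of the slide interval and that windows are closed in start order.

```rust
use std::collections::BTreeMap;

/// Stand-in for a mergeable accumulator (the real engine merges sketches).
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct SumAcc {
    sum: f64,
    count: u64,
}

impl SumAcc {
    fn update(&mut self, value: f64) {
        self.sum += value;
        self.count += 1;
    }

    fn merge_with(&mut self, other: &SumAcc) {
        self.sum += other.sum;
        self.count += other.count;
    }
}

/// Hypothetical worker state: one accumulator per pane, keyed by pane start.
struct PaneWorker {
    window_size_ms: i64,
    slide_interval_ms: i64,
    active_panes: BTreeMap<i64, SumAcc>,
}

impl PaneWorker {
    fn new(window_size_ms: i64, slide_interval_ms: i64) -> Self {
        assert!(slide_interval_ms > 0 && window_size_ms % slide_interval_ms == 0);
        Self { window_size_ms, slide_interval_ms, active_panes: BTreeMap::new() }
    }

    /// Each timestamp belongs to exactly one pane of width slide_interval_ms.
    fn pane_start_for(&self, ts_ms: i64) -> i64 {
        ts_ms - ts_ms.rem_euclid(self.slide_interval_ms)
    }

    /// The W = window_size / slide_interval panes that make up one window.
    fn panes_for_window(&self, window_start_ms: i64) -> Vec<i64> {
        let panes = self.window_size_ms / self.slide_interval_ms;
        (0..panes).map(|i| window_start_ms + i * self.slide_interval_ms).collect()
    }

    /// One accumulator update per sample, regardless of W.
    fn ingest(&mut self, ts_ms: i64, value: f64) {
        let pane = self.pane_start_for(ts_ms);
        self.active_panes.entry(pane).or_default().update(value);
    }

    /// Merge the window's panes, then evict its oldest pane, which no later
    /// window needs when windows close in start order.
    fn close_window(&mut self, window_start_ms: i64) -> SumAcc {
        let mut merged = SumAcc::default();
        for pane_start in self.panes_for_window(window_start_ms) {
            if let Some(pane) = self.active_panes.get(&pane_start) {
                merged.merge_with(pane);
            }
        }
        self.active_panes.remove(&window_start_ms);
        merged
    }
}
```

For a 30 s window sliding every 10 s, W is 3: each sample touches one pane on ingest, and the cost of combining three panes is paid once per window close rather than once per sample.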
@zzylol force-pushed the pr/C-multi-connector-pane-sliding-window branch from 439dcf2 to e5b6abc on March 31, 2026 20:25
Base automatically changed from pr/B-e2e-tests-sliding-window-fix to main April 1, 2026 18:25
zzylol and others added 2 commits April 1, 2026 16:11
Resolve merge conflicts in precompute engine:
- mod.rs: keep modular engine.rs re-export over inlined engine code
- worker.rs: keep pane-based sample routing over window-based routing
- window_manager.rs: keep pane method tests added by this branch
- Cargo.toml: deduplicate [[bin]] entries
- Uncomment victoriametrics_remote_write decode function needed by multi-connector ingest
- Add missing late_data_policy field to PrecomputeEngineConfig in main.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@milindsrivastava1997 merged commit 5221041 into main on Apr 6, 2026
4 checks passed
@milindsrivastava1997 deleted the pr/C-multi-connector-pane-sliding-window branch on April 6, 2026 14:10

2 participants