Replace SimpleStore with inverted index (label -> BTreeMap<Time>)#175
Merged
milindsrivastava1997 merged 27 commits intomainfrom Mar 27, 2026
Conversation
Contributor
milindsrivastava1997
left a comment
There was a problem hiding this comment.
- Are there correctness tests to ensure that the new store functions same as the
per_keystore? - Why is the global store changed to have per key behavior?
- I didn't see "before" benchmark numbers in the PR.
milindsrivastava1997
requested changes
Mar 16, 2026
Contributor
milindsrivastava1997
left a comment
There was a problem hiding this comment.
Blocking on earlier comments
zzylol
added a commit
that referenced
this pull request
Mar 20, 2026
Adds six benchmark groups to measure the existing SimpleMapStore's algorithm complexity before the inverted-index replacement in PR #175: - insert/batch_size: O(B) insert scaling across 10–5000 items - insert/num_agg_ids: lock overhead across 1–200 aggregation IDs - query/range_store_size: O(W·log W + k) range query across 100–5000 windows - query/exact_store_size: O(1) HashMap lookup verified across store sizes - store_analyze/num_agg_ids: O(A) earliest-timestamp scan across 10–1000 IDs - concurrent_reads/thread_count: write-lock serialisation with 1–8 threads Both Global and PerKey lock strategies are profiled in each group. Results land in target/criterion/ as HTML reports. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Merged
2 tasks
Contributor
Author
|
@milindsrivastava1997 Added a separate store benchmark first: #211 |
2 tasks
zzylol
added a commit
that referenced
this pull request
Mar 20, 2026
…ry, and store analyze (#211) * Add Criterion benchmarks for SimpleMapStore performance profiling Adds six benchmark groups to measure the existing SimpleMapStore's algorithm complexity before the inverted-index replacement in PR #175: - insert/batch_size: O(B) insert scaling across 10–5000 items - insert/num_agg_ids: lock overhead across 1–200 aggregation IDs - query/range_store_size: O(W·log W + k) range query across 100–5000 windows - query/exact_store_size: O(1) HashMap lookup verified across store sizes - store_analyze/num_agg_ids: O(A) earliest-timestamp scan across 10–1000 IDs - concurrent_reads/thread_count: write-lock serialisation with 1–8 threads Both Global and PerKey lock strategies are profiled in each group. Results land in target/criterion/ as HTML reports. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix cargo fmt violations in bench and add dummy bench to Dockerfile - Apply rustfmt to simple_store_bench.rs (alignment, closure brace style) - Add dummy benches/simple_store_bench.rs stub to Dockerfile dep-cache layer so cargo can parse the [[bench]] manifest entry during docker build Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Expand benchmarks: larger ranges, KLL sketch accumulator - Increase parameter ranges (batch: 100→50k, windows: 500→50k, agg IDs: 1→1k/5k, threads: 1→16) - Add DatasketchesKLLAccumulator k=200 variant to insert and range query benchmarks to expose realistic sketch clone cost - KLL results show ~10x overhead on range queries vs trivial SumAccumulator Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix asap-planner-rs Dockerfile: add dummy bench stub for simple_store_bench Same fix as asap-query-engine/Dockerfile — the workspace Cargo.toml for asap-query-engine declares [[bench]] simple_store_bench, so Cargo requires the file to exist even during dependency-only builds. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor
Author
|
added correctness tests to ensure that the new store functions same as the per_key store #212 |
8a1cd59 to
ed49375
Compare
Contributor
Author
Now I moved original global to legacy/global.rs, and the current global.rs is for the modified simple store. |
5e666d0 to
3b26f8d
Compare
Contributor
Author
|
@milindsrivastava1997 , passed the API correctness test of simple store. |
Contributor
milindsrivastava1997
left a comment
There was a problem hiding this comment.
Not sure why renames aren't showing up. Maybe split this PR into (a) renames + corresponding changes to simple_store_bench, (b) other changes?
zzylol
pushed a commit
that referenced
this pull request
Mar 23, 2026
Move `SimpleMapStoreGlobal` → `LegacySimpleMapStoreGlobal` and `SimpleMapStorePerKey` → `LegacySimpleMapStorePerKey` under a new `simple_map_store/legacy/` submodule, in preparation for introducing optimised replacements that will reclaim the original names (PR #175 part b). - `legacy/global.rs` / `legacy/per_key.rs`: original implementations, renamed with the `Legacy` prefix throughout (struct, impl, log messages) - `legacy/mod.rs`: re-exports both legacy types - `simple_map_store/mod.rs`: references legacy module; `SimpleMapStore` enum now wraps `LegacySimpleMapStoreGlobal` / `LegacySimpleMapStorePerKey` - `benches/simple_store_bench.rs`: doc comment updated to reflect that the bench profiles the legacy store implementation Public API (`SimpleMapStore`, `Store`) is unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 tasks
milindsrivastava1997
pushed a commit
that referenced
this pull request
Mar 23, 2026
…220) Move `SimpleMapStoreGlobal` → `LegacySimpleMapStoreGlobal` and `SimpleMapStorePerKey` → `LegacySimpleMapStorePerKey` under a new `simple_map_store/legacy/` submodule, in preparation for introducing optimised replacements that will reclaim the original names (PR #175 part b). - `legacy/global.rs` / `legacy/per_key.rs`: original implementations, renamed with the `Legacy` prefix throughout (struct, impl, log messages) - `legacy/mod.rs`: re-exports both legacy types - `simple_map_store/mod.rs`: references legacy module; `SimpleMapStore` enum now wraps `LegacySimpleMapStoreGlobal` / `LegacySimpleMapStorePerKey` - `benches/simple_store_bench.rs`: doc comment updated to reflect that the bench profiles the legacy store implementation Public API (`SimpleMapStore`, `Store`) is unchanged. Co-authored-by: zz_y <zz_y@node0.zz-y-296227.softmeasure-pg0.wisc.cloudlab.us> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1b14f94 to
da0390a
Compare
da0390a to
02d2147
Compare
zzylol
added a commit
that referenced
this pull request
Mar 24, 2026
#221) * Rename SimpleMapStore impls to Legacy* and move to legacy/ submodule (#220) Move `SimpleMapStoreGlobal` → `LegacySimpleMapStoreGlobal` and `SimpleMapStorePerKey` → `LegacySimpleMapStorePerKey` under a new `simple_map_store/legacy/` submodule, in preparation for introducing optimised replacements that will reclaim the original names (PR #175 part b). - `legacy/global.rs` / `legacy/per_key.rs`: original implementations, renamed with the `Legacy` prefix throughout (struct, impl, log messages) - `legacy/mod.rs`: re-exports both legacy types - `simple_map_store/mod.rs`: references legacy module; `SimpleMapStore` enum now wraps `LegacySimpleMapStoreGlobal` / `LegacySimpleMapStorePerKey` - `benches/simple_store_bench.rs`: doc comment updated to reflect that the bench profiles the legacy store implementation Public API (`SimpleMapStore`, `Store`) is unchanged. Co-authored-by: zz_y <zz_y@node0.zz-y-296227.softmeasure-pg0.wisc.cloudlab.us> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix cargo fmt formatting in simple_map_store/mod.rs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: zz_y <zz_y@node0.zz-y-296227.softmeasure-pg0.wisc.cloudlab.us> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
zzylol
added a commit
that referenced
this pull request
Mar 24, 2026
…220) Move `SimpleMapStoreGlobal` → `LegacySimpleMapStoreGlobal` and `SimpleMapStorePerKey` → `LegacySimpleMapStorePerKey` under a new `simple_map_store/legacy/` submodule, in preparation for introducing optimised replacements that will reclaim the original names (PR #175 part b). - `legacy/global.rs` / `legacy/per_key.rs`: original implementations, renamed with the `Legacy` prefix throughout (struct, impl, log messages) - `legacy/mod.rs`: re-exports both legacy types - `simple_map_store/mod.rs`: references legacy module; `SimpleMapStore` enum now wraps `LegacySimpleMapStoreGlobal` / `LegacySimpleMapStorePerKey` - `benches/simple_store_bench.rs`: doc comment updated to reflect that the bench profiles the legacy store implementation Public API (`SimpleMapStore`, `Store`) is unchanged. Co-authored-by: zz_y <zz_y@node0.zz-y-296227.softmeasure-pg0.wisc.cloudlab.us> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
02d2147 to
7032bfd
Compare
Defines a run_contract_suite() function that tests every observable
behaviour of a Store implementation:
- Empty-store edge cases (range query, exact query, earliest timestamp)
- Single insert: range query hit/miss, exact query hit/miss (wrong start,
wrong end)
- Batch insert: count correctness, chronological ordering guaranteed
- Partial range filtering (windows outside query range excluded)
- Aggregation-ID isolation (inserts into agg 1 not visible to agg 2)
- Earliest-timestamp tracking: global minimum, per agg-ID
- Cleanup — CircularBuffer: oldest window evicted, newest 8 retained
- Cleanup — ReadBased: evicted after threshold reads, unread window kept
- Concurrency: 8-thread concurrent inserts (no data loss),
8-thread concurrent reads (each returns full result set)
Two test entry points exercise both existing implementations:
contract_per_key — LockStrategy::PerKey (reference)
contract_global — LockStrategy::Global
Adding a new Store implementation requires only a new #[test] function
that calls run_contract_suite() with the new factory.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ltaSet exclusion - Add SerializableToSink import so clone-fidelity tests compile on concrete types - Clone fidelity tests for all 11 accumulator types: SumAccumulator, MinMaxAccumulator, DatasketchesKLLAccumulator, IncreaseAccumulator, MultipleSumAccumulator, MultipleMinMaxAccumulator, SetAggregatorAccumulator, DeltaSetAggregatorAccumulator, CountMinSketchAccumulator, CountMinSketchWithHeapAccumulator, HydraKllSketchAccumulator - Three keyed-entry tests: grouping by key, coexistence of keyed/unkeyed, multiple keys per window - DeltaSetAggregator cleanup exclusion test - Concurrency tests: concurrent inserts (8 threads) and concurrent reads Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add separate time and space complexity tables with named variables (A, L, N, k, m, V), per-operation breakdowns, and a note on Arc-sharing eliminating deep copies on the read path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. Label interning (MetricID: u32) Introduce InternTable in common.rs that assigns a compact u32 to each unique Option<KeyByLabelValues> on first insert. All internal maps (label_map, window_to_ids) use MetricID as key — O(1) hash/compare instead of O(label_bytes). Label strings stored once; resolved back to KeyByLabelValues only when building the returned TimestampedBucketsMap. 2. Time-epoch partitioning + O(1) rotation cleanup Replace the single BTreeMap-per-label with epoch-partitioned storage: BTreeMap<EpochID, EpochData>. epoch_capacity = num_aggregates_to_retain (set on first insert). When the current epoch reaches capacity, a new epoch is opened and the oldest is dropped — O(1) drop vs the previous O(k·m) BTreeSet walk + targeted BTreeMap removals. ReadBased cleanup scans read_counts then calls EpochData::remove_windows on each epoch. 3. Sorted Vec posting lists Replace HashSet<Option<KeyByLabelValues>> in window_to_labels with Vec<MetricID> maintained in sorted order via partition_point + insert. Cache-friendly iteration for exact queries; enables merge-intersection for future label-predicate pushdown. Both per_key.rs and global.rs updated. Shared types extracted to common.rs. Public Store trait interface and all 329 existing tests are unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add dummy benches/simple_map_store_benchmark.rs in Dockerfile dep-caching layer so cargo can parse the manifest (bench entry in Cargo.toml) - Run cargo fmt to fix import ordering and line-wrapping in common.rs, global.rs, and per_key.rs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Save old time-primary SimpleMapStorePerKey as LegacySimpleMapStorePerKey (deprecated, kept only for benchmarking) - Expose it as pub mod per_key_legacy in the store module - Add old_vs_new/* benchmark group comparing legacy vs current on insert, range_query, exact_query, concurrent_reads, and scaling Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…cated - Add MetricBucketMap type alias in common.rs to fix type_complexity - Use MetricBucketMap in global.rs and per_key.rs - Remove unused `mut` from `results` in global.rs - Collapse nested `if` into single condition in per_key.rs - Add #[allow(deprecated)] to impl blocks in per_key_legacy.rs - Add #![allow(deprecated)] to benchmark file for legacy store usage Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…chs, O(1) flat storage
1. Time-primary range scan (common.rs MutableEpoch):
- window_to_ids: HashMap → BTreeMap, enabling O(log N + actual_matches) range scan
- Range queries no longer visit every label; only labels with data in matched windows
- For high-cardinality sparse data: O(L·log N) → O(log N + actual_matches)
2. Immutable sealed epochs (common.rs SealedEpoch):
- On epoch rotation, MutableEpoch is sealed into a flat sorted Vec<(TimestampRange, MetricID, Arc<Agg>)>
- Sorted by (TimestampRange, MetricID): windows contiguous, cache-friendly linear scan
- Range queries use partition_point (binary search) then linear scan — no pointer chasing
- min_tr/max_tr precomputed for O(1) epoch-skip check
3. Flat primary storage (MutableEpoch):
- label_map: HashMap<MetricID, BTreeMap<TimestampRange, Vec<Arc<Agg>>>> replaced by
data: HashMap<(MetricID, TimestampRange), Vec<Arc<Agg>>>
- O(1) insert and point lookup; no nested BTreeMap traversal
per_key.rs / global.rs updated:
- StoreKeyData/PerKeyState now hold current_epoch: MutableEpoch + sealed_epochs: BTreeMap<EpochID, SealedEpoch>
- maybe_rotate_epoch calls MutableEpoch::seal() and inserts into sealed_epochs
- cleanup_read_based updated for current + sealed epoch split
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MutableEpoch is now a plain append-only Vec (no BTreeMap, no sorted Vec maintenance during writes). All sorting is deferred to seal() — paid once at epoch rotation, not on every insert. Mirrors VictoriaMetrics' rawRows → in-memory part pipeline. Insert path: Vec::push + HashSet::insert + 2 scalar min/max updates = O(1). seal() path: sort_unstable_by_key = O(M log M), called once per epoch. Query on active epoch: linear scan O(M), bounded by epoch_capacity × L. Acceptable since sealed epochs hold most historical data and use binary search. Replace min_tr/max_tr (TimestampRange) with min_start/max_end (u64) in both MutableEpoch and SealedEpoch for accurate epoch-skip bounds. Callers updated to use time_bounds() -> Option<(u64, u64)>. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MutableEpoch now maintains a secondary index: window_to_ids: HashMap<TimestampRange, Vec<(MetricID, Arc<Agg>)>> On insert: Arc::clone into window_to_ids (refcount bump only, no data copy) + Vec::push — O(1) amortized, no change to insert complexity. exact_query: HashMap lookup O(1) + iterate m entries O(m), no raw scan. Previously O(M) linear scan of all raw entries. range_query_into: unchanged linear scan of raw (O(M), bounded). seal(): unchanged — window_to_ids dropped, raw is sorted in-place. remove_windows: also removes from window_to_ids. Memory: one extra Arc pointer (8 bytes) per inserted entry — cheap. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ndex, batch hoisting) - Columnar storage: three parallel arrays (windows_col, metric_ids_col, aggregates_col) instead of row tuples; range scan hot loop touches only windows_col - Lazy window_to_ids index: built on first exact_query after a write batch, invalidated cheaply (= None) on insert — zero index maintenance on hot path - Monotonic ingest fast path: skip windows_set HashSet probe for consecutive same-window inserts (Opt 3) - with_capacity hint: MutableEpoch::with_capacity uses previous epoch len to avoid Vec reallocation during next epoch fill - Batch metadata hoisting in global/per_key: config lookup, metrics.insert, earliest_ts update, items_inserted count moved from per-item to per-group Results: ~2x insert, ~3x range query (Sum), ~27x range query (KLL) vs legacy. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…h design Replace the old EpochData/BTreeMap description with the current implementation: - MutableEpoch: columnar storage (3 parallel arrays), lazy window_to_ids index, monotonic ingest fast path, incremental bounds tracking - SealedEpoch: flat sorted Vec with binary-search range/exact queries - Updated complexity table (insert O(1) amortized, range O(log N + k) on sealed) - Updated query mechanics to reflect write-lock for exact query (lazy index build) - Remove asap-query-engine/docs/simple_store_insert_profile.md (profiling scratch doc) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The epoch-based rotation was dropping an entire epoch (epoch_capacity windows) when the retention limit was exceeded. Legacy evicts exactly (total - retention_limit) windows, which can be fewer than one full epoch. Fix: after sealing, compute total distinct windows across all epochs in O(E) using SealedEpoch::distinct_window_count() (precomputed at seal time, updated in remove_windows). Evict the minimum number of oldest windows from the oldest sealed epoch(s) needed to reach retention_limit, dropping the epoch only when fully emptied. Also adds the eviction check to the non-rotation path so the 9th window (still in the current partial epoch) triggers eviction of the 1st window. Fixes: contract_global and contract_per_key (both now pass). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
These files were modified during early development on this branch but those changes were superseded by the new global.rs/per_key.rs stores. Reset to simple_store_opt versions so they don't appear as unintended changes in the PR. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
7032bfd to
8e0b4e3
Compare
…d ordering Legacy global/per_key stores were returning Box from clone_boxed_core() but TimestampedBucket now expects Arc. Add .into() at the 4 call sites. Also reorder mod.rs to put `mod common` before `pub mod legacy` for cargo fmt. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…chable Without pub mod global and pub mod per_key in mod.rs, the compiler sees all types in common.rs as dead code and fails under -D warnings. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
milindsrivastava1997
approved these changes
Mar 27, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
SimpleMapStoreused a time-primary layout that caused:O(N log N + k)clone_boxed_core()) on every readSolution
Three structural changes inspired by VictoriaMetrics' IndexDB:
MetricID: u32. All internal maps key onu32instead of the full label struct.O(1)(drop oldest); range queries skip non-overlapping epochs entirely.windows_col,metric_ids_col,aggregates_col) instead of row tuples. Range scans touch onlywindows_col; aggregate pointers are only chased on matches. The exact-lookup index (window_to_ids) is built lazily on first query after a write batch and cached — zero index maintenance on the insert hot path.Complexity changes
O(1)O(1)amortized (lazy index + monotonic fast path)O(N log N + k)O(log N + k)(binary search on sealed epoch; linear scan on mutable)O(k)deep copiesO(m)Arc::clone()O(N log N)O(1)Changes
common.rs—InternTable,MutableEpoch(columnar + lazy index),SealedEpoch(sorted Vec)per_key.rs— epoch rotation withwith_capacityhint, batch metadata hoisting,write()lock for exact queryglobal.rs— same epoch/interning model; singleMutex<StoreData>; pre-grouped batch insertstraits.rs/simple_engine.rs/conversion.rs—Arcbuckets, removeclone_boxed_core()on read pathsbenches/simple_store_bench.rs, Criterion wiring inCargo.tomlBenchmark Results
Linux x86-64, release build, Criterion. Medians; lower is better.
Insert —
insert/batch_size(Sum, NoCleanup)Range query —
query/range_store_sizeKLL benefit is outsized because the columnar scan never touches aggregate pointers for non-matching windows, eliminating KLL deserialization for the ~95–99 % of entries that don't match a typical range.
Exact query —
query/exact_store_size~330 ns on both legacy and new across all store sizes — parity.