Replace segcache hashtable with lock-free N-choice implementation #10

Merged
thinkingfish merged 7 commits into pelikan-io:main from brayniac:brayniac/replace-segcache-hashtable
Apr 16, 2026

@brayniac brayniac commented Apr 16, 2026

Summary

  • Replace single-choice chaining hashtable with a new lock-free N-choice implementation
  • SIMD-accelerated tag scanning (AVX2 on x86_64, NEON on aarch64, scalar fallback)
  • Lock-free AtomicU64 CAS per slot — preparation for dropping &mut self requirement
  • N-choice hashing (2-choice default) with 8 item slots per bucket replaces overflow chain buckets
  • Bucket and segment data prefetch for memory latency hiding
  • Storage-agnostic KeyVerifier trait decouples hashtable from segment internals
  • In-table ghost entry support via Location::GHOST sentinel

The hashtable module files (hashtable/location.rs, bucket.rs, table.rs, traits.rs, mod.rs, plus sync.rs) are entirely new code. Remaining files (segments/, ttl_buckets/, orchestration) have minimal call-site changes and will be fully replaced in subsequent phases.
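
To make the summary concrete, here is a minimal sketch of 2-choice hashing over 8-slot buckets with a per-slot `AtomicU64` CAS. It is not the code from this PR: the `Table` type, the 16-bit tag / 48-bit location packing, the use of `DefaultHasher`, and the scalar scan are all assumptions chosen for illustration, and key verification, upsert, and SIMD scanning are left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

const SLOTS_PER_BUCKET: usize = 8;
const EMPTY: u64 = 0;
// Assumed layout: high 16 bits hold a tag, low 48 bits hold a location.
const LOCATION_MASK: u64 = (1 << 48) - 1;

/// Hypothetical table: every slot is an item slot and can be CAS-ed alone.
pub struct Table {
    buckets: Vec<[AtomicU64; SLOTS_PER_BUCKET]>,
    mask: u64,
}

impl Table {
    pub fn new(bucket_power: u32) -> Self {
        let n = 1usize << bucket_power;
        let buckets: Vec<[AtomicU64; SLOTS_PER_BUCKET]> = (0..n)
            .map(|_| std::array::from_fn(|_| AtomicU64::new(EMPTY)))
            .collect();
        Table { buckets, mask: n as u64 - 1 }
    }

    fn hash(key: &[u8], seed: u64) -> u64 {
        let mut h = DefaultHasher::new();
        seed.hash(&mut h);
        key.hash(&mut h);
        h.finish()
    }

    /// 16-bit tag, forced non-zero so a packed slot never equals EMPTY.
    fn tag(key: &[u8]) -> u64 {
        (Self::hash(key, 0) >> 48) | 1
    }

    /// The two candidate buckets for a key (they may coincide).
    fn choices(&self, key: &[u8]) -> [usize; 2] {
        [
            (Self::hash(key, 1) & self.mask) as usize,
            (Self::hash(key, 2) & self.mask) as usize,
        ]
    }

    /// Takes `&self`: the first successful CAS on an empty slot wins.
    pub fn insert(&self, key: &[u8], location: u64) -> bool {
        let packed = (Self::tag(key) << 48) | (location & LOCATION_MASK);
        for b in self.choices(key) {
            for slot in &self.buckets[b] {
                if slot
                    .compare_exchange(EMPTY, packed, Ordering::AcqRel, Ordering::Relaxed)
                    .is_ok()
                {
                    return true;
                }
            }
        }
        false // both candidate buckets are full
    }

    /// Scalar tag scan; returns the first tag match without verifying the key.
    pub fn get(&self, key: &[u8]) -> Option<u64> {
        let tag = Self::tag(key);
        for b in self.choices(key) {
            for slot in &self.buckets[b] {
                let v = slot.load(Ordering::Acquire);
                if v != EMPTY && v >> 48 == tag {
                    return Some(v & LOCATION_MASK);
                }
            }
        }
        None
    }
}
```

Because `insert` only needs `&self`, a table like this can be shared across threads without an outer lock, which is the property the CAS-per-slot design is preparing for. A tag match alone can be a false positive, so a real lookup would still confirm the key against storage.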

Test plan

  • cargo build --workspace --all-features
  • cargo test --workspace — 88 tests pass
  • cargo test -p segcache --features debug — debug feature tests pass
  • cargo clippy -p segcache --all-features -- -D warnings — clean
  • cargo fmt --all --check — clean

🤖 Generated with Claude Code

brayniac and others added 2 commits April 16, 2026 09:50
Replace the Twitter-copyrighted single-choice chaining hashtable with a
new lock-free implementation based on crucible's MultiChoiceHashtable.

Key changes:
- N-choice hashing (2-choice default) replaces overflow chain buckets
- All 8 slots per bucket are item slots (no metadata slot), enabling SIMD
- SIMD tag scanning on AVX2 (x86_64) and NEON (aarch64) with scalar fallback
- Lock-free AtomicU64 CAS operations on every slot (prep for dropping &mut self)
- Bucket and segment data prefetch for latency hiding
- In-table ghost entry support via Location::GHOST sentinel
- Storage-agnostic design via KeyVerifier trait + opaque Location type
- SegmentsVerifier adapter bridges existing Segments to new KeyVerifier

This is Phase 1 of a multi-phase effort to replace all Twitter-copyrighted
code with clean implementations. The hashtable module files are entirely
new code. Files in segments/, ttl_buckets/, and the orchestration layer
have minimal signature changes to work with the new hashtable API and
will be fully replaced in subsequent phases.
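
As a rough illustration of the storage-agnostic design, the sketch below shows how a verifier trait and an opaque location type can keep the table ignorant of where items live. Only the names `KeyVerifier`, `Location`, and `Location::GHOST` come from this PR; the method signature, the sentinel value, the `VecStore` adapter (a stand-in for something like `SegmentsVerifier`), and the `resolve` helper are hypothetical.

```rust
/// Opaque handle to an item in some backing store (representation assumed).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Location(u64);

impl Location {
    /// Assumed sentinel: marks an entry whose data is gone but whose
    /// presence in the table is still tracked.
    pub const GHOST: Location = Location((1 << 48) - 1);

    pub fn new(raw: u64) -> Self {
        Location(raw)
    }

    pub fn is_ghost(self) -> bool {
        self == Self::GHOST
    }
}

/// Hypothetical shape of the trait: the table asks the store whether the
/// item at `location` really holds `key`.
pub trait KeyVerifier {
    fn verify(&self, key: &[u8], location: Location) -> bool;
}

/// Toy store used only for this example.
pub struct VecStore {
    pub keys: Vec<Vec<u8>>,
}

impl KeyVerifier for VecStore {
    fn verify(&self, key: &[u8], location: Location) -> bool {
        if location.is_ghost() {
            return false; // ghosts have no backing data to compare against
        }
        self.keys
            .get(location.0 as usize)
            .map_or(false, |stored| stored.as_slice() == key)
    }
}

/// Given tag-matching candidates, return the first one the store confirms.
pub fn resolve<V: KeyVerifier>(
    candidates: &[Location],
    key: &[u8],
    verifier: &V,
) -> Option<Location> {
    candidates
        .iter()
        .copied()
        .find(|&loc| verifier.verify(key, loc))
}
```

With this split, swapping the segment implementation in a later phase would only require a new `KeyVerifier` implementation, not changes to the table itself.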

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The repo-level LICENSE-MIT and LICENSE-APACHE files cover all code.
Per-file headers are unnecessary for new files that have no Twitter
copyright to preserve.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@brayniac brayniac marked this pull request as draft April 16, 2026 16:57
brayniac and others added 4 commits April 16, 2026 10:00
Add optional `loom` feature with 8 model-checked concurrency tests:
- Concurrent insert of different keys
- Concurrent insert of same key (upsert)
- Concurrent lookup with frequency updates
- Concurrent insert and remove
- Concurrent CAS operations (2-way and 3-way)
- Direct slot CAS contention
- Three-way insert of different keys

The sync module conditionally uses loom atomics when the feature is
enabled, and SIMD tag scanning falls back to scalar under loom since
loom replaces atomic operations.
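
A conditional sync module of this kind is usually a pair of cfg-gated re-exports. The sketch below assumes the feature is literally named `loom` and that only `AtomicU64` and `Ordering` are needed; the `SIMD_ALLOWED` constant and the `try_claim` helper are hypothetical and stand in for whatever the real module gates.

```rust
mod sync {
    // Under the `loom` feature, use loom's instrumented atomics so the
    // model checker can explore thread interleavings.
    #[cfg(feature = "loom")]
    pub use loom::sync::atomic::{AtomicU64, Ordering};

    // Otherwise fall through to the standard library.
    #[cfg(not(feature = "loom"))]
    pub use std::sync::atomic::{AtomicU64, Ordering};
}

use sync::{AtomicU64, Ordering};

/// Hypothetical switch: SIMD scanning reads slot memory directly, which a
/// loom atomic does not expose, so it is disabled under loom.
pub const SIMD_ALLOWED: bool = cfg!(not(feature = "loom"));

/// Claim an empty slot (0) for `value`; exactly one concurrent caller succeeds.
pub fn try_claim(slot: &AtomicU64, value: u64) -> bool {
    slot.compare_exchange(0, value, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}
```

The rest of the crate imports atomics from `sync` rather than `std`, so the same code paths are exercised in both normal tests and loom model runs.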

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@brayniac brayniac marked this pull request as ready for review April 16, 2026 17:12
The builder's hash_power(N) now means total capacity is 2^N item slots,
matching the original API semantics. The MultiChoiceHashtable divides
by 8 internally to get the bucket count. Minimum power is 7 (128 slots).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
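
The sizing rule in this commit works out as follows. This is a sketch of the arithmetic only: the function name is hypothetical, and returning `None` for a power below the minimum is an assumption (the real builder may clamp or return an error instead).

```rust
const SLOTS_PER_BUCKET: u64 = 8;
const MIN_HASH_POWER: u8 = 7; // 2^7 = 128 item slots

/// Number of buckets for a table with 2^hash_power item slots.
pub fn bucket_count(hash_power: u8) -> Option<u64> {
    if hash_power < MIN_HASH_POWER || hash_power >= 64 {
        return None; // assumption: out-of-range powers are rejected
    }
    Some((1u64 << hash_power) / SLOTS_PER_BUCKET)
}
```

For example, `hash_power(7)` gives 128 slots in 16 buckets, and `hash_power(16)` gives 65,536 slots in 8,192 buckets.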
@brayniac brayniac requested a review from thinkingfish April 16, 2026 17:26
@thinkingfish thinkingfish merged commit 7f85e0b into pelikan-io:main Apr 16, 2026
4 checks passed