perf: cache DAG, batch top-k metadata, add full-pipeline recall benches #8
JosephOIbrahim wants to merge 1 commit into
Conversation
Three perf / measurement improvements from the second-pass MoE review. None overlap with PR #6 or PR #7.

* `src/mock_cogexec.py` — cache the cognitive DAG at module load. The DAG shape is a constant (7 nodes, 5 edges), yet `evaluate_dag` previously called `build_dag()` + `topological_sort()` on every exchange — hundreds of times in a typical session. A module-level cache turns ~N rebuilds into 1.

* `python/harlo/encoder/__init__.py` — batch the top-k metadata fetch in `semantic_recall`. Previously: load all candidate SDRs once (correct), then for EACH top-k id execute a separate `SELECT ... WHERE id = ?` + `fetchone()`. Now: one `SELECT ... WHERE id IN (?,?,...)` builds a `{id: row}` map, and the ordering loop reads from the map. k+1 round-trips → 2 statements.

* `crates/hippocampus/benches/recall.rs` — add three full-pipeline benches alongside the existing `xor_search` micro-benches:
  - `load_all_sdrs 100k` — SQLite scan + blob read
  - `recall_full 10k k=5` — end-to-end load + search + decay + result
  - `recall_full 100k k=5` — same at the constitutional 100K size

  The existing `xor_search` bench measures only popcount + heap over pre-loaded candidates; if SQLite I/O dominates the real recall path, optimising popcount with SIMD is wasted work. Measure before optimising.

Verified:

- `cargo test -p hippocampus` — 42 passed
- `cargo bench --no-run -p hippocampus` — bench compiles clean
- 207 adjacent tests pass (motor/brainstem/elenchus/hippocampus/composition/inquiry/sync/hot_store); 7 skipped on pre-existing env issues (pxr/mcp not installed in sandbox)
- Encoder tests env-blocked by missing numpy; `encoder/__init__.py` syntax-clean via `py_compile`
- Constitutional greps still zero

https://claude.ai/code/session_017arHKzx5mTUFiry7JhhRPs
Summary
Three performance / measurement improvements from the second-pass MoE review (the "PR C" lane from the strategic split). Independent of PR #6 and PR #7 — no file overlap.
* `src/mock_cogexec.py` — `evaluate_dag` reads from `_DAG` / `_DAG_ORDER` instead of rebuilding
* `python/harlo/encoder/__init__.py` — replace the per-id `SELECT ... WHERE id = ?` + `fetchone()` loop in `semantic_recall` with a single `SELECT ... WHERE id IN (?,?,...)`; build a `{id: row}` map and read in distance order
* `crates/hippocampus/benches/recall.rs` — three full-pipeline benches alongside the `xor_search` micro-bench: `load_all_sdrs 100k`, `recall_full 10k k=5`, `recall_full 100k k=5`

What's intentionally NOT in this PR
* … (`_CONSENT_SECRET` regen). Each requires a constitutional-intent decision before any code change.
* Anything covered by PR #6 (`_engine_lock` timeout, HotStore commit) or PR fix(salvage): apoptosis transaction + decompile listener-before-reset + audit regression test #7 (apoptosis transaction, decompile listener reorder).

Test plan
* `cargo test -p hippocampus` — 42 passed, 0 failed
* `cargo bench --no-run -p hippocampus` — new benches compile clean
* `python3 -m py_compile` on `src/mock_cogexec.py` and `python/harlo/encoder/__init__.py` — syntax clean
* Adjacent suites (`tests/test_motor/ test_brainstem/ test_elenchus/ test_hippocampus/ test_composition/ test_inquiry/ test_sync/ test_hot_store/`) — 207 passed, 7 skipped (pre-existing env skips for `pxr`/`mcp`), 0 failed
* Encoder tests are env-blocked: `python/harlo/encoder/semantic_encoder.py:12` imports `numpy`, which isn't installed. The N+1 fix is syntax-clean and behaviourally equivalent (same ordering, same result keys, fewer round-trips); local CI with numpy can confirm.
* `cargo bench -p hippocampus --bench recall -- --warm-up-time 1 --measurement-time 3 --sample-size 20` will produce numbers for `load_all_sdrs 100k` and `recall_full 100k k=5` — those are the headline diagnostics this PR exists to enable.

Compliance
The 33 inviolable rules in CLAUDE.md are unchanged. R1 / R7 are pure perf; R8 is a pure additive bench. Constitutional greps (`sleep(`, `while True`, `DELETE.*audit`, `float32`, `cosine`, `reasoning_trace` in `elenchus/verifier.py`) all return zero.

https://claude.ai/code/session_017arHKzx5mTUFiry7JhhRPs
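The N+1 fix described in the summary (one `IN (?,?,...)` statement plus a `{id: row}` map, read back in distance order) can be sketched as below. This assumes an illustrative `traces(id, meta)` table; the encoder's real schema, table, and column names are not shown in this PR.

```python
import sqlite3

def fetch_metadata_batched(conn, top_k_ids):
    """One SELECT ... WHERE id IN (?,?,...) instead of one SELECT per id.

    top_k_ids is assumed to already be in distance order; the result
    preserves that order, matching the behaviour of the old per-id loop.
    """
    if not top_k_ids:
        return []
    placeholders = ",".join("?" * len(top_k_ids))  # "?,?,?" for k=3
    rows = conn.execute(
        f"SELECT id, meta FROM traces WHERE id IN ({placeholders})",
        top_k_ids,
    ).fetchall()
    by_id = {row[0]: row for row in rows}  # the {id: row} map
    # Read back in distance order; ids missing from the table are skipped.
    return [by_id[i] for i in top_k_ids if i in by_id]
```

With the candidate-SDR load counted, this is 2 statements total versus the previous k+1; SQL text with only interpolated `?` placeholders stays injection-safe since the values travel as bound parameters.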
Generated by Claude Code