fix(#82): rebuild-index large-data OOM - slim delta cache + lazy eviction #87

Merged
lis186 merged 1 commit into main from fix/82-rebuild-index-oom on Jun 18, 2026
Conversation

@lis186 lis186 commented Jun 18, 2026

Owner

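Summary

Fixes an OOM in rebuild-index on large logs directories (8.3GB / 65K files). The root cause is that the cache Map in reconstructReq kept every reconstructed full parsedBody in memory (including a 50-100KB system prompt plus 10-50KB of tools), which accumulated to 4GB+ and hit the V8 heap limit.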
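A rough back-of-envelope check of the figures quoted above, as a sketch only. It assumes (this is not stated in the PR) that each of the ~65K files leaves exactly one full parsedBody in the cache, uses decimal units, and ignores the messages array itself:

```typescript
// Assumption: one retained cache entry per log file; the real ratio may differ.
const FILES = 65000;
const KB = 1000; // decimal kilobytes, for a rough estimate only

// Per-entry overhead from the PR text: system prompt 50-100KB + tools 10-50KB.
const perEntryMinBytes = (50 + 10) * KB;
const perEntryMaxBytes = (100 + 50) * KB;

const toGB = (bytes: number): number => bytes / 1e9;

const lowGB = toGB(FILES * perEntryMinBytes);
const highGB = toGB(FILES * perEntryMaxBytes);

console.log(`retained system+tools: ~${lowGB} GB to ~${highGB} GB`);
```

Under that assumption the low end lands near the 4GB+ figure reported above, which is consistent with the retained system and tools fields dominating the cache.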

Changes

Two-layer fix:

  1. Stripped cache in reconstructReq: The cache stores only { parsedBody: { messages } }, the minimum needed for downstream delta splicing. The full parsedBody (system, tools, metadata) is returned to the caller but NOT retained. This alone cuts cache memory by ~80%.

  2. Lazy refcount eviction in Pass 2: After projecting each turn through buildEntryFields, entries with no remaining downstream consumers are deleted from cache. An ancestorRefs Map (built from recon entries' prevId fields) tracks reference counts. Memory bound: O(active chain depth × messages-only size).
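The two layers above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the ReconEntry shape, the delta field, the rebuildIndex wrapper, and the inline stand-in for buildEntryFields are all hypothetical. It also folds reconstruction and projection into one loop for brevity and assumes ancestors appear before their descendants.

```typescript
type Message = { role: string; content: string };

// Hypothetical log-entry shape; the real fields may differ.
interface ReconEntry {
  id: string;
  prevId: string | undefined; // ancestor whose messages this delta extends
  system: string;
  tools: string[];
  delta: Message[];
}

interface CacheEntry { parsedBody: { messages: Message[] } }

function reconstructReq(entry: ReconEntry, cache: Map<string, CacheEntry>) {
  let base: Message[] = [];
  if (entry.prevId !== undefined) {
    const prev = cache.get(entry.prevId);
    if (!prev) throw new Error(`ancestor ${entry.prevId} not in cache`);
    base = prev.parsedBody.messages;
  }
  const messages = [...base, ...entry.delta];
  // Layer 1: retain only the messages array, not system/tools.
  cache.set(entry.id, { parsedBody: { messages } });
  // The full body goes back to the caller; prevId is returned additively.
  return {
    parsedBody: { system: entry.system, tools: entry.tools, messages },
    prevId: entry.prevId,
  };
}

function rebuildIndex(entries: ReconEntry[]) {
  const cache = new Map<string, CacheEntry>();
  // Count how many downstream entries still need each ancestor.
  const ancestorRefs = new Map<string, number>();
  for (const e of entries) {
    if (e.prevId !== undefined) {
      ancestorRefs.set(e.prevId, (ancestorRefs.get(e.prevId) ?? 0) + 1);
    }
  }
  const projected: { id: string; messageCount: number }[] = [];
  let cachePeakSize = 0;
  for (const e of entries) {
    const { parsedBody, prevId } = reconstructReq(e, cache);
    cachePeakSize = Math.max(cachePeakSize, cache.size);
    // Stand-in for buildEntryFields: project the turn into index fields.
    projected.push({ id: e.id, messageCount: parsedBody.messages.length });
    // Layer 2: evict entries that have no remaining consumers.
    if (!ancestorRefs.has(e.id)) cache.delete(e.id); // leaf: nobody splices onto it
    if (prevId !== undefined) {
      const left = (ancestorRefs.get(prevId) ?? 1) - 1;
      if (left <= 0) {
        ancestorRefs.delete(prevId);
        cache.delete(prevId); // last consumer has finished
      } else {
        ancestorRefs.set(prevId, left);
      }
    }
  }
  return { projected, cacheFinalSize: cache.size, cachePeakSize };
}

// Usage: 2 sessions × 3 turns, mirroring the shape of the new test.
const entries: ReconEntry[] = [];
for (const s of ["s1", "s2"]) {
  for (let t = 1; t <= 3; t++) {
    entries.push({
      id: `${s}-t${t}`,
      prevId: t > 1 ? `${s}-t${t - 1}` : undefined,
      system: "system prompt",
      tools: ["tool-a"],
      delta: [{ role: "user", content: `turn ${t}` }],
    });
  }
}
const result = rebuildIndex(entries);
console.log(result.cacheFinalSize, result.cachePeakSize); // 0 2
```

In this toy run the cache never holds more than two entries (an ancestor and the turn being spliced onto it) and is empty at the end, which is the property the cacheFinalSize === 0 assertion checks.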

What's NOT changed

  • reconstructReq interface is additive only (returns additional prevId field)
  • Two-pass structure preserved (reconstruction → projection)
  • All existing invariants maintained: merge-only, never-degrade, atomic write, hub-safe

Verification (per verification-principles.md)

  • Red → Green: Test cacheFinalSize === 0 fails on main (cache never evicted), passes on fix branch
  • Worktree diff check: Copied the new test into the main worktree and confirmed it FAILS; confirmed it PASSES on the fix branch
  • 893/893 full test suite passes, zero regression

Test plan

  • New test: 2-session × 3-turn delta chains, verifies correct splicing + cacheFinalSize === 0
  • All 10 rebuild-index tests pass
  • Full suite (893 tests) passes
  • Independent code review: no correctness bugs found
  • Manual verification on real 8.3GB logs directory (if available)

🤖 Generated with Claude Code

reconstructReq now caches only the messages array (needed for delta
splicing), not the full parsedBody (system 50-100KB + tools 10-50KB per
entry). After projection, entries are deleted from cache via lazy
reference counting — once an ancestor's last downstream consumer
finishes, the entry is evicted. Peak memory drops from O(total orphans ×
full body) to O(active chain depth × messages-only).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lis186 lis186 merged commit ec28baf into main Jun 18, 2026
2 checks passed