fix(tui): cap live step trace to prevent OOM on long sprint runs #104
Merged
- Cap `steps` state in execute-view at 200 entries (FIFO eviction).
- Slice StepTrace render to last 50 with an elision row for the rest.
- Add regression tests covering both the cap and the under-cap path.

Long `sprint start` runs (200+ tasks × ~10 leaves each) grew the live step list into the thousands. Combined with 80–120 ms spinner heartbeats, Ink's per-render `[...childNodes].reverse()` allocated a fresh array per tick and OOMed Node after ~15h.
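The cap-and-slice idea in this commit message can be sketched in plain TypeScript, without React or Ink. This is a minimal illustration, not the code from the PR: the names `appendCapped`, `visibleWindow`, `MAX_STEPS`, and `MAX_RENDERED` are hypothetical, and only the numbers 200 and 50 come from the description.

```typescript
// Hypothetical constants mirroring the numbers in the PR description.
const MAX_STEPS = 200; // cap on retained live steps
const MAX_RENDERED = 50; // rows actually rendered by the trace

// Append one entry and evict the oldest ones (FIFO) once the cap is exceeded.
// Returns a new array, so it can be used inside a state updater.
function appendCapped<T>(items: readonly T[], next: T, cap: number = MAX_STEPS): T[] {
  const grown = [...items, next];
  return grown.length > cap ? grown.slice(grown.length - cap) : grown;
}

// Split a list into the tail to render and a count of elided older entries.
// The count is what an elision row ("... N earlier steps") would display.
function visibleWindow<T>(
  items: readonly T[],
  limit: number = MAX_RENDERED,
): { hidden: number; visible: T[] } {
  const hidden = Math.max(0, items.length - limit);
  return { hidden, visible: items.slice(hidden) };
}

// Simulate a long run: 5000 step events arriving one at a time.
let steps: string[] = [];
for (let i = 0; i < 5000; i++) {
  steps = appendCapped(steps, `step-${i}`);
}
const { hidden, visible } = visibleWindow(steps);
console.log(steps.length, hidden, visible.length, visible[visible.length - 1]);
// prints: 200 150 50 step-4999
```

In a component, the first helper would presumably sit inside a functional state update such as `setSteps(prev => appendCapped(prev, event))`, so the renderer never sees more than the capped list regardless of how long the run lasts.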
Summary
- Cap `steps` state in `execute-view.tsx` at 200 entries (FIFO eviction).
- Slice `StepTrace` render to last 50 with an elision row above for the rest.

Why
After ~15h of `ralphctl sprint start`, Node's V8 hit `Ineffective mark-compacts near heap limit Allocation failed` and aborted at ~4 GB. The native stack trace's smoking gun was `Builtins_ArrayPrototypeReverse` fired from `uv__run_check` → `Environment::CheckImmediate`: the only `.reverse()` reachable from React's commit phase is Ink's `[...node.childNodes].reverse()` in `render-node-to-output.js:42`.

Tracing the React tree pointed at one consumer: `execute-view.tsx`'s `steps` state subscriber appended every `step` event without a cap, and `StepTrace` rendered `steps.map(...)` without a slice. A 200-task sprint × ~10 inner leaves per task yielded thousands of entries; combined with 80–120 ms spinner heartbeats in `spinner.tsx` / `header-heartbeat.tsx` / `task-execution-list.tsx` / `step-trace.tsx`, Ink reconciled and re-`reverse()`-ed the giant child list every tick.

Capping at the consumer (not in `kernel/`) preserves the canonical chain trace asserted by every `<flow>-flow.test.ts` step-order test: the leak is renderer allocation churn, not data retention.

Test plan
- `pnpm typecheck`
- `pnpm lint`
- `pnpm test`: 232 files / 2240 tests pass
- `step-trace.test.tsx`: 5000-entry cap and 10-entry under-cap path
- `pnpm dev sprint start` against a long sprint, watch RSS plateau instead of growing
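
The last check is a manual one. As a rough aid, a plateau can be judged from periodic RSS samples by comparing the mean of the most recent window with the window before it. The sketch below is only an illustration under that assumption: `hasPlateaued`, its window size, and its tolerance are hypothetical and not part of this PR.

```typescript
// Hypothetical helper: decides whether a series of RSS samples (in bytes) has
// levelled off, by comparing the mean of the latest window of samples with
// the mean of the window immediately before it.
function hasPlateaued(
  samples: readonly number[],
  window: number = 5,
  tolerance: number = 0.02, // allow 2% growth between windows
): boolean {
  if (samples.length < window * 2) return false; // not enough data yet
  const mean = (xs: readonly number[]): number =>
    xs.reduce((a, b) => a + b, 0) / xs.length;
  const recent = mean(samples.slice(-window));
  const previous = mean(samples.slice(-2 * window, -window));
  return recent <= previous * (1 + tolerance);
}

// A steadily growing series is not a plateau; a flat one is.
const growing = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000];
const flat = [500, 502, 499, 501, 500, 500, 501, 499, 500, 502];
console.log(hasPlateaued(growing), hasPlateaued(flat));
// prints: false true
```

In Node, the samples could be collected by polling `process.memoryUsage().rss` on a timer while the sprint runs; the window and tolerance would need tuning to the sampling interval.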