feat(moho): derive moho state in a dedicated subscriber worker #129
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f7dd4e4e44
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f75ef6c264
evgenyzdanovich
left a comment
That's great to see, LGTM.
Let's rebase the #128 and progress on this one (resolving the rest of codex findings), so it can be merged soon
Materializes per-block MohoState from the ASM worker's commit stream as a deterministic forward-only fold: for each committed block it reads the anchor state and logs the ASM worker already persisted and derives the Moho state from them, persisting through a caller-supplied context. The context splits reads from writes — AsmStateProvider fetches the anchor state and its logs (committed atomically per block, hence the shared MissingAsmState miss), MohoStateStore loads and persists the derived state — mirroring how strata-asm-worker takes a WorkerContext.
The fold assumed each commit was the immediate height-successor of the last one processed and chained off an in-memory cur_moho. That breaks under an L1 reorg: a commit building on an earlier fork point isn't the height-successor of the previously folded block, so it was wrongly dropped as stale or rejected as a gap. Instead resolve the commit's actual parent (new L1ProviderContext) and fold from the parent's committed Moho state (MohoStateStore::get_moho_state), so the worker chains along real ancestry across reorgs. NonContiguousBlock gives way to MissingMohoState (the residual missing-parent gap) and MissingParentBlock.
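A minimal sketch of the parent-resolving fold described in this commit, under loud assumptions: `BlockId` as `u32`, a toy `MohoState` with a single `acc` field, a `delta` standing in for whatever the anchor state and logs contribute, and an in-memory `Ctx` replacing `L1ProviderContext` and `MohoStateStore`. None of these shapes come from the PR; only the error names `MissingParentBlock` and `MissingMohoState` do.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins: the real ids and state types live in the PR's crates.
type BlockId = u32;

#[derive(Clone, Debug, PartialEq)]
struct MohoState {
    // Toy accumulator standing in for the real derived state.
    acc: u64,
}

#[derive(Debug, PartialEq)]
enum FoldError {
    // No parent could be resolved for the committed block.
    MissingParentBlock(BlockId),
    // The parent is known, but its Moho state was never committed.
    MissingMohoState(BlockId),
}

// In-memory context (assumed): `parents` plays the L1 provider,
// `moho` plays the Moho state store.
struct Ctx {
    parents: HashMap<BlockId, BlockId>,
    moho: HashMap<BlockId, MohoState>,
}

impl Ctx {
    // Fold one commit onto its resolved parent's committed state.
    fn fold_commit(&mut self, block: BlockId, delta: u64) -> Result<MohoState, FoldError> {
        let parent = *self
            .parents
            .get(&block)
            .ok_or(FoldError::MissingParentBlock(block))?;
        let base = self
            .moho
            .get(&parent)
            .cloned()
            .ok_or(FoldError::MissingMohoState(parent))?;
        let next = MohoState { acc: base.acc + delta };
        self.moho.insert(block, next.clone());
        Ok(next)
    }
}
```

In this toy model, a commit that forks off an earlier block folds from that block's state instead of being dropped as stale, because nothing in `fold_commit` compares heights.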
…o the service
Mirror strata-asm-worker's shape: keep the current MohoState in the service state and move the fold orchestration into a process_block free function in the service layer, leaving the state as data plus a small update_moho_state mutation. The in-memory state is load-bearing again: the common in-order commit folds straight from it with no store read, while reorg-safety is kept by re-anchoring from the parent's committed state when the incoming commit doesn't build on the block held in memory. Surface the current MohoState in the worker status (via moho-types' serde feature), matching AsmWorkerStatus, and drop the launch-relative processed counter, which reset on restart and added nothing over cur_block.
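The fast path and the re-anchor path might look roughly like the sketch below. Only the names `process_block`, `update_moho_state`, `cur_block`, and `cur_moho` come from the commit messages; the signature, the `HashMap` store, the `delta` input, and the boolean "used the in-memory state" return value are assumptions made for illustration.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real id and state types.
type BlockId = u32;

#[derive(Clone, Debug, PartialEq)]
struct MohoState {
    acc: u64,
}

// Service state as plain data plus one small mutation.
struct WorkerState {
    cur_block: BlockId,
    cur_moho: MohoState,
}

impl WorkerState {
    fn update_moho_state(&mut self, block: BlockId, state: MohoState) {
        self.cur_block = block;
        self.cur_moho = state;
    }
}

// Returns Some(true) if the fold started from the in-memory state,
// Some(false) if it re-anchored from the store, and None if the
// parent's state is not in the store (assumed error shape).
fn process_block(
    state: &mut WorkerState,
    store: &mut HashMap<BlockId, MohoState>,
    block: BlockId,
    parent: BlockId,
    delta: u64,
) -> Option<bool> {
    let (base, fast) = if parent == state.cur_block {
        // Common in-order commit: no store read.
        (state.cur_moho.clone(), true)
    } else {
        // Reorg: the commit does not build on the block held in memory.
        (store.get(&parent)?.clone(), false)
    };
    let next = MohoState { acc: base.acc + delta };
    store.insert(block, next.clone());
    state.update_moho_state(block, next);
    Some(fast)
}
```

Keeping the orchestration in a free function leaves `WorkerState` trivially testable: on the `None` path nothing is written, so the in-memory state still points at the last successfully folded block.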
Per-block MohoState persistence lived in strata-asm-proof-db next to proof artifacts, conflating materialized state with proofs. Give it its own home so the Moho worker can own its store independently of the proof DB. Adds get_latest so the worker can resume from its latest committed state across restarts.
The ASM worker materialized MohoState inline on every anchor-state write (the AsmWorkerContext piggyback), tying Moho to the ASM worker's hot path. Spin the subscription-driven MohoWorker off onto its own service task instead: it folds each ASM commit onto its resolved parent and persists the derived state, following L1 reorgs without sitting in the ASM worker's write path. Persistence moves to the dedicated strata-asm-moho-storage store; the RPC server and proof orchestrator read from it unchanged.
…orker
MohoState persistence moved to the dedicated strata-asm-moho-storage crate, so strata-asm-proof-db no longer needs its own copy. Remove the MohoStateDb trait and SledMohoStateDb (the moho-proof code stays) along with the deps they alone pulled in. Both stores key the same "moho_states" tree with identical encoding, so existing databases carry over unchanged.
The export-entries index mirrors the leaves of each MohoState's ExportState MMR: both derive from the same NewExportEntry logs and must advance from the same iteration, or the per-container indices drift from the MMR that commits to them and inclusion proofs break. MohoState derivation moved to the Moho worker, but the index was still written by AsmWorkerContext on the ASM worker — splitting one unit across two tasks. Move ExportEntriesDb from asm-storage into strata-asm-moho-storage and the indexing into the worker's per-block fold via a new ExportEntryStore concern, written before the Moho-state commit point so a reprocessed block re-appends idempotently. AsmWorkerContext::store_anchor_state now just writes the anchor.
The export-entries store landed as a single sled-specific file, unlike the moho-state store which separates a backend-agnostic trait from its sled implementation. Mirror that layout: lift an async `ExportEntriesDb` trait to the top level and move the concrete store (renamed `SledExportEntriesDb`) under `sled/`. The sled type keeps its synchronous inherent methods for the worker/RPC call sites and implements the async trait for symmetry with `MohoStateDb`. Consumers in asm-runner are renamed to the concrete type.
A block can emit multiple ExportEntry leaves, so the append path can't reuse the manifest MMR's reorg strategy. Record the follow-up (STR-3723) at the append site rather than leaving the gap undocumented.
`Self` does not resolve in a module-level doc comment, so the rustdoc intra-doc link failed under `-D warnings` and broke the Generate docs CI job. Reference the concrete `MohoWorkerContextImpl` type instead.
The ASM worker's plan_block_processing hand-rolls a backward walk from a target tip to the most recent ancestor with stored state, collecting the blocks in between. The Moho worker needs the identical walk for its startup catch-up — only the forward step (run STF vs fold MohoState) differs — so the search belongs in one tested place rather than copied. Move it to a generic plan_sync over closures for the base lookup and parent resolution, exported from strata-asm-worker, and route plan_block_processing through it. The base-lookup closure also stops swallowing every storage error as "not a base": only MissingAsmState means keep walking, so a real DbError now propagates instead of being mistaken for an unprocessed block.
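As a rough illustration of the shared walk, here is one way a closure-based `plan_sync` could be shaped. The signature, the `(base, pending)` return value, the `StoreError` enum, and the `is_base_from` helper are assumptions for this sketch, not the crate's actual API; the point is only that `MissingAsmState` means "keep walking" while any other error propagates.

```rust
// Assumed error shape: only the variant names echo the commit message.
#[derive(Debug, PartialEq)]
enum StoreError {
    MissingAsmState,
    DbError(String),
}

// Walk back from `tip` until `is_base` reports stored state.
// Returns the base plus the blocks to process, oldest first.
fn plan_sync<B, E>(
    tip: B,
    mut is_base: impl FnMut(&B) -> Result<bool, E>,
    mut parent_of: impl FnMut(&B) -> Result<B, E>,
) -> Result<(B, Vec<B>), E> {
    let mut pending = Vec::new();
    let mut cur = tip;
    while !is_base(&cur)? {
        let parent = parent_of(&cur)?;
        pending.push(cur);
        cur = parent;
    }
    pending.reverse();
    Ok((cur, pending))
}

// Turn a storage lookup into a base check without swallowing real errors.
fn is_base_from(lookup: Result<(), StoreError>) -> Result<bool, StoreError> {
    match lookup {
        Ok(()) => Ok(true),
        // The block simply has no stored state yet: not a base.
        Err(StoreError::MissingAsmState) => Ok(false),
        // Anything else is a genuine failure and must surface.
        Err(e) => Err(e),
    }
}
```

Because the forward step is not part of `plan_sync`, the same plan can feed either worker: one runs the STF over `pending`, the other folds `MohoState`.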
The ASM worker commits a block's anchor state before the Moho worker folds it, and the commit subscription has no replay. So a crash in that window — or any downtime while the ASM worker keeps committing — leaves the Moho store trailing the ASM store, and the next live commit folds onto a parent whose Moho state was never derived, wedging the worker on MissingMohoState. Add sync_to_tip, run once before the subscription is consumed: it walks real parents back from the ASM tip (get_latest_asm_block) to the last folded block and folds the gap forward, so the steady-state handler can keep assuming its parent is present. Routed through strata-asm-worker's plan_sync, with genesis as the floor. Walking real parents rather than heights keeps it correct across an L1 reorg during downtime.
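The catch-up can be pictured with the toy model below. Everything except the name `sync_to_tip` is hypothetical: `AsmRecord`, the `u64` stand-in for `MohoState`, the `HashMap` stores, and the returned count of folded blocks are illustration only, and the real implementation routes the walk through `plan_sync` rather than inlining it.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real ids, ASM records, and Moho state.
type BlockId = u32;

struct AsmRecord {
    parent: BlockId, // real parent, not height - 1
    delta: u64,      // whatever the anchor state and logs contribute
}

// Fold every ASM-committed block between the last folded block and the
// ASM tip. Returns how many blocks were folded, or None if the walk
// runs past the ASM history without meeting a folded block.
fn sync_to_tip(
    asm: &HashMap<BlockId, AsmRecord>,
    moho: &mut HashMap<BlockId, u64>,
    asm_tip: BlockId,
) -> Option<usize> {
    // Walk real parents back from the tip to the last folded block.
    let mut gap = Vec::new();
    let mut cur = asm_tip;
    while !moho.contains_key(&cur) {
        let rec = asm.get(&cur)?;
        gap.push(cur);
        cur = rec.parent;
    }
    // Fold the gap forward, oldest block first.
    for block in gap.iter().rev() {
        let rec = &asm[block];
        let base = moho[&rec.parent];
        moho.insert(*block, base + rec.delta);
    }
    Some(gap.len())
}
```

Running it a second time folds nothing, which is the property the steady-state handler relies on: once the gap is closed, every incoming commit finds its parent's state present.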
Stacked on #128 (the per-block subscription stream) — that PR is the base and should be reviewed/merged first. The diff here is against the subscription branch.
Description
Moho-state derivation lived inside `AsmWorkerContext::store_anchor_state`, on the ASM worker's hot path, and reported its failures as `WorkerError::DbError`, indistinguishable from a real anchor-state failure (STR-3124). This moves it into a dedicated `strata-asm-moho-worker` crate: a subscription-driven service that subscribes to the ASM worker's per-block commit stream, folds each commit onto its resolved parent's `MohoState`, and persists it, on its own service task, with its own error type.
It follows L1 reorgs by resolving each commit's real parent rather than assuming the commit is the height-successor of the last one processed, then chaining the parent's `MohoState` forward. `AnchorState` does not carry the parent id, so the runner's context resolves it from the parent block header over the Bitcoin client.
The runner spins the worker off onto its own service task, only when the orchestrator is configured (it needs the ASM predicate and the moho-state store). It subscribes before the block watcher starts, because the subscription has no replay, and seeds the genesis `MohoState` from the ASM genesis anchor the worker already committed at launch.
Persistence moves out of `strata-asm-proof-db` into a dedicated `strata-asm-moho-storage` crate; the RPC server and the proof orchestrator read from it unchanged (both stores key the same `moho_states` tree with identical encoding, so existing databases carry over). Adds `SledMohoStateDb::get_latest()`, the worker's durable startup progress marker.
`AsmWorkerContext` no longer derives Moho state. `store_anchor_state` still indexes the `NewExportEntry` logs into the export-entries store; that index is unchanged and stays on the ASM worker, and only the Moho-state computation moved out.
Type of Change
Notes to Reviewers
Derived Moho state is now eventually-consistent: it is advanced on the moho worker's task rather than committed synchronously with the anchor. The RPC's inclusion-proof path already depends on moho-worker progress (it reads `moho_state_db`) and returns `None` for a block whose Moho state has not yet been folded.
Known limitation: if the process crashes between the anchor commit for block N and the moho fold for N, on restart the worker is at N-1 while the next emit is N+1, a gap it currently treats as fail-fast. Self-healing needs height-indexed backfill from `AsmStateDb`, captured as `TODO(STR-3124)`.
Checklist
Related Issues
Jira STR-3698. Stacked on #128.