Conversation
```rust
let block_nums: Vec<i64> =
    SelectDsl::select(schema::block_headers::table, schema::block_headers::block_num)
        .filter(schema::block_headers::block_proof.is_null())
        .filter(schema::block_headers::block_num.gt(0i64))
```
Q: are there proven blocks at genesis?
Genesis block is never proven.
I had to think about this, but I think it follows from the fact that the genesis block is probably invalid: it has no transactions; we just magically make state appear in it.
```rust
// ================================================================================================

/// Overall timeout for proving a single block.
const BLOCK_PROVE_TIMEOUT: Duration = Duration::from_mins(2);
```
We should probably make this a bit more relaxed, particularly for CI runs 🫠
Bumped to 4 minutes. Leaving the comment here for others to weigh in.
```rust
            continue;
        },
    };
```
I think this is racy and we need atomicity in general. The current way of calling works. I'd appreciate a comment on why this works (we have the outer `Mutex<()>` to protect both).
I have updated the logic to construct the notify before the db query. See the comment above that line. LMK if my understanding is incorrect.
```rust
        };
        futures.push_back(fut);
    }
    futures
```
`FuturesOrdered` doesn't have a concurrency limit; it only guarantees that results are yielded in input order (FIFO). So https://docs.rs/futures/latest/futures/stream/trait.StreamExt.html#method.buffered might be a strictly better choice, to limit the optimistic proving.
I went with adding a batch limit for the db query instead
drahnr left a comment:
Generally I like the approach / LGTM; a few edges to be smoothed out.
```md
- Introduce `SyncChainMmr` RPC endpoint to sync chain MMR deltas within specified block ranges ([#1591](https://github.com/0xMiden/node/issues/1591)).
- Fixed `TransactionHeader` serialization for row insertion on database & fixed transaction cursor on retrievals ([#1701](https://github.com/0xMiden/node/issues/1701)).
- Added KMS signing support in validator ([#1677](https://github.com/0xMiden/node/pull/1677)).
- Restructured block proving to be asynchronous and added finality field for `SyncChainMmr` requests ([#1725](https://github.com/0xMiden/miden-node/pull/1725)).
```
Suggested change:

```diff
- - Restructured block proving to be asynchronous and added finality field for `SyncChainMmr` requests ([#1725](https://github.com/0xMiden/miden-node/pull/1725)).
+ - Added finality field for `SyncChainMmr` requests ([#1725](https://github.com/0xMiden/miden-node/pull/1725)).
```
```proto
// Return data up to the latest committed block (default).
FINALITY_COMMITTED = 0;
// Return data only up to the latest proven block.
FINALITY_PROVEN = 1;
```
Is the prefix a protobuf thing? (I think so).
https://protobuf.dev/best-practices/dos-donts/#unspecified-enum:
Suggested change:

```diff
-// Return data up to the latest committed block (default).
-FINALITY_COMMITTED = 0;
-// Return data only up to the latest proven block.
-FINALITY_PROVEN = 1;
+FINALITY_UNSPECIFIED = 0;
+// Return data up to the latest committed block (default).
+FINALITY_COMMITTED = 1;
+// Return data only up to the latest proven block.
+FINALITY_PROVEN = 2;
```
```proto
// The finality level to use when clamping the upper bound of the block range.
//
// When set to `FINALITY_COMMITTED` (default), the upper bound is clamped to the chain tip.
```
You shouldn't have a default value here iiuc.
```rust
    NoProvenBlocks,
    #[error("database error")]
    #[grpc(internal)]
    DatabaseError(#[from] DatabaseError),
```
Suggested change:

```diff
-    DatabaseError(#[from] DatabaseError),
+    DatabaseError(#[source] DatabaseError),
```
```rust
/// Returns the database.
pub(crate) fn db(&self) -> Arc<Db> {
    self.db.clone()
```
Suggested change:

```diff
-    self.db.clone()
+    Arc::clone(&self.db)
```
```rust
    db: Arc<Db>,
    block_prover: Arc<BlockProver>,
    block_store: Arc<BlockStore>,
) -> (ProofSchedulerHandle, JoinHandle<Result<(), ProofSchedulerError>>) {
```
I find the name a bit confusing since we have two handles. But maybe I just need to shift my perspective a bit.
```rust
// Query all unproven blocks. This handles both startup recovery and new blocks.
let unproven_blocks = match db.select_unproven_blocks(MAX_PROVING_BATCH_SIZE).await {
    Ok(blocks) => blocks,
    Err(err) => {
        error!(target: COMPONENT, %err, "Failed to query unproven blocks, retrying");
        tokio::time::sleep(Duration::from_secs(1)).await;
        continue;
    },
};
```
As mentioned elsewhere, I don't think you actually need this.
You only need to track `chain_tip` and `latest_proven_block` -- which you can do essentially for free after starting up if you replace `Notify` with a `watch::Receiver<BlockNumber>`.
```rust
// Construct proving jobs and drain results in order.
// On any failure we break immediately — dropping remaining futures cancels them.
// The outer loop will re-query unproven blocks and restart the sequence, ensuring
// we never persist a proof while an ancestor block is still unproven.
let mut proving_futures = order_proving_jobs(&db, &block_prover, &unproven_blocks);
while let Some(timeout_result) = proving_futures.next().await {
```
I don't think this works as expected: it essentially means we submit a set of blocks for proving, and then wait until all blocks in the set are proven.
What we want is to constantly have N blocks being proven concurrently, when one completes, another should be queued.
```rust
Err(ProveBlockError::Transient(err)) => {
    error!(
        target: COMPONENT,
        %err,
        "Block proving failed, abandoning batch and retrying next iteration"
    );
    break;
},
```
This means a single error in the batch restarts every block proof in the batch from scratch?
```rust
    block_prover: Arc<BlockProver>,
    block_store: Arc<BlockStore>,
    notify: Arc<Notify>,
) -> Result<(), ProofSchedulerError> {
```
I would propose an alternative implementation here, which I think covers my other comments as well.
imo, what we want is:

- N ongoing proofs
- blocks in the database must only be marked as proven in sequence

What I suggest:

- move as much as possible into a single proof future
  - it auto-retries unless fatal or the overall timeout is exceeded
  - it saves the proof to file and essentially does everything except update the database
- central loop
  - tracks `chain_tip` by replacing `Notify` with `watch::Receiver<BlockNumber>`
  - tracks the latest proven block number by getting it as input from outside, or reading it once from the database outside the loop
  - tracks proof jobs in a `tokio::JoinSet`; the `JoinSet` capacity is limited to N, and each job is mapped to its `BlockNumber`
  - upon task completion, stores the result locally as `BlockNumber: Result<(), Error>`
  - adds a new proof job (if any)
  - if block number = last proven + 1, then
    - if `Ok(())` -> update the database and repeat until a sequence gap is hit
    - if `Err` then crash out
Context
We are adding deferred (asynchronous) block proving for the node, as described in #1592. Currently, block proving happens synchronously during `apply_block`, which means block commitment is blocked until the proof is generated.

Blocks will now exhibit committed (not yet proven) and proven states. A committed block is already part of the canonical chain and fully usable. Clients that require proof-level finality can opt into it via the new `finality` parameter on `SyncChainMmr`.

Changes

- Added `proving_inputs BLOB` and `is_proven BOOLEAN` columns to the `block_headers` table, with partial indexes for querying unproven (`is_proven = 0`) and proven (`is_proven = 1`) blocks.
- Proofs are stored in the `BlockStore` (following the existing block file pattern) rather than as BLOBs in SQLite. The DB tracks only an `is_proven` flag.
- New DB queries: `mark_block_proven`, `select_unproven_blocks` (excludes genesis block 0), `select_block_proving_inputs` (returns a deserialized `BlockProofRequest`), and `select_latest_proven_block_num`.
- `apply_block`: the `BlockProofRequest` is now serialized and persisted alongside the block during `apply_block`.
- New proof scheduler (`proof_scheduler.rs`) that drives deferred proving. It queries unproven blocks on startup (restart recovery), listens for new block commits via `Notify`, and proves blocks concurrently using `FuturesOrdered` for FIFO completion ordering. Proofs are saved to files, then the block is marked proven in the DB.
- `SyncChainMmr`: added a `Finality` enum (`COMMITTED`, `PROVEN`) to the protobuf and a `finality` field on `SyncChainMmrRequest`.
- `apply_block` query: introduced an `ApplyBlockData` struct to replace the 7-parameter function signature.