feat(commonware): upgrade to 2026.5.0 #1
Draft
erenyegit wants to merge 1 commit into
Migrate the workspace from commonware 2026.4.0 to the 2026.5.0 release.
Major API migrations addressed:
- Storage: merkle::journaled -> merkle::full; QMDB VariableConfig and Db
gain an S: Strategy generic (using Sequential)
- Runtime: tokio::Context no longer implements Clone; with_label and
clone replaced with Supervisor::child("label"); Metrics::register now
returns Registered<M>
- Reporter, Blocker, and Manager are sync and return
commonware_actor::Feedback instead of impl Future; manual Clone impls on
FinalizedReporter, SeedReporter, and CommonwareRootProvider use
context.child("xxx_clone")
- Application trait: genesis method removed; verify moved into Application
(VerifyingApplication trait removed); propose/verify now take impl
Ancestry<Self::Block> instead of AncestorStream<A, B> with BlockProvider
- simplex::Config gained floor: Floor<S, D>; marshal::Config gained
start: Start<P::Scheme, B::Digest, B>; marshal ActorInitializer init
and init_with_partition now take a start parameter
- Sender::send returns Vec<PublicKey>; .await dropped on sync API calls
- Ed25519 internals vendored: PrivateKey constructed via ReadExt::read
instead of ed25519_consensus::SigningKey::from(seed)
- bls12381::dkg types moved under feldman_desmedt submodule
Cross-referenced will's fix/commonware-resolver-upgrade branch on
Nunchi-trade/daeji as a migration reference; simplex config/engine,
marshal peers/broadcast, and transport-sim copied wholesale from there.
Validation:
- cargo build --workspace and cargo test --workspace --no-run pass
- kora-marshal integration tests pass
- just trusted-devnet boots 4 validators + 1 secondary peer healthy and
finalizes at around 3 blocks/sec on an I/O-degraded host
Known issue:
kora-e2e tests::consensus::test_empty_blocks (and any e2e test using
TestHarness::run) overflow the test thread stack post-upgrade. Likely
cause: manual Clone impls in reporters and root-provider use
context.child("xxx_clone") since released 2026.5.0 does not have
Context: Clone, and the supervision tree appears to grow over the test's
event-loop iterations until the default 8MB test stack is exhausted.
Daeji avoids this by patching commonware to main (where Context: Clone
still exists). Three follow-ups under consideration: (a) refactor the
Clone impls to Arc<Context> or similar to avoid tree growth, (b) match
daeji and apply [patch.crates-io] to main, (c) bump RUST_MIN_STACK in
test runners. Not adding #[ignore] markers per guidance to let the
failure be visible. 18 other kora-e2e tests are already marked
#[ignore] as pre-existing flaky-in-parallel.
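
The suspected growth and follow-up (a) can be illustrated with a small, self-contained model. This is only a sketch: `Context`, `ChildCloneReporter`, `SharedReporter`, and `growth_after` are hypothetical stand-ins, not commonware types. The model assumes a parent context retains every child it creates, which is what the description above suspects but has not confirmed.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a runtime context (NOT the commonware API).
/// Assumption: a parent keeps a handle to every child it creates.
struct Context {
    #[allow(dead_code)]
    label: String,
    children: Mutex<Vec<Arc<Context>>>,
}

impl Context {
    fn root(label: &str) -> Arc<Self> {
        Arc::new(Self {
            label: label.to_string(),
            children: Mutex::new(Vec::new()),
        })
    }

    /// Creates a child context and records it in the parent.
    fn child(self: &Arc<Self>, label: &str) -> Arc<Self> {
        let child = Arc::new(Self {
            label: label.to_string(),
            children: Mutex::new(Vec::new()),
        });
        self.children.lock().unwrap().push(Arc::clone(&child));
        child
    }

    /// Counts this node plus all of its descendants.
    fn size(&self) -> usize {
        1 + self
            .children
            .lock()
            .unwrap()
            .iter()
            .map(|c| c.size())
            .sum::<usize>()
    }
}

/// Mirrors the manual Clone pattern: every clone derives a new child context.
struct ChildCloneReporter {
    context: Arc<Context>,
}

impl Clone for ChildCloneReporter {
    fn clone(&self) -> Self {
        Self {
            context: self.context.child("xxx_clone"),
        }
    }
}

/// Follow-up (a): clones share one context handle, so the tree stays fixed.
#[derive(Clone)]
struct SharedReporter {
    #[allow(dead_code)]
    context: Arc<Context>,
}

/// Clones each reporter `iterations` times and returns the resulting tree
/// sizes as (child-per-clone, shared-handle).
fn growth_after(iterations: usize) -> (usize, usize) {
    let growing_root = Context::root("growing");
    let growing = ChildCloneReporter {
        context: growing_root.child("reporter"),
    };
    for _ in 0..iterations {
        // The clone is dropped immediately, but the parent still holds the child.
        let _clone = growing.clone();
    }

    let shared_root = Context::root("shared");
    let shared = SharedReporter {
        context: shared_root.child("reporter"),
    };
    for _ in 0..iterations {
        let _clone = shared.clone();
    }

    (growing_root.size(), shared_root.size())
}
```

In this model, `growth_after(100)` returns `(102, 2)`: the child-per-clone reporter adds one node per clone, while the shared-handle reporter leaves the tree at its initial two nodes. Whether the real supervision tree retains dropped children in the same way still needs to be checked against the 2026.5.0 source.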
## What this does

Migrates the kora workspace from commonware `2026.4.0` to the `2026.5.0` release.

## Why it's a draft

`kora-e2e` `tests::consensus::test_empty_blocks` (and any e2e test that uses `TestHarness::run`) overflows the test thread stack post-upgrade. The build is clean and the trusted-devnet smoke run finalizes, but this is a real regression I want to leave visible while we decide on the right fix. See the "Known issue" section below.

## Major API migrations

- Storage: `merkle::journaled` → `merkle::full`. QMDB `VariableConfig` and `Db` gain an `S: Strategy` generic (using `Sequential`).
- Runtime: `tokio::Context` no longer implements `Clone`. `with_label` and `clone` replaced with `Supervisor::child("label")`. `Metrics::register` now returns `Registered<M>`.
- `Reporter`, `Blocker`, and `Manager` are sync and return `commonware_actor::Feedback` instead of `impl Future`. Manual `Clone` impls on `FinalizedReporter`, `SeedReporter`, and `CommonwareRootProvider` use `context.child("xxx_clone")`.
- `Application` trait: `genesis` removed. `verify` moved into `Application` (`VerifyingApplication` trait removed). `propose`/`verify` take `impl Ancestry<Self::Block>` instead of `AncestorStream<A, B>` with `BlockProvider`.
- `simplex::Config` gained `floor: Floor<S, D>`; `marshal::Config` gained `start: Start<P::Scheme, B::Digest, B>`. `ActorInitializer::init` / `init_with_partition` now take a `start` parameter.
- `Sender::send` returns `Vec<PublicKey>`; `.await` dropped on sync API calls.
- Ed25519 internals vendored: `PrivateKey` constructed via `ReadExt::read` (was `ed25519_consensus::SigningKey::from(seed)`).
- `bls12381::dkg` types moved under the `feldman_desmedt` submodule.

Cross-referenced will's `fix/commonware-resolver-upgrade` branch on `Nunchi-trade/daeji` as a migration reference. `simplex` config/engine, `marshal` peers/broadcast, and `transport-sim` are copied wholesale from there.

## Validation

- `cargo build --workspace` clean
- `cargo test --workspace --no-run` compiles all test targets
- `kora-marshal` integration tests pass
- `just trusted-devnet` brings up 4 validators + 1 secondary peer, all healthy, and finalizes at around 3 blocks/sec on this (I/O-degraded) host

## Known issue: e2e stack overflow

`tests::consensus::test_empty_blocks` and any other test that goes through `TestHarness::run` overflows the test thread's 8MB stack.

Likely cause: our manual `Clone` impls on `FinalizedReporter`, `SeedReporter`, and `CommonwareRootProvider` call `context.child("xxx_clone")` because released `2026.5.0` no longer has `Context: Clone`. Over the test's event-loop iterations the supervision tree appears to grow until the default stack is exhausted.

daeji avoids this by patching commonware to `main` (where `Context: Clone` still exists). Three follow-ups under consideration:

- Refactor the `Clone` impls to use `Arc<Context>` or similar so cloning doesn't grow the supervision tree.
- Match daeji and apply `[patch.crates-io]` to point commonware at `main`.
- Bump `RUST_MIN_STACK` for the test binary.

Per request, not marking the failing test `#[ignore]` so the regression is visible. For context, 18 other `kora-e2e` tests are already `#[ignore]`'d as pre-existing flaky-in-parallel.

## Follow-ups (separate PRs)

spammoor-style).
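
As a stopgap related to the `RUST_MIN_STACK` option in the known-issue section, a test body can also be run on a dedicated thread with an explicit stack size, using the standard library's `std::thread::Builder::stack_size`. This is a different technique from setting the environment variable, and it only masks the growth rather than fixing it. The sketch below is generic: `run_with_stack` and `depth` are hypothetical helpers, and it assumes the real test body could be moved into a `Send + 'static` closure, which has not been verified for `TestHarness::run`.

```rust
use std::thread;

/// Runs `body` on a dedicated thread with `stack_bytes` of stack and returns
/// its result. Panics if the thread cannot be spawned or if `body` panics.
fn run_with_stack<T, F>(stack_bytes: usize, body: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .name("big-stack-test".into())
        .stack_size(stack_bytes)
        .spawn(body)
        .expect("failed to spawn thread")
        .join()
        .expect("test body panicked")
}

/// Deliberately recursive helper used only to exercise `run_with_stack`.
fn depth(n: u64) -> u64 {
    if n == 0 {
        0
    } else {
        1 + depth(n - 1)
    }
}
```

A call such as `run_with_stack(64 * 1024 * 1024, || depth(50_000))` runs the recursion with a 64 MiB stack. An explicit `stack_size` takes precedence over `RUST_MIN_STACK` for that thread, so the two approaches should not be combined with the expectation that the larger value wins.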