CI with non std functional test by Naviabheeman · Pull Request #417 · chaintope/tapyrus-core

Naviabheeman · 2026-06-01T17:24:38Z

Move the functional tests in debug mode to extended tests as debug mode CI does not run the tests at all. So some errors like debug lock order were missed in the previous testing of v0.7.2 changes. Non-standard scripts used in the debug mode are changed to other standard scripts.
A new CI runs every week to verify the long running tests like feature_pruning.

azuchi

Review of PR #417 — CI with non-std functional test

Verified scope: PR #417 head 575bd1a56, base master. Branched from 1f0759daa (PR #409 merge). 3 commits actually touch only 10 CI/test files (+472 / -163), but the branch base is before PR #411–414 were merged.

✅ Strategic intent — strongly endorsed

The PR description correctly identifies a real testing gap: debug-mode functional tests were never executed in CI, which is precisely why the cs_inventory → cs_main lock-order bug slipped through pre-merge testing of PR #414 and only surfaced in functional testing post-merge. This PR materially raises CI quality.

Content-level changes look sensible:

feature_block.py: replace bare OP_TRUE outputs with P2SH(OP_TRUE) so tests run under default policy — correct standardization
feature_cltv.py: drop -acceptnonstdtxn=1, adapt testmempoolaccept assertion — consistent with the policy-vs-consensus split
feature_csv_activation.py: remove BIP9-activation scaffolding and nFeatures=2 paths — correct (Tapyrus does not use BIP9 nor accept nFeatures=2)
feature_pruning.py + blocktools.py::create_tx_with_large_script: replace ~10 KB push-only non-standard scriptPubKey with OP_RETURN + null data — the existing test was silently producing tiny blocks and never exercising the prune path; the OP_RETURN form is standard, doesn't create UTXOs, and the -datacarriersize=10000 plumbing is correct
feature_nonstd_txn.py (new): two-node fixture (-acceptnonstdtxn=0 vs =1) verifying that mempool policy diverges while consensus accepts non-standard txns in blocks — exactly the right shape
heavy-functional-tests.yml (new): weekly schedule with 480-minute timeout, separate from daily matrix — good fit for feature_pruning.py (1.5–4 h, 4 GB disk)
daily-test.yml: invoke test_runner.py ... --debugscripts in the Debug build_type before early exit so debug-only tests actually run

🚨 Blocker — stale base, MUST rebase

The merge base of this PR is 1f0759daa ("Improve CI stability (#409)", merged before PR #411–414). All four security PRs are absent from this branch:

Missing on PR #417 head:
  src/softforkmanager.h
  src/issuedcolorids.{cpp,h}
  test/functional/feature_cp2sh_softfork.py
  (and all the consensus/script/wallet/RPC changes from #411–414)

This is why git diff master..pr417 --stat reports 65 files / +847 / -2414 — the deletions are stale-base artifacts, not the PR's intended changes (git diff $(git merge-base master pr417)..pr417 --stat shows the real scope: 10 files, +472 / -163).

Without a rebase, two real risks at merge time:

B1. After rebase, `feature_cp2sh_softfork.py` could silently disappear from CI

Master's test_runner.py:185 (added by PR #414) lists:

'feature_cp2sh_softfork.py',

in BASE_SCRIPTS. PR #417's test_runner.py does NOT list it (it was written before PR #414 existed). The two branches will conflict on test_runner.py. The rebase MUST preserve master's feature_cp2sh_softfork.py entry; resolving the conflict by taking PR #417's version wholesale will silently drop the entire CP2SH softfork test from CI.

Suggested resolution: keep the new BASE_SCRIPTS ordering from PR #417 AND re-insert feature_cp2sh_softfork.py in the appropriate group.

B2. Debug build_type CI run must have `CP2SH_ACTIVATION_TEST_HEIGHT=120` defined

PR #417's daily-test.yml change adds, in the Debug build branch:

if [ "${{ matrix.config.build_type }}" == "Debug" ]; then
    python3 test/functional/test_runner.py ... --debugscripts
    ...
fi

--debugscripts does test_list += DEBUG_MODE_SCRIPTS (line 346–347), so the actual list = BASE_SCRIPTS + DEBUG_MODE_SCRIPTS. After rebase, BASE_SCRIPTS will include feature_cp2sh_softfork.py, which requires the binary to be built with -DCP2SH_ACTIVATION_TEST_HEIGHT=120 (otherwise the softfork activates at the production height of 693367 and the test will hang / timeout / falsely pass).

Verify that the Debug build_type matrix entry in daily-test.yml also injects -DCP2SH_ACTIVATION_TEST_HEIGHT=120 into CMAKE_OPTS. If it currently only injects it for the native build path (master's daily-test.yml:547), this needs to be extended.

⚠️ Minor observations

O1. `p2p_compactblocks.py` moved from DEBUG_MODE_SCRIPTS to EXTENDED_SCRIPTS

The move is justified (compact blocks are standard and should run unconditionally), but EXTENDED tests do not run on every daily build. Given the recently-confirmed M2 finding on master (MarkBlockAsReceived leaks mapBlocksInFlight entries when MAX_CMPCTBLOCKS_INFLIGHT_PER_BLOCK is reached), please verify EXTENDED is run at least nightly so this regression class is exercised on a regular cadence.

O2. `heavy-functional-tests.yml` weekly cadence — confirm Sunday matches release rhythm

schedule: cron runs every Sunday. If release tagging happens Monday–Wednesday, the most recent heavy-test run will always be 0–2 days stale. Acceptable, but worth confirming.

O3. `feature_csv_activation.py` CSV-active-from-genesis claim should be cross-checked

The PR removes the BIP9-activation paths because "CSV is always active from genesis in Tapyrus." This is true (Tapyrus never used nVersion-bit signaling), but please verify the residual CSV tests still cover both pre/post activation regimes via timing rather than version-bit signaling. The diff removes a substantial body of bip113, bip68, bip112 test paths — confirm what remains still exercises CSV correctness.

Recommended actions

Required before merge:

Rebase onto current master
Resolve test_runner.py conflict by preserving master's feature_cp2sh_softfork.py entry alongside PR #417's reorganization (B1)
Verify the Debug build_type CI entry in daily-test.yml passes -DCP2SH_ACTIVATION_TEST_HEIGHT=120 to CMake (B2)
After rebase, post a fresh diff so reviewers can verify the security PRs (#411–414) are unaffected

Recommended:
5. Address O1 (compact-block test cadence)
6. Address O3 (residual CSV test coverage)

Verification: merge base computed via git merge-base master pr417 = 1f0759daa (PR #409). Real PR scope via git diff $(git merge-base master pr417)..pr417 --stat = 10 files. Conflict candidates via set intersection of git diff $(merge-base)..pr417 --name-only vs git diff $(merge-base)..master --name-only = daily-test.yml, test_runner.py.

…sts; add debug-mode test and CI integration - feature_block.py: replace bare OP_TRUE outputs with P2SH(OP_TRUE); fix txout_b3 uniqueness (value=2 vs b2.vtx[1] value=1 to ensure different malfixsha256); remove -acceptnonstdtxn=1 from extra_args; move to BASE_SCRIPTS - feature_cltv.py: remove -acceptnonstdtxn=1; adapt testmempoolaccept assertion; move to BASE_SCRIPTS - feature_csv_activation.py: remove nFeatures=2 txns (only features=1 is valid in Tapyrus); fix bip68timetxs (time txs pass at block 581); remove activation code (CSV is always active from genesis in Tapyrus) - p2p_compactblocks.py: remove -acceptnonstdtxn=1; move to EXTENDED_SCRIPTS - feature_nonstd_txn.py: new test with two nodes (standard vs -acceptnonstdtxn=1) verifying mempool policy differs while consensus accepts non-standard txns in blocks; added to DEBUG_MODE_SCRIPTS - test_runner.py: migrate all former DEBUG_MODE_SCRIPTS to BASE/EXTENDED_SCRIPTS; rebuild DEBUG_MODE_SCRIPTS with only feature_nonstd_txn.py - daily-test.yml: run BASE_SCRIPTS + DEBUG_MODE_SCRIPTS (--debugscripts) for debug builds before early exit, so -acceptnonstdtxn tests execute in debug CI Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ETURN create_tx_with_large_script previously built a ~10 KB push-only scriptPubKey that is non-standard, causing sendrawtransaction to reject the transaction without -acceptnonstdtxn. As a result, mine_large_block silently produced tiny blocks and pruning never triggered ("does not invoke the intended test path"). Fix: replace the non-standard output with an OP_RETURN carrying MAX_SCRIPT_SIZE-10 bytes of null data. OP_RETURN is a standard null-data script (no UTXO created, no UTXO-set bloat across 9000 blocks). Add -datacarriersize=10000 to all pruning-test node args so the ~10 KB OP_RETURN passes mempool standardness checks. Re-enable feature_pruning.py in EXTENDED_SCRIPTS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feature_pruning.py takes 1.5–4 hours and uses 4 GB disk — too long for the daily CI matrix. Move it from EXTENDED_SCRIPTS into a new HEAVY_SCRIPTS category and create a dedicated weekly workflow (heavy-functional-tests.yml) that runs every Sunday on Linux x86_64 RelWithDebInfo with GUI+USDT+Wallet. The workflow shares ccache and depends caches with the daily-test job so rebuilds are fast. A commented-out USDT section is scaffolded for future use. Update the missed_tests checker to cover HEAVY_SCRIPTS and USDT_SCRIPTS so no test file goes untracked. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

azuchi

Re-review after rebase (head `90eb83ba8`)

✅ Prior blockers resolved

B1: test_runner.py:191 preserves feature_cp2sh_softfork.py in BASE_SCRIPTS ✓
B2: daily-test.yml:551 injects -DCP2SH_ACTIVATION_TEST_HEIGHT=120 for all native builds; :698 runs --debugscripts in the Debug branch ✓
Rebase clean (merge base now 54a27e153, diff scope 10 files / +472 / -163)

🚨 New blocker — Smoke Test #600 fails on 4/6 jobs

Commit 2f66ee7ae rewrote test_framework/blocktools.py::create_tx_with_large_script to emit OP_RETURN [MAX_SCRIPT_SIZE - 10] (≈10 KB). MAX_OP_RETURN_RELAY defaults to 83 bytes (src/script/standard.h:42), so the tx is rejected by mempool standardness unless -datacarriersize=10000 is set.

Only one of three callers was updated:

Caller	Updated?
`test/functional/feature_pruning.py`	✅ adds `-datacarriersize=10000`
`test/functional/feature_largeblocksize.py`	❌ — causes the Smoke Test failure (`AssertionError: <txid> not found in mempool` at line 102)
`test/functional/feature_reindex.py`	❌ — masked by `--failfast`, but same vector

Fix: pass -datacarriersize=10000 to nodes in feature_largeblocksize.py::set_test_params and feature_reindex.py (same pattern as feature_pruning.py).

Failing jobs: Linux x86_64, macOS ARM64, macOS x86_64, Linux ARM64.

…largeblocksize to allow large scripts and thereafter large transactions

Naviabheeman requested a review from azuchi June 1, 2026 17:24

Naviabheeman assigned azuchi Jun 1, 2026

azuchi requested changes Jun 2, 2026

View reviewed changes

Naviabheeman and others added 3 commits June 2, 2026 22:00

Naviabheeman force-pushed the CIwithNonStdFunctionalTest branch from 575bd1a to 90eb83b Compare June 2, 2026 16:31

Naviabheeman requested a review from azuchi June 2, 2026 16:48

azuchi requested changes Jun 3, 2026

View reviewed changes

Review feedback - add datacarriersize to feature_reindex and feature_…

6a30efc

…largeblocksize to allow large scripts and thereafter large transactions

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CI with non std functional test#417

CI with non std functional test#417
Naviabheeman wants to merge 4 commits into
chaintope:masterfrom
Naviabheeman:CIwithNonStdFunctionalTest

Naviabheeman commented Jun 1, 2026

Uh oh!

azuchi left a comment

Uh oh!

azuchi left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Naviabheeman commented Jun 1, 2026

Uh oh!

azuchi left a comment

Choose a reason for hiding this comment

Review of PR #417 — CI with non-std functional test

✅ Strategic intent — strongly endorsed

🚨 Blocker — stale base, MUST rebase

B1. After rebase, feature_cp2sh_softfork.py could silently disappear from CI

B2. Debug build_type CI run must have CP2SH_ACTIVATION_TEST_HEIGHT=120 defined

⚠️ Minor observations

O1. p2p_compactblocks.py moved from DEBUG_MODE_SCRIPTS to EXTENDED_SCRIPTS

O2. heavy-functional-tests.yml weekly cadence — confirm Sunday matches release rhythm

O3. feature_csv_activation.py CSV-active-from-genesis claim should be cross-checked

Recommended actions

Uh oh!

azuchi left a comment

Choose a reason for hiding this comment

Re-review after rebase (head 90eb83ba8)

✅ Prior blockers resolved

🚨 New blocker — Smoke Test #600 fails on 4/6 jobs

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

B1. After rebase, `feature_cp2sh_softfork.py` could silently disappear from CI

B2. Debug build_type CI run must have `CP2SH_ACTIVATION_TEST_HEIGHT=120` defined

O1. `p2p_compactblocks.py` moved from DEBUG_MODE_SCRIPTS to EXTENDED_SCRIPTS

O2. `heavy-functional-tests.yml` weekly cadence — confirm Sunday matches release rhythm

O3. `feature_csv_activation.py` CSV-active-from-genesis claim should be cross-checked

Re-review after rebase (head `90eb83ba8`)