feat(e2e): add benchmark suite for mempool and state db (#149)
Conversation
> Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be changed in the review settings.
Walkthrough: Adds a reproducible end-to-end benchmarking suite: a CosmWasm benchmark contract, a multi-node cluster harness, multiple load generators (including pre-signed flows), result collection/reporting utilities, and many benchmark tests comparing mempool, state DB, queued behavior, gossip, and Wasm execution scenarios.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Test as Test Runner
    participant Cluster as Cluster Harness
    participant Nodes as Node(s)
    participant Load as Load Generator
    participant Collector as Collector
    participant Reporter as Reporter
    Test->>Cluster: NewCluster() / Start()
    Cluster->>Nodes: initialize & start nodes
    Test->>Load: start load (Burst/Sequential/PreSigned)
    Load->>Nodes: submit txs (BankSend / WasmExec / Broadcast)
    Nodes->>Nodes: mempool accept & gossip
    Load-->>Collector: record submissions & timestamps
    Test->>Cluster: WaitForAllIncluded()
    Collector->>Nodes: query blocks per height
    Collector-->>Reporter: BenchResult (metrics, peak)
    Test->>Reporter: WriteResult / PrintComparisonTable
```

```mermaid
sequenceDiagram
    participant Test as Test Runner
    participant Cluster as Cluster Harness
    participant Chain as Blockchain
    participant Contract as bench_heavy_state
    Test->>Cluster: StoreWasmContract(wasm)
    Cluster->>Chain: submit store tx
    Chain-->>Cluster: return CodeID
    Test->>Cluster: InstantiateWasmContract(CodeID)
    Chain-->>Cluster: return ContractAddr
    Test->>Cluster: ExecuteWasmContract(contractAddr, WriteMixed)
    Contract->>Contract: load nonce → inc → write SHARED_STATE & LOCAL_STATE
    Contract-->>Chain: execution response (events)
    Chain->>Collector: include tx in block (queried later)
```
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 3
🧹 Nitpick comments (5)
integration-tests/e2e/benchmark/benchmark_test.go (2)
96-104: Potential double-close of cluster.
`setupCluster` registers `t.Cleanup(cl.Close)` at line 48, but `runBenchmark` also calls `defer cl.Close()` at line 101. Since `Cluster.Close()` has internal guarding (the `c.closed` flag), this won't cause a crash, but the explicit defer is redundant.

♻️ Remove redundant defer

```diff
 func runBenchmark(t *testing.T, cfg BenchConfig, loadFn func(ctx context.Context, cl *cluster.Cluster, cfg BenchConfig, metas map[string]cluster.AccountMeta) LoadResult) BenchResult {
 	t.Helper()
 	ctx := context.Background()
 	cl := setupCluster(t, ctx, cfg)
-	defer cl.Close()
 	return runBenchmarkWithCluster(t, ctx, cl, cfg, loadFn)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/benchmark_test.go` around lines 96 - 104, The runBenchmark function redundantly defers cl.Close() even though setupCluster already registers t.Cleanup(cl.Close); remove the explicit defer cl.Close() from runBenchmark so cleanup is only handled via setupCluster's t.Cleanup registration (referencing runBenchmark and setupCluster and Cluster.Close to locate the code).
184-186: Consider replacing `time.Sleep` with deterministic waits.

The 3-second sleeps after `WaitForMempoolEmpty` are intended to ensure the tx is fully committed. This can be flaky in slow CI environments. Consider polling `QueryTxResult` until success, or waiting for a specific block height advancement.

Also applies to: 199-201
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/benchmark_test.go` around lines 184 - 186, Replace the ad-hoc time.Sleep(3 * time.Second) after cl.WaitForMempoolEmpty with a deterministic poll that waits until the transaction is committed: call cl.QueryTxResult (or the client method that queries tx by hash) in a loop with a short backoff and timeout (e.g., check every 200–500ms up to a few seconds) and break when the query returns success/confirmed; do this for the occurrence after WaitForMempoolEmpty and the same pattern at the other location (around lines 199–201) so the test waits for concrete tx confirmation or block height advancement instead of a fixed sleep.

integration-tests/e2e/benchmark/README.md (1)
264-269: Add language identifiers to fenced code blocks.

Static analysis flagged these code blocks as missing language specifiers. For the table example, use `text` or `plaintext`; for the directory structure, use `text` as well.

🔧 Suggested fix

Line 264: change the opening fence before the `Config | Variant | TPS | vs base | P50ms | vs base | P95ms | vs base | Peak Mempool` table from `` ``` `` to `` ```text ``.

Line 301: change the opening fence before the `benchmark/` directory listing (`config.go` — Variant definitions, BenchConfig, preset constructors) from `` ``` `` to `` ```text ``.

Also applies to: 301-310
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 264 - 269, Update the fenced code blocks in the README so they include a language identifier (use "text" or "plaintext"); specifically change the triple-backtick markers before the performance table line starting with "Config | Variant" to ```text and likewise change the triple-backtick before the directory/listing block that begins with "benchmark/" to ```text so static analysis no longer flags missing language specifiers.

integration-tests/e2e/cluster/cluster.go (1)
318-321: Consider adding timeout to HTTP client for reliability.
`http.DefaultClient` has no timeout by default. While acceptable for local E2E tests, a hung RPC endpoint could block indefinitely. Consider using a client with a timeout.

♻️ Optional: Add timeout

```go
var httpClient = &http.Client{
	Timeout: 30 * time.Second,
}
```

Then replace `http.DefaultClient.Do(req)` with `httpClient.Do(req)` throughout.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 318 - 321, The code uses http.DefaultClient.Do(req) which has no timeout and can hang; create a package-level or file-level httpClient variable (e.g., httpClient := &http.Client{Timeout: 30 * time.Second}) and replace calls to http.DefaultClient.Do(req) with httpClient.Do(req) (update any functions that call DefaultClient, e.g., the function containing the resp, err := http.DefaultClient.Do(req) line) so HTTP requests fail fast instead of blocking indefinitely.

integration-tests/e2e/benchmark/report.go (1)
138-153: Consider logging or returning errors from `LoadBaselineResultsByLabel`.

Silently returning `nil` when `LoadResults` fails (lines 141-143) makes it indistinguishable from "no matching baselines found." Callers may want to know if the directory couldn't be read vs. simply being empty.

♻️ Optional: Return error or log

```diff
-func LoadBaselineResultsByLabel(dir, label string) []BenchResult {
+func LoadBaselineResultsByLabel(dir, label string) ([]BenchResult, error) {
 	results, err := LoadResults(dir)
 	if err != nil {
-		return nil
+		return nil, err
 	}
 	var matched []BenchResult
 	for _, r := range results {
 		if r.Config.Variant == VariantBaseline && r.Config.Label == label {
 			matched = append(matched, r)
 		}
 	}
-	return matched
+	return matched, nil
 }
```

Alternatively, keep the current signature but log the error:

```diff
 func LoadBaselineResultsByLabel(dir, label string) []BenchResult {
 	results, err := LoadResults(dir)
 	if err != nil {
+		// Log to help diagnose missing baselines vs. read failures
+		fmt.Fprintf(os.Stderr, "warning: failed to load results from %s: %v\n", dir, err)
 		return nil
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/report.go` around lines 138 - 153, LoadBaselineResultsByLabel currently swallows errors from LoadResults by returning nil, making failures indistinguishable from “no matches”; change its signature to return ([]BenchResult, error) so you can propagate the error from LoadResults (i.e., return nil, err when LoadResults fails) and update all callers to handle the error, or if you must keep the signature instead log the LoadResults error before returning an empty slice; locate the function LoadBaselineResultsByLabel and the call to LoadResults to implement the chosen fix and adjust any caller code accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/e2e/benchmark/config.go`:
- Line 28: The JSON tag for the struct field TimeoutCommit currently declares
milliseconds but time.Duration (TimeoutCommit) is marshaled as nanoseconds,
causing unit mismatch; fix by making the JSON tag reflect nanoseconds (e.g.,
change `json:"timeout_commit_ms,omitempty"` to
`json:"timeout_commit_ns,omitempty"`) or alternatively implement custom
MarshalJSON/UnmarshalJSON on the type that converts between milliseconds and
time.Duration so the external JSON unit matches the tag; locate the
TimeoutCommit field and update the tag or add custom (Un)MarshalJSON methods for
the type to perform the proper unit conversion.
In `@integration-tests/e2e/cluster/ports.go`:
- Around line 23-30: The computed start port (start := basePort + (index *
stride)) can yield values where start+3 > 65535, producing invalid ports; add an
upper-bound check after computing start in the function that builds and returns
NodePorts (using variables basePort, index, stride, start and the NodePorts
return) and if start < 0 or start+3 > 65535 return a clear error instead of
constructing NodePorts; ensure the function returns that error so callers fail
early rather than attempting to bind invalid ports.
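The bounds check described above can be sketched as follows; the function and variable names mirror the review's wording (`basePort`, `index`, `stride`, `start`) but are assumptions, since the real `ports.go` is not shown here:

```go
package main

import "fmt"

// nodePortStart computes the first of four consecutive ports assigned to a
// node and fails early when any of them would fall outside the valid TCP
// port range (1-65535), instead of letting a later bind fail obscurely.
func nodePortStart(basePort, index, stride int) (int, error) {
	start := basePort + index*stride
	if start < 1 || start+3 > 65535 {
		return 0, fmt.Errorf("node %d: ports %d-%d outside valid range 1-65535", index, start, start+3)
	}
	return start, nil
}
```

Callers would propagate the error so cluster construction aborts with a clear message rather than attempting to bind an invalid port.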
In `@integration-tests/go.mod`:
- Around line 155-157: Update the pinned grpc module version from
google.golang.org/grpc v1.65.0 to the patched release v1.79.3 in go.mod to
address GHSA-p77j-4mvh-x3m3; after updating the version string for the module
dependency, run dependency cleanup (e.g., go mod tidy) and run integration tests
to verify compatibility with the cosmos/relayer code paths that previously
required v1.65.0.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 48b9cf25-2b87-43c2-92e8-d29a5e83d40e
⛔ Files ignored due to path filters (2)
- `integration-tests/e2e/benchmark/bench_heavy_state/Cargo.lock` is excluded by `!**/*.lock`
- `integration-tests/go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (15)
- integration-tests/e2e/benchmark/README.md
- integration-tests/e2e/benchmark/bench_heavy_state/Cargo.toml
- integration-tests/e2e/benchmark/bench_heavy_state/src/error.rs
- integration-tests/e2e/benchmark/bench_heavy_state/src/lib.rs
- integration-tests/e2e/benchmark/bench_heavy_state/src/msg.rs
- integration-tests/e2e/benchmark/bench_heavy_state/src/state.rs
- integration-tests/e2e/benchmark/benchmark_test.go
- integration-tests/e2e/benchmark/collector.go
- integration-tests/e2e/benchmark/config.go
- integration-tests/e2e/benchmark/load.go
- integration-tests/e2e/benchmark/report.go
- integration-tests/e2e/cluster/cluster.go
- integration-tests/e2e/cluster/ports.go
- integration-tests/e2e/cluster/toml.go
- integration-tests/go.mod
Codecov Report: ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```text
@@ Coverage Diff @@
##             main     #149   +/-   ##
=======================================
  Coverage   42.03%   42.03%
=======================================
  Files          49       49
  Lines        3245     3245
=======================================
  Hits         1364     1364
  Misses       1743     1743
  Partials      138      138
```
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/e2e/benchmark/benchmark_test.go`:
- Around line 141-146: When WaitForAllIncluded returns an error you currently
call cl.LatestHeight and ignore its error, which can leave endHeight==0 and
cause CollectResults to run with a bogus range; change the fallback to check the
error returned by cl.LatestHeight(ctx, 0) and fail the test (t.Fatalf or return
the error) if LatestHeight also fails, otherwise assign endHeight from the
successful result; reference WaitForAllIncluded, cl.LatestHeight and
CollectResults to locate the logic and ensure the fallback does not silently
swallow errors.
- Line 70: Tests create a MempoolPoller via NewMempoolPoller (e.g., in
runBenchmarkWithCluster, runPreSignedBenchmark, TestBenchmarkGossipPropagation,
TestBenchmarkGapEviction) but never defer poller.Stop(), so the poller goroutine
can leak if require.NoError(WaitForAllIncluded(...)) fails; fix this by calling
defer poller.Stop() immediately after each NewMempoolPoller(...) invocation so
the background sampling goroutine is always cleaned up regardless of test
failures.
- Around line 518-519: SingleNodeLoad is currently targeting the validator (node
0), bypassing gossip; update the call to target an edge fullnode by passing
cfg.ValidatorCount as the target index (i.e., replace the final argument 0 with
cfg.ValidatorCount), or alternatively modify SingleNodeLoad to internally call
edgeNodeIndex(target, cfg) (the same logic used by BurstLoad and SequentialLoad)
so when cfg.ValidatorCount > 0 and < cfg.NodeCount it selects the first edge
fullnode instead of a validator.
In `@integration-tests/e2e/benchmark/README.md`:
- Around line 54-55: Update the README so the stress-case description and
reported metrics match the actual test implementation: either change the
`TestBenchmarkPreSignedSeqWasmExecStress` entry in the table to state "20 accts
x 200 txs, 100 writes/tx" (because the test uses sharedWrites=20 and
localWrites=80) or change the test to use 200 writes/tx; also correct the P50
values quoted near the paragraph around lines 137-140 so they match the table
above, and make the same consistency fix at the second occurrence near lines
344-345; reference the test name `TestBenchmarkPreSignedSeqWasmExecStress` and
the implementation parameters `sharedWrites` and `localWrites` when making the
change.
- Around line 160-165: The README omits copying the compiled Wasm into the
expected filename and path, causing setupWasmExecCluster to fail looking for
bench_heavy_state/contract.wasm; after building with cargo (target
wasm32-unknown-unknown --release), add an instruction to copy or move
target/wasm32-unknown-unknown/release/bench_heavy_state.wasm to
bench_heavy_state/contract.wasm (or create bench_heavy_state/contract.wasm by
renaming the artifact) so the setupWasmExecCluster lookup succeeds; mention this
step in both the earlier snippet and the similar block around lines 346-350
referencing bench_heavy_state and contract.wasm.
- Around line 264-269: The fenced code blocks containing the benchmark tables
(the blocks starting with "Config | Variant | TPS | vs base | P50ms ..." in
integration-tests/e2e/benchmark/README.md) are missing a language tag and
trigger markdownlint MD040; update both fenced blocks to include a language
identifier such as "text" or "plaintext" (e.g., ```text) to silence the lint
warning and keep formatting, and apply the same change to the other fenced block
mentioned (the table at lines 301-310).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c28797e1-0d70-4827-ab29-058d73b152d5
📒 Files selected for processing (2)
- integration-tests/e2e/benchmark/README.md
- integration-tests/e2e/benchmark/benchmark_test.go
♻️ Duplicate comments (4)
integration-tests/e2e/benchmark/README.md (4)
264-269: ⚠️ Potential issue | 🟡 Minor

Add language identifiers to fenced code blocks.

Both example output blocks (lines 264-269 and 301-310) are missing language specifiers, triggering markdownlint MD040 warnings. Add `text` or `plaintext` to silence the linter.

📝 Proposed fix: change the opening fence before the `Config | Variant | TPS | vs base | P50ms | vs base | P95ms | vs base | Peak Mempool` table from `` ``` `` to `` ```text ``. Apply the same fix to the fenced block at lines 301-310.
Also applies to: 301-310
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 264 - 269, Add a language identifier (e.g., "text" or "plaintext") to the fenced code blocks that contain the benchmark tables in integration-tests/e2e/benchmark/README.md so markdownlint MD040 is silenced; specifically update the three-backtick openings for the table blocks beginning around the lines showing "Config | Variant | TPS ..." (the two fenced blocks that display the benchmark tables) to use ```text and ensure both the block at the one starting with "Config | Variant | TPS ..." and the second similar block further down are updated the same way.
137-140: ⚠️ Potential issue | 🟡 Minor

Correct the P50 latency values to match the table above.

Line 139 quotes P50 values as "(1934 vs 3854ms)" but the table at lines 134-135 shows IAVL P50 = 2047ms and MemIAVL P50 = 1322ms. The quoted values do not match the measured results.
📝 Proposed fix
```diff
-At 4000 txs (100 writes/tx): **+96.7% TPS** (795 vs 404) and **-35.4% P50 latency** (1934 vs 3854ms).
+At 4000 txs (100 writes/tx): **+96.7% TPS** (795 vs 404) and **-35.4% P50 latency** (1322 vs 2047ms).
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 137 - 140, Update the P50 latency numbers in the sentence that currently reads "(1934 vs 3854ms)" so they match the benchmark table above: replace that fragment with "(1322 vs 2047ms)" to reflect MemIAVL P50 = 1322ms and IAVL P50 = 2047ms; ensure the order corresponds to "MemIAVL vs IAVL", matching the TPS comparison "(795 vs 404)" in the same sentence.
160-165: ⚠️ Potential issue | 🟠 Major

Add the contract.wasm copy step after building.

The build instructions show `cargo build` but omit the required step to copy the artifact to the expected location. The test harness (`setupWasmExecCluster`) looks for `bench_heavy_state/contract.wasm`, but the cargo build leaves the artifact at `target/wasm32-unknown-unknown/release/bench_heavy_state.wasm`. Without this step, all Wasm exec benchmarks will fail at cluster setup.

📝 Proposed fix

```diff
 cd e2e/benchmark/bench_heavy_state
 cargo build --target wasm32-unknown-unknown --release
+cp target/wasm32-unknown-unknown/release/bench_heavy_state.wasm contract.wasm
```

Apply the same fix at lines 346-350.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 160 - 165, After building the CosmWasm contract with cargo in bench_heavy_state, add a step to copy the built artifact from target/wasm32-unknown-unknown/release/bench_heavy_state.wasm to bench_heavy_state/contract.wasm so the test harness (setupWasmExecCluster) can find it; add the same copy instruction in the other README section that mirrors this one to ensure both places produce bench_heavy_state/contract.wasm for the Wasm exec benchmarks.
344: ⚠️ Potential issue | 🟡 Minor

Fix the inconsistent stress test configuration.

Line 344 states "Stress tests use `write_mixed{40, 160}` = 200 writes/tx", but line 55 in the comparison matrix documents `TestBenchmarkPreSignedSeqWasmExecStress` as using "100 writes/tx". This inconsistency will confuse users trying to understand or reproduce the stress benchmarks.

📝 Proposed fix

```diff
-CLI-based tests use `write_mixed{5, 25}` = 30 writes/tx. Pre-signed HTTP tests use `write_mixed{20, 80}` = 100 writes/tx. Stress tests use `write_mixed{40, 160}` = 200 writes/tx.
+CLI-based tests use `write_mixed{5, 25}` = 30 writes/tx. Pre-signed HTTP tests use `write_mixed{20, 80}` = 100 writes/tx. Stress tests use `write_mixed{20, 80}` = 100 writes/tx.
```

Alternatively, if stress tests are meant to use 200 writes/tx, update line 55 instead:

```diff
-| `TestBenchmarkPreSignedSeqWasmExecStress` | Sequential, Wasm exec, HTTP (stress) | 20 accts x 200 txs, 100 writes/tx |
+| `TestBenchmarkPreSignedSeqWasmExecStress` | Sequential, Wasm exec, HTTP (stress) | 20 accts x 200 txs, 200 writes/tx |
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` at line 344, The README has inconsistent documentation for stress test write counts: the description "Stress tests use `write_mixed{40, 160}` = 200 writes/tx" conflicts with the comparison matrix entry for TestBenchmarkPreSignedSeqWasmExecStress which lists 100 writes/tx; update the documentation so both places match by choosing the correct intended value and editing either the README line mentioning `write_mixed{40, 160}` or the comparison matrix entry for TestBenchmarkPreSignedSeqWasmExecStress to the same write count (100 or 200), ensuring the string `write_mixed{...}` and the numeric "writes/tx" are consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9414d419-9409-4ae0-96a9-25c93db6d690
📒 Files selected for processing (1)
integration-tests/e2e/benchmark/README.md
🧹 Nitpick comments (4)
integration-tests/e2e/benchmark/benchmark_test.go (2)
683-700: MemIAVL stress subtest missing `DrainTimeoutOverride`.

The IAVL subtest (line 667) sets `DrainTimeoutOverride = 10 * time.Minute`, but the MemIAVL subtest doesn't. With the default formula (180s + 4000/20s = 380s ≈ 6.3 min), MemIAVL gets a shorter timeout window.

If this is intentional because MemIAVL is expected to complete faster, consider adding a comment. Otherwise, for consistency:

```diff
 t.Run("MemIAVL", func(t *testing.T) {
 	cfg := CombinedConfig()
 	cfg.AccountCount = 20
 	cfg.TxPerAccount = 200
+	cfg.DrainTimeoutOverride = 10 * time.Minute
 	cfg.Label = "presigned-stress/memiavl/seq-wasm-exec"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/benchmark_test.go` around lines 683 - 700, The MemIAVL subtest ("MemIAVL") omits setting DrainTimeoutOverride on the CombinedConfig returned by CombinedConfig(), causing it to use a shorter default timeout; update the MemIAVL test to set cfg.DrainTimeoutOverride = 10 * time.Minute (same as the IAVL subtest) to match timeouts, or if shorter timeout is intentional, add a one-line comment above the MemIAVL block explaining why it differs; refer to the local variable cfg in the MemIAVL test and the CombinedConfig/DrainTimeoutOverride configuration option to locate where to apply the change.
96-104: Redundant `defer cl.Close()` call.

`setupCluster` already registers `cl.Close` via `t.Cleanup` (line 48), so the explicit `defer cl.Close()` here is redundant. Assuming `Close()` is idempotent this won't cause issues, but it adds noise.

♻️ Suggested cleanup

```diff
 func runBenchmark(t *testing.T, cfg BenchConfig, loadFn func(ctx context.Context, cl *cluster.Cluster, cfg BenchConfig, metas map[string]cluster.AccountMeta) LoadResult) BenchResult {
 	t.Helper()
 	ctx := context.Background()
 	cl := setupCluster(t, ctx, cfg)
-	defer cl.Close()
 	return runBenchmarkWithCluster(t, ctx, cl, cfg, loadFn)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/benchmark_test.go` around lines 96 - 104, In runBenchmark, remove the redundant explicit defer cl.Close() because setupCluster already registers cl.Close via t.Cleanup; locate the function runBenchmark and delete the line "defer cl.Close()" so cleanup relies on the t.Cleanup registration from setupCluster (no other changes needed).

integration-tests/e2e/benchmark/README.md (2)
302-311: Add language identifier to fenced code block.

Same issue as the previous block: this directory structure listing needs a language specifier. Change the opening fence before the `benchmark/` listing from `` ``` `` to `` ```text ``.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 302 - 311, The fenced code block that shows the benchmark/ directory listing in README.md lacks a language identifier; update the opening fence for the block that begins with "benchmark/" to use a language specifier (e.g., change ``` to ```text) so the block becomes a text-formatted code block in the README.
265-270: Add language identifier to fenced code block.

Per static analysis (MD040), this fenced code block needs a language specifier. Since it's tabular text output, `text` or `plaintext` is appropriate. Change the opening fence before the `Config | Variant | TPS | vs base | P50ms | vs base | P95ms | vs base | Peak Mempool` table from `` ``` `` to `` ```text ``.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 265 - 270, The fenced code block containing the benchmark table in README.md is missing a language identifier; update the opening fence (the triple backticks that precede the table) to include a language specifier such as text or plaintext (e.g., change ``` to ```text) so the block passes MD040 static analysis and renders correctly; locate the fenced code block that contains the "Config | Variant | TPS ..." table and add the language identifier to its opening backticks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@integration-tests/e2e/benchmark/benchmark_test.go`:
- Around line 683-700: The MemIAVL subtest ("MemIAVL") omits setting
DrainTimeoutOverride on the CombinedConfig returned by CombinedConfig(), causing
it to use a shorter default timeout; update the MemIAVL test to set
cfg.DrainTimeoutOverride = 10 * time.Minute (same as the IAVL subtest) to match
timeouts, or if shorter timeout is intentional, add a one-line comment above the
MemIAVL block explaining why it differs; refer to the local variable cfg in the
MemIAVL test and the CombinedConfig/DrainTimeoutOverride configuration option to
locate where to apply the change.
- Around line 96-104: In runBenchmark, remove the redundant explicit defer
cl.Close() because setupCluster already registers cl.Close via t.Cleanup; locate
the function runBenchmark and delete the line "defer cl.Close()" so cleanup
relies on the t.Cleanup registration from setupCluster (no other changes
needed).
In `@integration-tests/e2e/benchmark/README.md`:
- Around line 302-311: The fenced code block that shows the benchmark/ directory
listing in README.md lacks a language identifier; update the opening fence for
the block that begins with "benchmark/" to use a language specifier (e.g.,
change ``` to ```text) so the block becomes a text-formatted code block in the
README.
- Around line 265-270: The fenced code block containing the benchmark table in
README.md is missing a language identifier; update the opening fence (the triple
backticks that precede the table) to include a language specifier such as text
or plaintext (e.g., change ``` to ```text) so the block passes MD040 static
analysis and renders correctly; locate the fenced code block that contains the
"Config | Variant | TPS ..." table and add the language identifier to its
opening backticks.
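The redundant `defer cl.Close()` nitpick above is about double-registration of cleanup. The sketch below illustrates why the defer is unnecessary, using stand-in types: `cluster`, `setupCluster`, and the slice-based `testingT` are hypothetical simplifications of the harness and of `(*testing.T).Cleanup`, assumed only for illustration.

```go
package main

import "fmt"

// cluster is a stand-in for the e2e harness's cluster type.
type cluster struct{ closed int }

func (c *cluster) Close() { c.closed++ }

// testingT mimics the behaviour of (*testing.T).Cleanup: registered
// functions run (in reverse order) when the test finishes.
type testingT struct{ cleanups []func() }

func (t *testingT) Cleanup(f func()) { t.cleanups = append(t.cleanups, f) }

func (t *testingT) finish() {
	for i := len(t.cleanups) - 1; i >= 0; i-- {
		t.cleanups[i]()
	}
}

// setupCluster registers cl.Close via t.Cleanup, as the review says the
// real setupCluster does.
func setupCluster(t *testingT) *cluster {
	cl := &cluster{}
	t.Cleanup(cl.Close)
	return cl
}

// runBenchmark relies on that Cleanup registration alone; adding a
// `defer cl.Close()` here would close the cluster a second time.
func runBenchmark(t *testingT) *cluster {
	cl := setupCluster(t)
	// ... drive load against cl ...
	return cl
}

func main() {
	t := &testingT{}
	cl := runBenchmark(t)
	t.finish()
	fmt.Println(cl.closed) // 1: Close ran exactly once
}
```

With the extra defer in place, `closed` would reach 2; whether a double `Close` is harmful depends on the real harness, but dropping the defer keeps cleanup in one place.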
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0f3c5f9f-de84-470e-a48f-d07a0dbc129c
📒 Files selected for processing (2)
integration-tests/e2e/benchmark/README.md
integration-tests/e2e/benchmark/benchmark_test.go
♻️ Duplicate comments (2)
integration-tests/e2e/benchmark/README.md (2)
302-311: ⚠️ Potential issue | 🟡 Minor — Add language identifier to the fenced code block.
This fenced block is missing a language specifier, triggering markdownlint MD040. Adding text or plaintext will resolve the lint warning.
📝 Suggested fix: change the opening fence of the `benchmark/` directory listing from ``` to ```text.
As per coding guidelines, this was flagged in a previous review but remains unaddressed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 302 - 311, The fenced directory listing in the README.md is missing a language identifier causing markdownlint MD040; update the fenced code block around the directory tree (the block that starts with ``` and contains "benchmark/ config.go load.go ...") by adding a language specifier such as "text" or "plaintext" (e.g., change the opening backticks to ```text) so the linter stops flagging MD040.
265-270: ⚠️ Potential issue | 🟡 Minor — Add language identifier to the fenced code block.
This fenced block is missing a language specifier, triggering markdownlint MD040. Adding text or plaintext will resolve the lint warning.
📝 Suggested fix: change the opening fence of the "Config | Variant | TPS ..." table block from ``` to ```text.
As per coding guidelines, this was flagged in a previous review but remains unaddressed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 265 - 270, The README's fenced code block that starts with "Config | Variant | TPS ..." is missing a language identifier, causing markdownlint MD040; edit the fenced block in integration-tests/e2e/benchmark/README.md and add a language specifier such as "text" or "plaintext" immediately after the opening backticks (e.g., ```text) so the table block is recognized and the lint warning is resolved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@integration-tests/e2e/benchmark/README.md`:
- Around line 302-311: The fenced directory listing in the README.md is missing
a language identifier causing markdownlint MD040; update the fenced code block
around the directory tree (the block that starts with ``` and contains
"benchmark/ config.go load.go ...") by adding a language specifier such as
"text" or "plaintext" (e.g., change the opening backticks to ```text) so the
linter stops flagging MD040.
- Around line 265-270: The README's fenced code block that starts with "Config
| Variant | TPS ..." is missing a language identifier, causing
markdownlint MD040; edit the fenced block in
integration-tests/e2e/benchmark/README.md and add a language specifier such as
"text" or "plaintext" immediately after the opening backticks (e.g., ```text) so
the table block is recognized and the lint warning is resolved.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 11afa1ce-7519-43f0-a000-9e61827b0f48
📒 Files selected for processing (3)
integration-tests/e2e/benchmark/README.md
integration-tests/e2e/benchmark/benchmark_test.go
integration-tests/e2e/benchmark/config.go
✅ Files skipped from review due to trivial changes (1)
- integration-tests/e2e/benchmark/benchmark_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- integration-tests/e2e/benchmark/config.go
Description
Closes: #XXXX
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
added `!` to the type prefix if API or client breaking change
Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
I have...
Summary by CodeRabbit
Tests
New Features
Documentation
Chores