Conversation
Dependency Review

✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.

Scanned Files: None
📝 Walkthrough

Adds a full end-to-end benchmarking suite: Makefile target, README, test harness, configs, load generators, collectors, reporters, Move benchmark sources, and cluster API extensions for measuring mempool, block, throughput, and latency across benchmark variants.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Test as Test
    participant Cluster as Cluster
    participant Load as LoadGenerator
    participant Poller as MempoolPoller
    participant Collector as Collector
    participant Reporter as Reporter
    Test->>Cluster: Setup cluster with BenchConfig
    Test->>Cluster: CollectInitialMetas
    Test->>Test: Warmup
    Test->>Cluster: Record start height
    Test->>Poller: Start mempool polling
    Test->>Load: Execute load (Burst/Sequential/PreSigned/MoveExec)
    Load->>Cluster: Submit transactions
    Load-->>Test: Return LoadResult
    Test->>Cluster: WaitForAllIncluded (drain + finality)
    Test->>Poller: Stop (get peak)
    Test->>Collector: CollectResults (query blocks, compute latencies)
    Collector->>Cluster: QueryBlock / LatestHeight
    Collector-->>Test: Return BenchResult
    Test->>Reporter: WriteResult / PrintComparisonTable / PrintImprovementTable
```
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed (3 passed)
Actionable comments posted: 12
🧹 Nitpick comments (5)
Makefile (1)
247-249: Parameterize the benchmark test filter and keep module mode read-only.

This target is useful as-is, but adding `BENCH_RUN` and `-mod=readonly` makes it safer and easier to run focused suites in CI/dev workflows.

♻️ Proposed improvement
```diff
+BENCH_RUN ?= TestBenchmark
+
 benchmark-e2e:
-	cd integration-tests/e2e && go test -v -tags benchmark -run TestBenchmark -timeout 30m -count=1 ./benchmark/
+	cd integration-tests/e2e && go test -mod=readonly -v -tags benchmark -run '$(BENCH_RUN)' -timeout 30m -count=1 ./benchmark/
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 247 - 249, update the benchmark-e2e Makefile target to accept a BENCH_RUN filter and run Go in module readonly mode: change the command invoked by the benchmark-e2e target (currently invoking go test -v -tags benchmark -run TestBenchmark -timeout 30m -count=1 ./benchmark/) to use the BENCH_RUN variable (e.g., -run "$(BENCH_RUN)") and add the -mod=readonly flag so the go test invocation becomes go test -mod=readonly -v -tags benchmark -run "$(BENCH_RUN)" -timeout 30m -count=1 ./benchmark/; ensure the target name benchmark-e2e and variable BENCH_RUN are used exactly as referenced.

integration-tests/e2e/benchmark/collector.go (1)
199-199: Make the finality wait context-aware.

This fixed sleep ignores cancellation and can keep tests waiting after timeout/cancel paths.
♻️ Proposed improvement
```diff
-	// Wait one more block to ensure finality
-	time.Sleep(3 * time.Second)
+	// Wait one more block to ensure finality
+	select {
+	case <-ctx.Done():
+		return 0, ctx.Err()
+	case <-time.After(3 * time.Second):
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/collector.go` at line 199, replace the fixed sleep with a context-aware wait: instead of time.Sleep(3 * time.Second) use a select that waits on time.After(3*time.Second) and ctx.Done() (where ctx is the request/test context in scope) so the routine aborts promptly when cancelled; ensure the surrounding function (e.g., the finality wait routine) returns or propagates ctx.Err() on cancellation rather than blocking until the sleep completes.

integration-tests/e2e/cluster/cluster.go (1)
294-303: Add a request-level timeout for block RPC calls.

With benchmark flows using long-lived contexts, a stalled RPC can block result collection indefinitely.
♻️ Proposed improvement
```diff
 url := fmt.Sprintf("http://127.0.0.1:%d/block?height=%d", n.Ports.RPC, height)
-req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
+reqCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
+defer cancel()
+req, err := http.NewRequestWithContext(reqCtx, http.MethodGet, url, nil)
 if err != nil {
 	return BlockResult{}, err
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 294 - 303, the block RPC call currently uses the long-lived ctx and can hang; wrap the request in a short request-level timeout by creating a derived context, e.g. ctxWithTimeout, cancel := context.WithTimeout(ctx, rpcRequestTimeout) (choose a sensible duration constant like rpcRequestTimeout), defer cancel(), and use ctxWithTimeout in http.NewRequestWithContext instead of ctx; ensure you reference url, req creation, and the http.DefaultClient.Do(req) call so the request will fail fast on timeout rather than blocking result collection.

integration-tests/e2e/benchmark/report.go (1)
18-20: Avoid overwriting previous benchmark runs with identical labels.

Using label-only filenames drops historical runs and makes trend comparisons harder.
♻️ Proposed improvement
```diff
 import (
 	"encoding/json"
 	"fmt"
 	"os"
 	"path/filepath"
 	"strings"
 	"testing"
+	"time"
 )
@@
 safe := strings.NewReplacer("/", "_", " ", "_", "+", "_").Replace(result.Config.Label)
-path := filepath.Join(dir, safe+".json")
+stamp := time.Now().UTC().Format("20060102T150405Z")
+path := filepath.Join(dir, fmt.Sprintf("%s_%s.json", safe, stamp))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/report.go` around lines 18 - 20, the current filename uses only result.Config.Label (via safe := strings.NewReplacer(...).Replace(result.Config.Label) and path := filepath.Join(dir, safe+".json")), which overwrites previous runs with the same label; change the filename to include a unique suffix (e.g., a timestamp from time.Now().UTC().Format with seconds or milliseconds, or a run ID/UUID) so each run produces a distinct file (e.g., combine safe + "_" + timestamp + ".json") and update the code that reads/writes using the new path variable accordingly.

integration-tests/e2e/benchmark/load.go (1)
185-197: Duplicate slice initialisation in `sequencePattern`.

The `[]uint64{base + 2, base, base + 1}` literal is initialised twice: once for the `count <= 3` branch and once before the `count > 3` loop. Unifying them removes the duplication.

♻️ Proposed refactor
```diff
 func sequencePattern(base uint64, count int) []uint64 {
-	if count <= 3 {
-		seqs := []uint64{base + 2, base, base + 1}
-		return seqs[:count]
-	}
 	seqs := []uint64{base + 2, base, base + 1}
+	if count <= 3 {
+		return seqs[:count]
+	}
 	for i := 3; i < count; i++ {
 		seqs = append(seqs, base+uint64(i))
 	}
 	return seqs
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/load.go` around lines 185 - 197, The function sequencePattern duplicates the literal []uint64{base + 2, base, base + 1} in both branches; to fix, initialize seqs once (seqs := []uint64{base + 2, base, base + 1}), then if count <= 3 return seqs[:count], otherwise run the existing for loop (for i := 3; i < count; i++ { seqs = append(seqs, base+uint64(i)) }) and return seqs; this removes the duplicated initialization while preserving behavior.
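Pulled out of the diff above, the refactored function can be checked in isolation. This is a sketch reconstructed from the review suggestion (the surrounding package layout is assumed); note it deliberately preserves the out-of-order start that a separate actionable comment flags for `count < 3`.

```go
package main

import "fmt"

// sequencePattern returns the account sequence numbers to submit, starting
// out of order (base+2, base, base+1) and then continuing contiguously.
// Initialising seqs once removes the duplicated literal from both branches.
func sequencePattern(base uint64, count int) []uint64 {
	seqs := []uint64{base + 2, base, base + 1}
	if count <= 3 {
		return seqs[:count]
	}
	for i := 3; i < count; i++ {
		seqs = append(seqs, base+uint64(i))
	}
	return seqs
}

func main() {
	fmt.Println(sequencePattern(10, 5)) // [12 10 11 13 14]
}
```

The behavior is unchanged for every `count`; only the duplicated literal is gone.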
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/e2e/benchmark/benchmark_test.go`:
- Around line 253-256: The benchmark creates a MempoolPoller per node but each
MempoolPoller already computes the max across all nodes, causing duplicated
metrics and the saved result uses a hardcoded peak_mempool_size=0; fix by
instantiating a single MempoolPoller via NewMempoolPoller(ctx, cluster,
mempoolPollInterval) (reuse that one for the whole test instead of creating one
per node), and when persisting results read the actual max value from the poller
(e.g., poller.Max or the appropriate accessor on MempoolPoller) into
peak_mempool_size rather than 0; apply the same change for the analogous block
around the other occurrence (lines 266-271).
In `@integration-tests/e2e/benchmark/collector.go`:
- Around line 157-164: The throughput window currently uses the first and last
entries of blockStats which includes empty blocks; instead scan blockStats to
find the first and last entries that actually contain benchmark transactions
(e.g., where a transaction count field such as TxCount > 0 or a Txs slice length
> 0), assign their Time values to first/last, compute blockDuration =
last.Sub(first).Seconds(), and only set totalDuration when blockDuration > 0;
update the code around blockStats, first, last, blockDuration and totalDuration
accordingly so the TPS window excludes empty blocks.
In `@integration-tests/e2e/benchmark/load.go`:
- Line 48: Check and guard cfg.NodeCount before performing the modulo in both
BurstLoad and OutOfOrderLoad: if cfg.NodeCount is zero (or otherwise invalid)
return an error or set a safe default before entering the loop so the expression
i % cfg.NodeCount cannot panic; update the entry checks in BurstLoad and add the
same pre-loop guard to OutOfOrderLoad (the code computing viaNode using i %
cfg.NodeCount) to ensure safe execution.
- Around line 13-19: TxSubmission currently lacks an application-level response
code and non-zero res.Code checks are ignored; add a Code int (or uint32) field
to the TxSubmission struct and populate it wherever responses are processed
(e.g., in BurstLoad where res.Code is checked) so callers can distinguish
success vs app-level failures; update the response handling logic in BurstLoad
to assign res.Code into the TxSubmission entries and apply the same change to
OutOfOrderLoad and SingleNodeLoad so all load paths record the code
consistently.
- Line 135: SingleNodeLoad currently takes targetNode int and passes it straight
to SendBankTxWithSequence without bounds checking; add an entry validation in
SingleNodeLoad to ensure 0 <= targetNode < cfg.NodeCount and handle invalid
values by returning an early LoadResult/error (or logging and aborting the load)
instead of calling SendBankTxWithSequence; apply the same validation to the
other similar usage site referenced in the diff so no out-of-range targetNode
can reach SendBankTxWithSequence and corrupt the benchmark.
- Around line 44-73: The per-account goroutine loops in BurstLoad (and similarly
in OutOfOrderLoad and SingleNodeLoad) do not observe ctx cancellation and will
continue dispatching txs after the context is done; update each loop that
iterates up to cfg.TxPerAccount (the goroutine that calls
cluster.SendBankTxWithSequence and appends to result.Submissions) to check
ctx.Done() at the top of each iteration (e.g., use a non-blocking select on
ctx.Done() or check ctx.Err()) and break/return if cancelled so the goroutine
stops sending work and allows wg.Wait() to complete promptly.
- Around line 185-197: sequencePattern currently emits gaps when count < 3 (e.g.
returns [base+2] for count==1), which leaves some sequence numbers never
submitted and transactions stuck; fix by changing sequencePattern to return a
contiguous sequence for small counts (for count <= 3 or specifically count < 3)
— i.e., when count < 3 return []uint64{base, base+1, ...} truncated to count,
otherwise keep the existing out-of-order logic; alternatively (or in addition)
validate TxPerAccount in OutOfOrderLoad and enforce a minimum of 3 before
calling sequencePattern to avoid calling it with invalid counts. Ensure you
update the function sequencePattern and/or the caller
OutOfOrderLoad/TxPerAccount validation accordingly.
- Around line 200-212: The Warmup function currently ignores the result of
SendBankTxWithSequence and doesn't update the metas map, which hides
transport/app errors and leaves stale on-chain sequence numbers; modify Warmup
to capture and check the return/error from cluster.SendBankTxWithSequence, log
any failures (including context like account name and node), and on success
increment the corresponding metas[name].Sequence (or refresh metas[name] from
chain) so subsequent calls use the updated sequence; also add a brief comment in
Warmup documenting the sequence-side effect and that callers must use the
updated metas or re-fetch account metadata.
In `@integration-tests/e2e/benchmark/README.md`:
- Around line 87-92: Update the README to use one consistent default results
directory everywhere: replace occurrences of `results/` with
`benchmark/results/` (or vice versa based on project convention) so all
references (e.g., the "Results land in `benchmark/results/` as JSON." line and
the "After baseline results exist in `results/`:" line) point to the same path,
and run a quick grep to ensure no remaining mismatched occurrences remain in the
document.
- Around line 107-112: Add language identifiers (e.g., "text") to the fenced
code blocks in the README so Markdown lint (MD040) stops warning: update each
triple-backtick block that contains the table or code samples (the backtick
fences around the tabular benchmark output and the code snippets under the
"benchmark/" section) to start with ```text instead of ```; ensure all
occurrences noted in the comment (the table blocks and the code listing blocks)
are changed consistently.
- Around line 79-81: The E2E_INITIAD_BIN environment variable is currently
applied only to the cd command so go test may not see it; change how the env var
is set so go test receives it (either prefix the go test invocation with
E2E_INITIAD_BIN=./build/initiad or export E2E_INITIAD_BIN before running cd and
go test) to ensure the TestBenchmarkBaseline run uses the intended binary
(reference the E2E_INITIAD_BIN variable and the go test -run
TestBenchmarkBaseline invocation).
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 332-336: The loop over decoded.Result.Block.Data.Txs currently
ignores base64 decode failures (using continue) which silently under-reports
txs; replace the silent continue with a fail-fast error return that includes
block context and tx index: when base64Decode(txBase64) returns decErr,
construct and return an error (or propagate decErr) that includes
decoded.Result.Block.Header.Height (or block identifier from decoded), the tx
index within the loop, and the original decErr so callers can surface the
failure and metrics remain correct; update any surrounding function signature to
return an error if needed (identify code at the loop over
decoded.Result.Block.Data.Txs and the base64Decode call).
---
Nitpick comments:
In `@integration-tests/e2e/benchmark/collector.go`:
- Line 199: Replace the fixed sleep with a context-aware wait: instead of
time.Sleep(3 * time.Second) use a select that waits on time.After(3*time.Second)
and ctx.Done() (where ctx is the request/test context in scope) so the routine
aborts promptly when cancelled; ensure the surrounding function (e.g., the
finality wait routine) returns or propagates ctx.Err() on cancellation rather
than blocking until the sleep completes.
In `@integration-tests/e2e/benchmark/load.go`:
- Around line 185-197: The function sequencePattern duplicates the literal
[]uint64{base + 2, base, base + 1} in both branches; to fix, initialize seqs
once (seqs := []uint64{base + 2, base, base + 1}), then if count <= 3 return
seqs[:count], otherwise run the existing for loop (for i := 3; i < count; i++ {
seqs = append(seqs, base+uint64(i)) }) and return seqs; this removes the
duplicated initialization while preserving behavior.
In `@integration-tests/e2e/benchmark/report.go`:
- Around line 18-20: The current filename uses only result.Config.Label (via
safe := strings.NewReplacer(...).Replace(result.Config.Label) and path :=
filepath.Join(dir, safe+".json")), which overwrites previous runs with the same
label; change the filename to include a unique suffix (e.g., a timestamp from
time.Now().UTC().Format with seconds or milliseconds, or a run ID/UUID) so each
run produces a distinct file (e.g., combine safe + "_" + timestamp + ".json")
and update the code that reads/writes using the new path variable accordingly.
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 294-303: The block RPC call currently uses the long-lived ctx and
can hang; wrap the request in a short request-level timeout by creating a
derived context, e.g. ctxWithTimeout, cancel := context.WithTimeout(ctx,
rpcRequestTimeout) (choose a sensible duration constant like rpcRequestTimeout),
defer cancel(), and use ctxWithTimeout in http.NewRequestWithContext instead of
ctx; ensure you reference url, req creation, and the http.DefaultClient.Do(req)
call so the request will fail fast on timeout rather than blocking result
collection.
In `@Makefile`:
- Around line 247-249: Update the benchmark-e2e Makefile target to accept a
BENCH_RUN filter and run Go in module readonly mode: change the command invoked
by the benchmark-e2e target (currently invoking go test -v -tags benchmark -run
TestBenchmark -timeout 30m -count=1 ./benchmark/) to use the BENCH_RUN variable
(e.g., -run "$(BENCH_RUN:-TestBenchmark)") and add the -mod=readonly flag so the
go test invocation becomes go test -mod=readonly -v -tags benchmark -run
"$(BENCH_RUN)" -timeout 30m -count=1 ./benchmark/; ensure the target name
benchmark-e2e and variable BENCH_RUN are used exactly as referenced.
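The `BENCH_RUN ?= TestBenchmark` default suggested for the Makefile behaves like the shell parameter expansion below (a hypothetical wrapper script, not part of the PR):

```shell
#!/bin/sh
# Fall back to TestBenchmark when the caller doesn't set BENCH_RUN,
# mirroring `BENCH_RUN ?= TestBenchmark` in the proposed Makefile target.
BENCH_RUN="${BENCH_RUN:-TestBenchmark}"
echo "go test -mod=readonly -v -tags benchmark -run '${BENCH_RUN}' -timeout 30m -count=1 ./benchmark/"
```

With the Makefile change applied, `make benchmark-e2e BENCH_RUN=TestBenchmarkBaseline` would narrow the suite the same way.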
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (9)
- Makefile
- integration-tests/e2e/benchmark/README.md
- integration-tests/e2e/benchmark/benchmark_test.go
- integration-tests/e2e/benchmark/collector.go
- integration-tests/e2e/benchmark/config.go
- integration-tests/e2e/benchmark/load.go
- integration-tests/e2e/benchmark/report.go
- integration-tests/e2e/cluster.go
- integration-tests/e2e/cluster/cluster.go
Actionable comments posted: 3
♻️ Duplicate comments (3)
integration-tests/e2e/benchmark/README.md (2)
37-38: ⚠️ Potential issue | 🟡 Minor

Use one consistent default results directory path throughout the doc.

`results/` and `benchmark/results/` are both documented as defaults, which is ambiguous for users and CI scripts.

✏️ Suggested cleanup
```diff
-- The JSON results persist in `results/`. Subsequent `TestBenchmarkFullComparison` runs load them automatically.
+- The JSON results persist in `benchmark/results/`. Subsequent `TestBenchmarkFullComparison` runs load them automatically.
@@
-| `BENCHMARK_RESULTS_DIR` | `results/` | Output directory for JSON results |
+| `BENCHMARK_RESULTS_DIR` | `benchmark/results/` | Output directory for JSON results |
```

Also applies to: 87-92, 189-189
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 37 - 38, The README documents two different default results directories ("results/" and "benchmark/results/"), causing ambiguity for users and CI; pick one canonical default path and update all mentions to it (including the sentence referencing TestBenchmarkFullComparison and the other occurrences noted around lines 87-92 and 189) so the doc consistently uses the same directory throughout; ensure any examples, scripts, and the TestBenchmarkFullComparison description reflect that single chosen path.
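The "quick grep" the prompt mentions can be as simple as the following, demonstrated here on a stand-in file rather than the real README:

```shell
#!/bin/sh
# Demo on a stand-in README: list every mention of either candidate default
# directory so mismatched occurrences are obvious before editing.
cat > /tmp/bench_readme_demo.md <<'EOF'
Results land in `benchmark/results/` as JSON.
After baseline results exist in `results/`:
EOF
grep -nE '(benchmark/)?results/' /tmp/bench_readme_demo.md
```

Running the same pattern against `integration-tests/e2e/benchmark/README.md` surfaces every path reference to normalise.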
107-112: ⚠️ Potential issue | 🟡 Minor

Add language identifiers to fenced blocks to satisfy MD040.

These blocks should use ```text (or another explicit language) to clear markdownlint warnings.

✏️ Suggested cleanup

````diff
-```
+```text
 Config | Variant | TPS | vs base | P50ms | vs base | P95ms | vs base | P99ms | vs base | Peak Mempool
 ...
````

````diff
-```
+```text
 Config | Variant | TPS | P50ms | P95ms | P99ms | Max ms | Peak Mempool
 ...
````

````diff
-```
+```text
 benchmark/
   config.go    Variant definitions, BenchConfig, preset constructors
 ...
````

Also applies to: 130-134, 155-163

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 107 - 112, the markdown fenced code blocks triggering MD040 should include an explicit language identifier (e.g., use ```text) instead of a bare ```; update the shown table block and the other fenced blocks (the blocks around the snippet at lines 130-134 and 155-163) by changing their opening fences from ``` to ```text (or another appropriate language) so markdownlint MD040 no longer warns and the content renders correctly.

integration-tests/e2e/benchmark/collector.go (1)

131-137: ⚠️ Potential issue | 🟠 Major

TPS window is still derived from block non-emptiness, not benchmark tx inclusion.

At lines 133/160, `TxCount` tracks all txs in the block, so the duration can still include blocks with no benchmark txs. The window should use the first/last block times where matched benchmark txs were actually found.

🐛 Proposed fix

```diff
 var (
 	blockStats []BlockStat
 	latencies  []float64
 	included   int
+	firstIncludedTime time.Time
+	lastIncludedTime  time.Time
 )
@@
-	for _, txHash := range block.TxHashes {
+	includedInBlock := 0
+	for _, txHash := range block.TxHashes {
 		sub, ok := submitMap[txHash]
 		if !ok {
 			continue
 		}
 		included++
+		includedInBlock++
 		latencyMs := float64(block.BlockTime.Sub(sub.SubmitTime).Milliseconds())
 		if latencyMs < 0 {
 			latencyMs = 0
 		}
 		latencies = append(latencies, latencyMs)
 	}
+	if includedInBlock > 0 {
+		if firstIncludedTime.IsZero() {
+			firstIncludedTime = block.BlockTime
+		}
+		lastIncludedTime = block.BlockTime
+	}
 }
@@
-	// use span from first to last block that actually included benchmark txs
-	var firstTxTime, lastTxTime time.Time
-	for _, bs := range blockStats {
-		if bs.TxCount == 0 {
-			continue
-		}
-		if firstTxTime.IsZero() {
-			firstTxTime = bs.Time
-		}
-		lastTxTime = bs.Time
-	}
-	if !firstTxTime.IsZero() {
-		blockDuration := lastTxTime.Sub(firstTxTime).Seconds()
+	if !firstIncludedTime.IsZero() {
+		blockDuration := lastIncludedTime.Sub(firstIncludedTime).Seconds()
 		if blockDuration > 0 {
 			totalDuration = blockDuration
 		}
 	}
```

Also applies to: 157-173

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/collector.go` around lines 131 - 137, BlockStat currently uses TxCount: len(block.TxHashes) and block times, which causes the TPS window to include blocks that contain no benchmark transactions; update the BlockStat population logic (where blockStats is appended) to count only matched benchmark transactions (e.g., filter block.TxHashes by the benchmark tx set and set TxCount to that filtered length) and also record the timestamp of the first/last matched benchmark tx (or a flag/field indicating presence) so that the overall window computation (the later logic around computing start/end times and TPS between lines ~157-173) uses the timestamps of the first and last blocks that actually contained benchmark txs rather than any non-empty block times. Ensure references to BlockStat, TxCount, blockStats and the window computation code are updated accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/e2e/benchmark/collector.go`:
- Around line 48-64: NewMempoolPoller may create a time.Ticker with a
non-positive interval which panics; in the constructor (NewMempoolPoller) ensure
p.interval is a positive duration before starting the goroutine—if the provided
interval is <= 0, replace it with a sensible default (e.g., time.Second or
another minimum > 0) so that MempoolPoller.run can safely create a
time.NewTicker(p.interval) without panicking.
- Around line 206-208: The post-mempool wait currently uses a blocking
time.Sleep(3 * time.Second) which ignores ctx cancellation; replace it with a
context-aware timer: create a time.NewTimer(3*time.Second) and use select to
wait on either timer.C or ctx.Done(), stop the timer when appropriate, and if
ctx.Done() fires propagate/return the context error instead of proceeding to
call cluster.LatestHeight(ctx, 0).

In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 886-890: The code only sets memiavl.enable to "true" when
c.opts.MemIAVL is true, leaving the false case to upstream defaults; update the
logic around the conditional that calls setTOMLValue(appPath, "memiavl",
"enable", ...) so that it always writes an explicit value: write "true" when
c.opts.MemIAVL is true and write "false" when it's false (i.e., call
setTOMLValue with "false" in the else branch), ensuring the memiavl.enable key
is deterministically set regardless of upstream config.
Duplicate comments:
In `@integration-tests/e2e/benchmark/collector.go`:
- Around line 131-137: BlockStat currently uses TxCount: len(block.TxHashes) and
block times which causes the TPS window to include blocks that contain no
benchmark transactions; update the BlockStat population logic (where blockStats
is appended) to count only matched benchmark transactions (e.g., filter
block.TxHashes by the benchmark tx set and set TxCount to that filtered length)
and also record the timestamp of the first/last matched benchmark tx (or a
flag/field indicating presence) so that the overall window computation (the
later logic around computing start/end times and TPS between lines ~157-173)
uses the timestamps of the first and last blocks that actually contained
benchmark txs rather than any non-empty block times. Ensure references to
BlockStat, TxCount, blockStats and the window computation code are updated
accordingly.

In `@integration-tests/e2e/benchmark/README.md`:
- Around line 37-38: The README documents two different default results
directories ("results/" and "benchmark/results/"), causing ambiguity for users
and CI; pick one canonical default path and update all mentions to it (including
the sentence referencing TestBenchmarkFullComparison and the other occurrences
noted around lines 87-92 and 189) so the doc consistently uses the same
directory throughout; ensure any examples, scripts, and the
TestBenchmarkFullComparison description reflect that single chosen path.
- Around line 107-112: The markdown fenced code blocks triggering MD040 should
include an explicit language identifier (e.g., use ```text) instead of a bare
```; update the shown table block and the other fenced blocks (the blocks around
the snippet at lines 130-134 and 155-163) by changing their opening fences from
``` to ```text (or another appropriate language) so markdownlint MD040 no
longer warns and the content renders correctly.

ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting

📥 Commits
Reviewing files that changed from the base of the PR and between de2baaf1144434fe86ea58dc43a0d066cde3821c and e9fab6663dd0532eb038f3353c191049f39ae171.

📒 Files selected for processing (5)
- integration-tests/e2e/benchmark/README.md
- integration-tests/e2e/benchmark/benchmark_test.go
- integration-tests/e2e/benchmark/collector.go
- integration-tests/e2e/benchmark/load.go
- integration-tests/e2e/cluster/cluster.go

🚧 Files skipped from review as they are similar to previous changes (2)
- integration-tests/e2e/benchmark/benchmark_test.go
- integration-tests/e2e/benchmark/load.go
♻️ Duplicate comments (2)
integration-tests/e2e/cluster/cluster.go (2)
886-892: Past review comment addressed correctly.

`memiavl.enable` is now always written explicitly, eliminating dependence on upstream defaults for both variants.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 886 - 892, The code now always sets memiavl.enable explicitly; ensure the call to setTOMLValue writes the string value derived from c.opts.MemIAVL by keeping memIAVLEnable and the if block that sets it based on c.opts.MemIAVL, then pass memIAVLEnable to setTOMLValue(appPath, "memiavl", "enable", memIAVLEnable) so memiavl.enable is unambiguous regardless of upstream defaults (verify in the function where memIAVLEnable, c.opts.MemIAVL, appPath and setTOMLValue are used).
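The deterministic-flag pattern this prompt verifies can be sketched as below; the `setTOMLValue` here is a self-contained stand-in with an assumed signature, not the repo's actual helper.

```go
package main

import (
	"fmt"
	"strconv"
)

// setTOMLValue stands in for the cluster helper; here it just prints the
// key/value it would write so the example is self-contained.
func setTOMLValue(path, section, key, value string) error {
	fmt.Printf("%s: [%s] %s = %s\n", path, section, key, value)
	return nil
}

// configureMemIAVL writes memiavl.enable for BOTH branches, so the baseline
// variant never inherits an upstream default.
func configureMemIAVL(appPath string, memIAVL bool) error {
	return setTOMLValue(appPath, "memiavl", "enable", strconv.FormatBool(memIAVL))
}

func main() {
	_ = configureMemIAVL("app.toml", false) // explicitly writes "false"
}
```

Using `strconv.FormatBool` collapses the true/false branches into one unconditional write.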
332-336: Past review comment addressed correctly.

The fail-fast decode error with `height` and `idx` context directly addresses the previous concern about silent tx under-reporting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 332 - 336, The loop already fails fast on decode errors and includes contextual info; no change needed—keep the current base64Decode handling inside the for loop that returns fmt.Errorf("failed to decode tx at height=%d index=%d: %w", height, idx, decErr) so failures surface with height/idx context.
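The fail-fast decode handling the comment confirms looks roughly like this (a sketch with assumed field names, since the actual cluster types are not shown here):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeBlockTxs decodes every base64 tx in a block and fails fast with
// height and index context instead of silently skipping bad entries.
func decodeBlockTxs(height int64, txsB64 []string) ([][]byte, error) {
	txs := make([][]byte, 0, len(txsB64))
	for idx, txBase64 := range txsB64 {
		raw, decErr := base64.StdEncoding.DecodeString(txBase64)
		if decErr != nil {
			return nil, fmt.Errorf("failed to decode tx at height=%d index=%d: %w", height, idx, decErr)
		}
		txs = append(txs, raw)
	}
	return txs, nil
}

func main() {
	_, err := decodeBlockTxs(42, []string{"aGVsbG8=", "%%%not-base64%%%"})
	fmt.Println(err != nil) // true: the bad entry surfaces instead of being skipped
}
```

Returning the wrapped error keeps tx counts honest, since callers cannot miscount a block whose payload failed to decode.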
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 886-892: The code now always sets memiavl.enable explicitly;
ensure the call to setTOMLValue writes the string value derived from
c.opts.MemIAVL by keeping memIAVLEnable and the if block that sets it based on
c.opts.MemIAVL, then pass memIAVLEnable to setTOMLValue(appPath, "memiavl",
"enable", memIAVLEnable) so memiavl.enable is unambiguous regardless of upstream
defaults (verify in the function where memIAVLEnable, c.opts.MemIAVL, appPath
and setTOMLValue are used).
- Around line 332-336: The loop already fails fast on decode errors and includes
contextual info; no change needed—keep the current base64Decode handling inside
the for loop that returns fmt.Errorf("failed to decode tx at height=%d index=%d:
%w", height, idx, decErr) so failures surface with height/idx context.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
- integration-tests/e2e/benchmark/collector.go
- integration-tests/e2e/cluster/cluster.go
🚧 Files skipped from review as they are similar to previous changes (1)
- integration-tests/e2e/benchmark/collector.go
Actionable comments posted: 1
♻️ Duplicate comments (2)
integration-tests/e2e/cluster/cluster.go (1)
886-890: ⚠️ Potential issue | 🟠 Major

Explicitly set `memiavl.enable` for both true and false paths.

At line 886, only the `true` path is written. When `c.opts.MemIAVL` is false, config falls back to upstream defaults, which can skew baseline vs. variant comparisons.

🔧 Proposed fix
```diff
-	if c.opts.MemIAVL {
-		if err := setTOMLValue(appPath, "memiavl", "enable", "true"); err != nil {
-			return err
-		}
-	}
+	memIAVLEnable := "false"
+	if c.opts.MemIAVL {
+		memIAVLEnable = "true"
+	}
+	if err := setTOMLValue(appPath, "memiavl", "enable", memIAVLEnable); err != nil {
+		return err
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 886 - 890, the code only sets memiavl.enable when c.opts.MemIAVL is true, causing defaults to leak when false; update the conditional around c.opts.MemIAVL to explicitly call setTOMLValue(appPath, "memiavl", "enable", "true") when true and call setTOMLValue(appPath, "memiavl", "enable", "false") when false (preserve error handling/returns), referencing c.opts.MemIAVL, setTOMLValue and appPath so the configuration is deterministic for both branches.

integration-tests/e2e/benchmark/collector.go (1)
160-176: ⚠️ Potential issue | 🟠 Major

TPS window is still keyed off total block txs, not benchmark-matched txs.

Line 163 checks `bs.TxCount`, which includes non-benchmark traffic. That can shift first/last timestamps and distort TPS for this benchmark run.

🔧 Proposed fix

```diff
-	// use span from first to last block that actually included benchmark txs
-	var firstTxTime, lastTxTime time.Time
-	for _, bs := range blockStats {
-		if bs.TxCount == 0 {
-			continue
-		}
-		if firstTxTime.IsZero() {
-			firstTxTime = bs.Time
-		}
-		lastTxTime = bs.Time
-	}
+	// use span from first to last block that included matched benchmark txs
+	var firstTxTime, lastTxTime time.Time
+	for h := startHeight; h <= endHeight; h++ {
+		block, err := cluster.QueryBlock(ctx, 0, h)
+		if err != nil {
+			return BenchResult{}, fmt.Errorf("query block %d: %w", h, err)
+		}
+
+		matched := 0
+		for _, txHash := range block.TxHashes {
+			if _, ok := submitMap[txHash]; ok {
+				matched++
+			}
+		}
+		if matched == 0 {
+			continue
+		}
+		if firstTxTime.IsZero() {
+			firstTxTime = block.BlockTime
+		}
+		lastTxTime = block.BlockTime
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/collector.go` around lines 160 - 176, The TPS window selection currently uses bs.TxCount (which includes non-benchmark traffic) to find firstTxTime/lastTxTime; change the check to use the count of benchmark-matched transactions for each block instead (e.g., compute matchedCount from a field like bs.MatchedTxCount or by filtering bs.TxResults for the benchmark marker/flag), skip blocks when matchedCount == 0, and use those matched blocks to set firstTxTime/lastTxTime and compute blockDuration/totalDuration (update the condition that referenced bs.TxCount and any related logic in the loop that sets firstTxTime, lastTxTime, blockDuration and totalDuration).
🧹 Nitpick comments (1)
integration-tests/e2e/benchmark/report.go (1)
118-137: Consider sorting loaded results for deterministic reporting output.

`os.ReadDir` ordering is filesystem-dependent; sorting `results` (or filenames) before returning will keep table ordering stable across environments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/report.go` around lines 118 - 137, The code relies on os.ReadDir(dir) ordering which is filesystem-dependent; make the output deterministic by sorting the entries slice by entry.Name() right after calling os.ReadDir (before the for _, entry := range entries loop) using sort.Slice or sort.SliceStable, so the subsequent reads/unmarshals produce a stable results order (refer to the entries variable and the loop that reads files into results).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/e2e/benchmark/report.go`:
- Around line 88-92: The check uses only r.Config.Variant so it hides deltas for
any row sharing that Variant; instead detect the actual baseline row itself.
Replace the condition at the block that sets tpsDelta/p50Delta/p95Delta so it
compares the selected row to the baseline (e.g., r == baseline) or compares the
full config identity (e.g., reflect.DeepEqual(r.Config, baseline.Config) or
compare a unique ID field) rather than r.Config.Variant; keep the rest of the
delta assignment logic unchanged and only set "-" when the row equals the
baseline.
---
Duplicate comments:
In `@integration-tests/e2e/benchmark/collector.go`:
- Around line 160-176: The TPS window selection currently uses bs.TxCount (which
includes non-benchmark traffic) to find firstTxTime/lastTxTime; change the check
to use the count of benchmark-matched transactions for each block instead (e.g.,
compute matchedCount from a field like bs.MatchedTxCount or by filtering
bs.TxResults for the benchmark marker/flag), skip blocks when matchedCount == 0,
and use those matched blocks to set firstTxTime/lastTxTime and compute
blockDuration/totalDuration (update the condition that referenced bs.TxCount and
any related logic in the loop that sets firstTxTime, lastTxTime, blockDuration
and totalDuration).
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 886-890: The code only sets memiavl.enable when c.opts.MemIAVL is
true, causing defaults to leak when false; update the conditional around
c.opts.MemIAVL to explicitly call setTOMLValue(appPath, "memiavl", "enable",
"true") when true and call setTOMLValue(appPath, "memiavl", "enable", "false")
when false (preserve error handling/returns), referencing c.opts.MemIAVL,
setTOMLValue and appPath so the configuration is deterministic for both
branches.
---
Nitpick comments:
In `@integration-tests/e2e/benchmark/report.go`:
- Around line 118-137: The code relies on os.ReadDir(dir) ordering which is
filesystem-dependent; make the output deterministic by sorting the entries slice
by entry.Name() right after calling os.ReadDir (before the for _, entry := range
entries loop) using sort.Slice or sort.SliceStable, so the subsequent
reads/unmarshals produce a stable results order (refer to the entries variable
and the loop that reads files into results).
📒 Files selected for processing (4)

- integration-tests/e2e/benchmark/collector.go
- integration-tests/e2e/benchmark/load.go
- integration-tests/e2e/benchmark/report.go
- integration-tests/e2e/cluster/cluster.go
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

```text
@@           Coverage Diff           @@
##             main     #481   +/-   ##
=======================================
  Coverage   37.99%   37.99%
=======================================
  Files         325      325
  Lines       30762    30762
=======================================
  Hits        11689    11689
  Misses      17189    17189
  Partials     1884     1884
```
Actionable comments posted: 2
♻️ Duplicate comments (2)
integration-tests/e2e/cluster/cluster.go (1)
938-942: ⚠️ Potential issue | 🟠 Major

`memiavl.enable` is still only set when `MemIAVL` is true — baseline runs rely on upstream defaults.

The code only writes `memiavl.enable=true` when the flag is set, but never explicitly writes `false`. For benchmark correctness, both variants must be deterministically configured. If a node's default `app.toml` ships with `memiavl.enable=true`, baseline and mempool-only runs would silently use MemIAVL, invalidating comparisons.

Proposed fix
```diff
-	if c.opts.MemIAVL {
-		if err := setTOMLValue(appPath, "memiavl", "enable", "true"); err != nil {
-			return err
-		}
-	}
+	memIAVLEnable := "false"
+	if c.opts.MemIAVL {
+		memIAVLEnable = "true"
+	}
+	if err := setTOMLValue(appPath, "memiavl", "enable", memIAVLEnable); err != nil {
+		return err
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 938 - 942, The code only sets memiavl.enable when c.opts.MemIAVL is true, so baseline runs may inherit an upstream default; change the block so setTOMLValue(appPath, "memiavl", "enable", ...) is always called with "true" if c.opts.MemIAVL else "false" (i.e., unconditionally write the explicit setting), and propagate/return any error from setTOMLValue; update the branch around c.opts.MemIAVL to call setTOMLValue regardless of its value.

integration-tests/e2e/benchmark/report.go (1)
83-92: ⚠️ Potential issue | 🟡 Minor

Baseline row detection compares `Variant`, not identity — suppresses deltas for all rows sharing the baseline variant.

If multiple rows have `VariantBaseline` (e.g., loaded baseline + a re-run), all of them will show "-" instead of actual deltas. Compare the specific row, not just the variant.

Proposed fix

```diff
-	for _, r := range results {
+	for i := range results {
+		r := results[i]
 		tpsDelta := deltaStr(baseline.TxPerSecond, r.TxPerSecond)
 		p50Delta := deltaStr(baseline.P50LatencyMs, r.P50LatencyMs)
 		p95Delta := deltaStr(baseline.P95LatencyMs, r.P95LatencyMs)
-		if r.Config.Variant == baseline.Config.Variant {
+		if &results[i] == baseline {
 			tpsDelta = "-"
 			p50Delta = "-"
 			p95Delta = "-"
 		}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/report.go` around lines 83 - 92, The code currently suppresses deltas by comparing r.Config.Variant to baseline.Config.Variant, which hides deltas for any row sharing the same variant; change the baseline detection to compare the specific row identity instead (e.g., compare the result object or its unique run ID) so only the exact baseline row sets tpsDelta/p50Delta/p95Delta = "-". Update the conditional in the loop that references results, baseline, and r.Config.Variant to use r == baseline or r.RunID/ResultID equality (whichever unique identifier exists) to locate the baseline row.
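The identity-based baseline check can be illustrated outside the harness with a minimal, self-contained sketch. The `row` type, field names, and numbers below are hypothetical stand-ins, not the real `BenchResult`; the point is only that suppression keys on the row's index, not its variant label:

```go
package main

import "fmt"

type row struct {
	Variant string
	TPS     float64
}

// deltas sketches the identity-based baseline check: only the baseline
// row itself is suppressed, not every row that happens to share its
// Variant label.
func deltas(rows []row, baseIdx int) []string {
	out := make([]string, len(rows))
	base := rows[baseIdx].TPS
	for i, r := range rows {
		if i == baseIdx { // compare row identity, not r.Variant
			out[i] = "-"
			continue
		}
		out[i] = fmt.Sprintf("%+.1f%%", (r.TPS-base)/base*100)
	}
	return out
}

func main() {
	rows := []row{{"baseline", 100}, {"baseline", 110}, {"memiavl", 150}}
	// the second "baseline" row still gets a real delta
	fmt.Println(deltas(rows, 0))
}
```

With a re-run baseline row in the slice, only index 0 prints "-"; the duplicate-variant row keeps its computed delta.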
🧹 Nitpick comments (5)
integration-tests/e2e/benchmark/load.go (2)
346-357: `sequencePattern` still produces invalid sequences for `count < 3`, but the caller guards against it.

The guard in `OutOfOrderLoad` (lines 160-162) ensures `count >= 3`, so this is safe in practice. Adding a defensive comment or panic inside `sequencePattern` itself for `count < 3` would prevent future misuse if someone calls it directly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/load.go` around lines 346 - 357, The sequencePattern function can return invalid slices when called with count < 3; add a defensive check at the start of sequencePattern to validate count (e.g., panic or return an explicit error/empty slice) and document the requirement in a comment; reference the function name sequencePattern and note the existing caller OutOfOrderLoad enforces count >= 3 but sequencePattern should enforce or clearly document this precondition to avoid future misuse.
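A standalone sketch of the defensive-guard idea: the function body below is illustrative (the real `sequencePattern` computes a different shuffle); what carries over is rejecting `count < 3` at the callee instead of relying on `OutOfOrderLoad` alone:

```go
package main

import "fmt"

// sequencePattern is a hypothetical stand-in showing the suggested
// guard: fail loudly for count < 3 rather than returning an invalid
// out-of-order sequence.
func sequencePattern(count int) ([]int, error) {
	if count < 3 {
		return nil, fmt.Errorf("sequencePattern requires count >= 3, got %d", count)
	}
	// out-of-order sketch: swap the first two sequence numbers
	seq := make([]int, count)
	for i := range seq {
		seq[i] = i
	}
	seq[0], seq[1] = seq[1], seq[0]
	return seq, nil
}

func main() {
	if _, err := sequencePattern(2); err != nil {
		fmt.Println(err) // direct misuse now surfaces immediately
	}
	seq, _ := sequencePattern(4)
	fmt.Println(seq)
}
```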
30-88: Significant structural duplication across load functions.
`BurstLoad`, `SequentialLoad`, and `SingleNodeLoad` share ~80% of their body (NodeCount guard, mutex/wg setup, goroutine-per-account with ctx check, submission recording). The only meaningful differences are how `viaNode` is computed and whether submissions are sequential or concurrent within an account. Consider extracting a common `runLoad` helper parameterized by a `viaNodeFn` and concurrency mode to reduce maintenance burden.

Not blocking for this PR — the duplication is manageable at the current scale.
Also applies to: 94-152, 220-277
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/load.go` around lines 30 - 88, BurstLoad, SequentialLoad, and SingleNodeLoad duplicate most logic (NodeCount guard, mutex/wg setup, goroutine-per-account, ctx check, submission recording); refactor by extracting a shared helper runLoad that takes a viaNodeFn (func(i int) int) and a concurrency mode (sequential vs concurrent within an account), then have BurstLoad/SequentialLoad/SingleNodeLoad call runLoad with the appropriate viaNodeFn and mode; ensure runLoad performs the NodeCount check, sets up mu/wg, iterates cluster.AccountNames(), computes seq using meta.Sequence+uint64(i), calls cluster.SendBankTxWithSequence, and appends TxSubmission/Errors to result so callers keep the same behavior and signatures.integration-tests/e2e/cluster/cluster.go (2)
1223-1230: Helpers are fine, butbase64Decodeis a trivial one-liner wrapper.
base64Decodejust callsbase64.StdEncoding.DecodeString. Inlining the call at the one usage site inQueryBlock(line 333) would be simpler. Minor nit — not blocking.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 1223 - 1230, The helper base64Decode is a trivial one-liner wrapper around base64.StdEncoding.DecodeString; remove the base64Decode function and replace its single usage in QueryBlock with a direct call to base64.StdEncoding.DecodeString, updating any error handling there accordingly (ensure QueryBlock imports encoding/base64 is available and adjust variable names if needed).
628-678: `SendMoveExecuteJSONWithGas` is largely duplicated from `MoveExecuteJSONWithSequence`.

The only difference is that gas estimation is skipped and `gasLimit` is passed directly. Consider extracting the shared CLI-invocation logic into a private helper to reduce the ~50 lines of duplication. Not blocking, but worth addressing to keep maintenance manageable as the CLI surface grows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 628 - 678, SendMoveExecuteJSONWithGas duplicates most of MoveExecuteJSONWithSequence; factor the shared CLI invocation, JSON marshalling, c.exec call, parseTxResultFromOutput and logging into a private helper (e.g., execMoveExecuteJSON or sendMoveExecuteJSONCommon) that accepts parameters for moduleAddress, moduleName, functionName, typeArgs, args, fromName, accountNumber, sequence, gasLimit, viaNode and any flags like "--offline" vs estimation behavior; have both SendMoveExecuteJSONWithGas and MoveExecuteJSONWithSequence call that helper (preserving their public signatures) and only keep the differing gas-handling logic in each caller. Ensure the new helper uses c.exec, parseTxResultFromOutput and the same log format as in the original functions.

integration-tests/e2e/benchmark/README.md (1)
103-111: Fenced code block missing language identifier.

Static analysis (MD040) flags the code block at line 103. Adding `text` as the language identifier would satisfy the linter.

Proposed fix

````diff
-```
+```text
 benchmark/
   config.go  Variant definitions, BenchConfig, preset constructors
 ...
````
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 103 - 111, The fenced code block starting with the benchmark/ listing is missing a language identifier and triggers MD040; update the fence that opens the block (the triple backticks before "benchmark/") to include the language identifier "text" (i.e., change ``` to ```text) so the linter accepts it while preserving the block contents (the section that lists config.go, load.go, collector.go, report.go, benchmark_test.go, results/).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/e2e/benchmark/benchmark_test.go`:
- Around line 31-47: setupCluster currently starts the cluster with
cluster.Start() but does not register a cleanup, so if require.NoError(t,
cluster.WaitForReady(...)) fails the test will Goexit and leave node processes
running; fix by calling t.Cleanup(func(){ cluster.Close(context.Background()) })
(or t.Cleanup(func(){ _ = cluster.Close(ctx) }) if context is available)
immediately after successful e2e.NewCluster and before calling
cluster.Start/WaitForReady, ensuring Close is always invoked on test teardown,
and then remove manual defer cluster.Close() calls from runBenchmark,
TestBenchmarkMemIAVLMoveExec, and TestBenchmarkGossipPropagation.
In `@integration-tests/e2e/benchmark/report.go`:
- Around line 12-28: WriteResult can produce a hidden ".json" file when
result.Config.Label is empty; after computing safe :=
strings.NewReplacer(...).Replace(result.Config.Label) add a guard that if safe
== "" you assign a deterministic fallback (e.g. "result-"+timestamp or
"result-"+UnixNano) to avoid collisions, then continue to build path using that
safe value; reference symbols: function WriteResult, variable safe, and
result.Config.Label to locate and update the logic.
---
Duplicate comments:
In `@integration-tests/e2e/benchmark/report.go`:
- Around line 83-92: The code currently suppresses deltas by comparing
r.Config.Variant to baseline.Config.Variant, which hides deltas for any row
sharing the same variant; change the baseline detection to compare the specific
row identity instead (e.g., compare the result object or its unique run ID) so
only the exact baseline row sets tpsDelta/p50Delta/p95Delta = "-". Update the
conditional in the loop that references results, baseline, and r.Config.Variant
to use r == baseline or r.RunID/ResultID equality (whichever unique identifier
exists) to locate the baseline row.
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 938-942: The code only sets memiavl.enable when c.opts.MemIAVL is
true, so baseline runs may inherit an upstream default; change the block so
setTOMLValue(appPath, "memiavl", "enable", ...) is always called with "true" if
c.opts.MemIAVL else "false" (i.e., unconditionally write the explicit setting),
and propagate/return any error from setTOMLValue; update the branch around
c.opts.MemIAVL to call setTOMLValue regardless of its value.
---
Nitpick comments:
In `@integration-tests/e2e/benchmark/load.go`:
- Around line 346-357: The sequencePattern function can return invalid slices
when called with count < 3; add a defensive check at the start of
sequencePattern to validate count (e.g., panic or return an explicit error/empty
slice) and document the requirement in a comment; reference the function name
sequencePattern and note the existing caller OutOfOrderLoad enforces count >= 3
but sequencePattern should enforce or clearly document this precondition to
avoid future misuse.
- Around line 30-88: BurstLoad, SequentialLoad, and SingleNodeLoad duplicate
most logic (NodeCount guard, mutex/wg setup, goroutine-per-account, ctx check,
submission recording); refactor by extracting a shared helper runLoad that takes
a viaNodeFn (func(i int) int) and a concurrency mode (sequential vs concurrent
within an account), then have BurstLoad/SequentialLoad/SingleNodeLoad call
runLoad with the appropriate viaNodeFn and mode; ensure runLoad performs the
NodeCount check, sets up mu/wg, iterates cluster.AccountNames(), computes seq
using meta.Sequence+uint64(i), calls cluster.SendBankTxWithSequence, and appends
TxSubmission/Errors to result so callers keep the same behavior and signatures.
In `@integration-tests/e2e/benchmark/README.md`:
- Around line 103-111: The fenced code block starting with the benchmark/
listing is missing a language identifier and triggers MD040; update the fence
that opens the block (the triple backticks before "benchmark/") to include the
language identifier "text" (i.e., change ``` to ```text) so the linter accepts
it while preserving the block contents (the section that lists config.go,
load.go, collector.go, report.go, benchmark_test.go, results/).
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 1223-1230: The helper base64Decode is a trivial one-liner wrapper
around base64.StdEncoding.DecodeString; remove the base64Decode function and
replace its single usage in QueryBlock with a direct call to
base64.StdEncoding.DecodeString, updating any error handling there accordingly
(ensure QueryBlock imports encoding/base64 is available and adjust variable
names if needed).
- Around line 628-678: SendMoveExecuteJSONWithGas duplicates most of
MoveExecuteJSONWithSequence; factor the shared CLI invocation, JSON marshalling,
c.exec call, parseTxResultFromOutput and logging into a private helper (e.g.,
execMoveExecuteJSON or sendMoveExecuteJSONCommon) that accepts parameters for
moduleAddress,moduleName,functionName,typeArgs,args,fromName,accountNumber,sequence,gasLimit,viaNode
and any flags like "--offline" vs estimation behavior; have both
SendMoveExecuteJSONWithGas and MoveExecuteJSONWithSequence call that helper
(preserving their public signatures) and only keep the differing gas-handling
logic in each caller. Ensure the new helper uses c.exec, parseTxResultFromOutput
and the same log format as in the original functions.
📒 Files selected for processing (5)

- integration-tests/e2e/benchmark/README.md
- integration-tests/e2e/benchmark/benchmark_test.go
- integration-tests/e2e/benchmark/load.go
- integration-tests/e2e/benchmark/report.go
- integration-tests/e2e/cluster/cluster.go
# Conflicts: # Makefile
|
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 4
♻️ Duplicate comments (2)
integration-tests/e2e/cluster/cluster.go (1)
1172-1176: ⚠️ Potential issue | 🟠 Major

Set `memiavl.enable` explicitly for both benchmark variants.

Only writing `"true"` leaves the `"false"` case dependent on upstream defaults, which can invalidate baseline vs variant comparisons.

💡 Proposed fix

```diff
-	if c.opts.MemIAVL {
-		if err := setTOMLValue(appPath, "memiavl", "enable", "true"); err != nil {
-			return err
-		}
+	memIAVLEnable := "false"
+	if c.opts.MemIAVL {
+		memIAVLEnable = "true"
+	}
+	if err := setTOMLValue(appPath, "memiavl", "enable", memIAVLEnable); err != nil {
+		return err
 	}
```
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 1172 - 1176, The current code only sets memiavl.enable to "true" when c.opts.MemIAVL is set, leaving the "false" case to upstream defaults; update the logic around the setTOMLValue call so that memiavl.enable is explicitly written for both cases (use setTOMLValue(appPath, "memiavl", "enable", "true") when c.opts.MemIAVL is true, and setTOMLValue(appPath, "memiavl", "enable", "false") when it is false) so both benchmark variants have an explicit memiavl.enable value; keep using the existing appPath and setTOMLValue helper.

integration-tests/e2e/benchmark/benchmark_test.go (1)
31-49: ⚠️ Potential issue | 🟠 Major

Register cleanup inside `setupCluster` to prevent orphaned node processes on setup failure.

If `Start` succeeds but `WaitForReady` fails, `require.NoError` exits before callers can run `defer cluster.Close()`.

💡 Proposed fix

```diff
 func setupCluster(t *testing.T, ctx context.Context, cfg BenchConfig) *e2e.Cluster {
 	t.Helper()
 	cluster, err := e2e.NewCluster(ctx, t, e2e.ClusterOptions{ ... })
 	require.NoError(t, err)
+	t.Cleanup(cluster.Close)
 	require.NoError(t, cluster.Start(ctx))
 	require.NoError(t, cluster.WaitForReady(ctx, clusterReadyTimeout))
 	return cluster
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/benchmark_test.go` around lines 31 - 49, After successfully starting the cluster in setupCluster, register a cleanup to always close the cluster so orphaned processes aren't left if WaitForReady (or any later require) fails: after require.NoError(t, cluster.Start(ctx)) call t.Cleanup with a function that calls cluster.Close(ctx) (or cluster.Close and ignore its error) so the cluster is cleaned up even when setupCluster exits early; ensure this registration happens before calling cluster.WaitForReady.
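Why this ordering works can be shown with a tiny model of `t.Cleanup` semantics: callbacks run LIFO once the test function returns, even when setup aborts partway. The `cleanupStack` type below is a stand-in, not the real `testing.T`:

```go
package main

import "fmt"

// cleanupStack mimics t.Cleanup: registered callbacks run LIFO at test
// teardown, which is why registering cluster.Close right after a
// successful NewCluster (and before WaitForReady) prevents orphaned
// node processes.
type cleanupStack struct{ fns []func() }

func (s *cleanupStack) Cleanup(fn func()) { s.fns = append(s.fns, fn) }

func (s *cleanupStack) run() {
	for i := len(s.fns) - 1; i >= 0; i-- {
		s.fns[i]()
	}
}

func main() {
	t := &cleanupStack{}
	t.Cleanup(func() { fmt.Println("cluster.Close()") })
	defer t.run()
	// Simulate WaitForReady failing: setup exits early, cleanups still fire.
	fmt.Println("WaitForReady failed")
}
```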
🧹 Nitpick comments (1)
integration-tests/e2e/cluster/cluster.go (1)
366-367: Return a copy from `ValidatorAddresses` to protect internal state.

Returning the backing slice allows external mutation of cluster internals.

💡 Proposed fix

```diff
 func (c *Cluster) ValidatorAddresses() []string {
-	return c.valAddresses
+	return append([]string(nil), c.valAddresses...)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 366 - 367, The ValidatorAddresses method currently returns the backing slice c.valAddresses allowing callers to mutate internal state; change Cluster.ValidatorAddresses to return a defensive copy by allocating a new slice of the same length, copying c.valAddresses into it (using copy) and returning that new slice so the original c.valAddresses remains unmodified.
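The defensive-copy pattern is easy to demonstrate in isolation. The `cluster` type below is a stand-in for the real `Cluster`; only the copy-before-return idea carries over:

```go
package main

import "fmt"

// cluster is a stand-in for the real Cluster type.
type cluster struct{ valAddresses []string }

// ValidatorAddresses returns a copy so callers cannot mutate the
// cluster's internal slice.
func (c *cluster) ValidatorAddresses() []string {
	out := make([]string, len(c.valAddresses))
	copy(out, c.valAddresses)
	return out
}

func main() {
	c := &cluster{valAddresses: []string{"val0", "val1"}}
	addrs := c.ValidatorAddresses()
	addrs[0] = "mutated"           // only the copy changes
	fmt.Println(c.valAddresses[0]) // prints val0: internal state untouched
}
```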
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/e2e/benchmark/load.go`:
- Around line 612-621: The pre-sign loop that calls GenerateSignedBankTx keeps
running after context cancellation; modify the loop(s) (the one iterating i <
cfg.TxPerAccount and the similar loop at the other location) to check ctx.Done()
each iteration and break/return immediately when the context is canceled (e.g.,
use select { case <-ctx.Done(): break/return; default: } before calling
GenerateSignedBankTx) so no further work or logging happens after cancellation.
In `@integration-tests/e2e/benchmark/README.md`:
- Around line 107-113: The table is missing the "Included" column referenced by
the sentence about tx drops; update the table header and each row in the block
containing "Config | Variant | TPS ..." so it matches other tables (add an
"Included" column between "Peak Mempool" and the rest or where consistent), then
populate the column values (e.g., set the clist/iavl/seq-move-exec row to show
"2705/5000" to support the claim and add corresponding included counts for
proxy+priority and proxy+priority/memiavl rows) so the text "CList only included
2705/5000 txs..." is backed by the table. Ensure column alignment and any header
separators match the existing markdown table format used elsewhere in the
README.
In `@integration-tests/e2e/cluster/cluster.go`:
- Line 1119: The call to os.WriteFile that writes genesis data using
os.WriteFile(genesisPath, out, 0o644) is flagged by lint (G306) for overly
permissive file mode; change the mode argument to 0o600 (i.e.,
os.WriteFile(genesisPath, out, 0o600)) to restrict permissions, or if
intentional, add a clear `#nosec` comment next to the os.WriteFile call to justify
the exception; update the call referencing genesisPath and out accordingly.
- Around line 1102-1113: The patch that sets max_gas silently no-ops if neither
genesis["consensus"] nor genesis["consensus_params"] exist; update the logic in
cluster.go to detect whether max_gas was actually applied and fail loudly if
not: introduce a boolean (e.g., modified) and set it to true when you assign
block["max_gas"] in either the consensus or consensus_params branches, then
after those checks return an error or call the test failure helper (so the test
run fails) if modified is false, referencing the existing genesis map and the
"consensus"/"consensus_params" keys and the block["max_gas"] assignment spots to
locate where to add the flag and the failure path.
---
Duplicate comments:
In `@integration-tests/e2e/benchmark/benchmark_test.go`:
- Around line 31-49: After successfully starting the cluster in setupCluster,
register a cleanup to always close the cluster so orphaned processes aren't left
if WaitForReady (or any later require) fails: after require.NoError(t,
cluster.Start(ctx)) call t.Cleanup with a function that calls cluster.Close(ctx)
(or cluster.Close and ignore its error) so the cluster is cleaned up even when
setupCluster exits early; ensure this registration happens before calling
cluster.WaitForReady.
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 1172-1176: The current code only sets memiavl.enable to "true"
when c.opts.MemIAVL is set, leaving the "false" case to upstream defaults;
update the logic around the setTOMLValue call so that memiavl.enable is
explicitly written for both cases (use setTOMLValue(appPath, "memiavl",
"enable", "true") when c.opts.MemIAVL is true, and setTOMLValue(appPath,
"memiavl", "enable", "false") when it is false) so both benchmark variants have
an explicit memiavl.enable value; keep using the existing appPath and
setTOMLValue helper.
---
Nitpick comments:
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 366-367: The ValidatorAddresses method currently returns the
backing slice c.valAddresses allowing callers to mutate internal state; change
Cluster.ValidatorAddresses to return a defensive copy by allocating a new slice
of the same length, copying c.valAddresses into it (using copy) and returning
that new slice so the original c.valAddresses remains unmodified.
ℹ️ Review info
⚙️ Run configuration
Run ID: ecc99e73-b5e1-4eb4-93cb-98f0c146a704
⛔ Files ignored due to path filters (1)

- `integration-tests/e2e/benchmark/move-bench/Move.toml` is excluded by `!**/*.toml`
📒 Files selected for processing (8)

- Makefile
- integration-tests/e2e/benchmark/README.md
- integration-tests/e2e/benchmark/benchmark_test.go
- integration-tests/e2e/benchmark/config.go
- integration-tests/e2e/benchmark/load.go
- integration-tests/e2e/benchmark/move-bench/sources/BenchHeavyState.move
- integration-tests/e2e/cluster.go
- integration-tests/e2e/cluster/cluster.go
🚧 Files skipped from review as they are similar to previous changes (1)
- Makefile
♻️ Duplicate comments (2)
integration-tests/e2e/cluster/cluster.go (2)
1172-1176: ⚠️ Potential issue | 🟠 Major

Set `memiavl.enable` deterministically for both variants.

At line 1172, config is only written when enabled. For benchmark parity, write explicit `"false"` when `MemIAVL` is off.

💡 Proposed fix

```diff
-	if c.opts.MemIAVL {
-		if err := setTOMLValue(appPath, "memiavl", "enable", "true"); err != nil {
-			return err
-		}
-	}
+	memIAVLEnable := "false"
+	if c.opts.MemIAVL {
+		memIAVLEnable = "true"
+	}
+	if err := setTOMLValue(appPath, "memiavl", "enable", memIAVLEnable); err != nil {
+		return err
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 1172 - 1176, The code only calls setTOMLValue(appPath, "memiavl", "enable", "true") when c.opts.MemIAVL is true, so the TOML key is absent for the false variant; always write the key explicitly by calling setTOMLValue(appPath, "memiavl", "enable", "true") when c.opts.MemIAVL is true and setTOMLValue(appPath, "memiavl", "enable", "false") when c.opts.MemIAVL is false (use the existing c.opts.MemIAVL check to choose which literal to write), ensuring deterministic memiavl.enable across both variants.
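The always-write-the-key pattern can be sketched without the cluster code. The map below stands in for the real `setTOMLValue`/`app.toml` plumbing; `strconv.FormatBool` keeps the true/false string conversion in one place:

```go
package main

import (
	"fmt"
	"strconv"
)

// writeMemIAVL sketches the deterministic-config fix: always write the
// key so both benchmark variants are explicit, instead of letting the
// false case fall back to upstream defaults.
func writeMemIAVL(settings map[string]string, enabled bool) {
	settings["memiavl.enable"] = strconv.FormatBool(enabled)
}

func main() {
	cfg := map[string]string{}
	writeMemIAVL(cfg, false)
	fmt.Println(cfg["memiavl.enable"]) // prints false: the key is always present
}
```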
1102-1113: ⚠️ Potential issue | 🟠 Major

Fail when `max_gas` is not actually patched.

At lines 1102 and 1109, both candidate paths are optional; if neither exists, this silently succeeds and benchmarks may run with unexpected block gas settings.

💡 Proposed fix

```diff
 func patchGenesisBlockGas(genesisPath string, maxGas int64) error {
 	@@
 	gasStr := strconv.FormatInt(maxGas, 10)
+	patched := false
 	if cp, ok := genesis["consensus"].(map[string]interface{}); ok {
 		if params, ok := cp["params"].(map[string]interface{}); ok {
 			if block, ok := params["block"].(map[string]interface{}); ok {
 				block["max_gas"] = gasStr
+				patched = true
 			}
 		}
 	}
 	if cp, ok := genesis["consensus_params"].(map[string]interface{}); ok {
 		if block, ok := cp["block"].(map[string]interface{}); ok {
 			block["max_gas"] = gasStr
+			patched = true
 		}
 	}
+	if !patched {
+		return fmt.Errorf("max_gas path not found in genesis: %s", genesisPath)
+	}
 	@@
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 1102 - 1113, The current patching logic for genesis["consensus"] and genesis["consensus_params"] may silently do nothing if neither optional path exists; add a check that ensures max_gas was actually set by introducing a boolean (e.g., patched := false) and set it to true inside the branches where block["max_gas"] = gasStr is assigned (referencing cp, params, block, gasStr); after both attempts, if patched is still false return or panic with a clear error indicating neither consensus nor consensus_params contained a block entry to modify so the test fails fast rather than proceeding with unpatched gas.
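A runnable sketch of the patched-flag fix, operating on an in-memory genesis map (file I/O omitted; the nested layout follows the review diff, which is itself an assumption about the genesis shape):

```go
package main

import "fmt"

// patchMaxGas mirrors the suggested fix: track whether any candidate
// genesis path was actually patched and report failure otherwise.
func patchMaxGas(genesis map[string]interface{}, gasStr string) error {
	patched := false
	if cp, ok := genesis["consensus"].(map[string]interface{}); ok {
		if params, ok := cp["params"].(map[string]interface{}); ok {
			if block, ok := params["block"].(map[string]interface{}); ok {
				block["max_gas"] = gasStr
				patched = true
			}
		}
	}
	if cp, ok := genesis["consensus_params"].(map[string]interface{}); ok {
		if block, ok := cp["block"].(map[string]interface{}); ok {
			block["max_gas"] = gasStr
			patched = true
		}
	}
	if !patched {
		return fmt.Errorf("max_gas path not found in genesis")
	}
	return nil
}

func main() {
	g := map[string]interface{}{
		"consensus_params": map[string]interface{}{
			"block": map[string]interface{}{},
		},
	}
	fmt.Println(patchMaxGas(g, "100000000"))                        // nil: path patched
	fmt.Println(patchMaxGas(map[string]interface{}{}, "100000000")) // error: nothing to patch
}
```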
🧹 Nitpick comments (1)
integration-tests/e2e/cluster/cluster.go (1)
366-368: Return a defensive copy fromValidatorAddresses().At Line 367, returning the internal slice allows callers to mutate cluster state accidentally.
♻️ Proposed refactor
```diff
 func (c *Cluster) ValidatorAddresses() []string {
-	return c.valAddresses
+	out := make([]string, len(c.valAddresses))
+	copy(out, c.valAddresses)
+	return out
 }
```
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 366 - 368, ValidatorAddresses currently returns the internal slice c.valAddresses which allows external mutation; change ValidatorAddresses() to return a defensive copy of the slice (allocate a new slice of len(c.valAddresses), copy the contents, and return the new slice) so callers cannot modify the Cluster's internal state.
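The defensive-copy refactor above is the standard Go pattern for slice-returning getters. A minimal self-contained sketch (the `Cluster` type here is a stand-in, not the one in `cluster.go`):

```go
package main

import "fmt"

// Cluster is a stand-in for the e2e cluster type; only the slice field matters here.
type Cluster struct {
	valAddresses []string
}

// ValidatorAddresses returns a defensive copy so callers cannot mutate the
// cluster's internal state through the returned slice.
func (c *Cluster) ValidatorAddresses() []string {
	out := make([]string, len(c.valAddresses))
	copy(out, c.valAddresses)
	return out
}

func main() {
	c := &Cluster{valAddresses: []string{"val0", "val1"}}
	addrs := c.ValidatorAddresses()
	addrs[0] = "mutated"           // only the copy changes
	fmt.Println(c.valAddresses[0]) // val0
}
```

Without the copy, the caller's write to `addrs[0]` would alias the cluster's internal slice and corrupt its state.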
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 1172-1176: The code only calls setTOMLValue(appPath, "memiavl",
"enable", "true") when c.opts.MemIAVL is true, so the TOML key is absent for the
false variant; always write the key explicitly by calling setTOMLValue(appPath,
"memiavl", "enable", "true") when c.opts.MemIAVL is true and
setTOMLValue(appPath, "memiavl", "enable", "false") when c.opts.MemIAVL is false
(use the existing c.opts.MemIAVL check to choose which literal to write),
ensuring deterministic memiavl.enable across both variants.
- Around line 1102-1113: The current patching logic for genesis["consensus"] and
genesis["consensus_params"] may silently do nothing if neither optional path
exists; add a check that ensures max_gas was actually set by introducing a
boolean (e.g., patched := false) and set it to true inside the branches where
block["max_gas"] = gasStr is assigned (referencing cp, params, block, gasStr);
after both attempts, if patched is still false return or panic with a clear
error indicating neither consensus nor consensus_params contained a block entry
to modify so the test fails fast rather than proceeding with unpatched gas.
---
Nitpick comments:
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 366-368: ValidatorAddresses currently returns the internal slice
c.valAddresses which allows external mutation; change ValidatorAddresses() to
return a defensive copy of the slice (allocate a new slice of
len(c.valAddresses), copy the contents, and return the new slice) so callers
cannot modify the Cluster's internal state.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: b64c0786-b9c1-4b8b-ae71-0b0344d29487
📒 Files selected for processing (1)
integration-tests/e2e/cluster/cluster.go
🧹 Nitpick comments (4)
integration-tests/e2e/benchmark/benchmark_test.go (3)
287-292: Redundant `defer cluster.Close()` calls throughout Move exec tests.

Similar to `runBenchmark`, these tests have redundant `defer cluster.Close()` calls since `setupCluster` already registers `t.Cleanup(cluster.Close)`. This pattern appears in multiple tests:
- Line 288
- Line 317, 332 (TestBenchmarkSeqComparisonMoveExec subtests)
- Line 360, 375 (TestBenchmarkBurstComparisonMoveExec subtests)
- And several others
The redundancy is harmless because `Close()` is idempotent, but removing these would clean up the code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/benchmark_test.go` around lines 287 - 292, Remove redundant explicit defer calls to cluster.Close() in Move exec benchmark tests because setupCluster already registers t.Cleanup(cluster.Close). For each test using setupCluster (e.g., where variables and functions like setupCluster, setupMoveExecLoad, moveExecSequential, runBenchmarkWithCluster are used), delete the line defer cluster.Close() and rely on setupCluster's t.Cleanup registration to close the cluster; ensure no other callers require an explicit defer and keep the single cleanup mechanism provided by setupCluster.
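The review's claim that the double call is harmless rests on `Close()` being idempotent. One common way to get that guarantee in Go is `sync.Once`; this is a sketch of the idea, not necessarily how `cluster.go` implements it, and the `Cluster` type here is hypothetical:

```go
package main

import (
	"fmt"
	"sync"
)

// Cluster is a hypothetical stand-in. Close is safe to call multiple times,
// so both t.Cleanup(cluster.Close) and a redundant defer are harmless.
type Cluster struct {
	closeOnce sync.Once
	closed    int
}

// Close runs teardown exactly once, no matter how many times it is called.
func (c *Cluster) Close() {
	c.closeOnce.Do(func() {
		c.closed++ // real teardown (stop nodes, remove temp dirs) would go here
	})
}

func main() {
	c := &Cluster{}
	defer c.Close()      // redundant second call: a no-op
	c.Close()            // first call performs the teardown
	fmt.Println(c.closed) // 1
}
```

With this pattern the redundant `defer` costs nothing, which is why the review treats its removal as cleanup rather than a bug fix.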
209-241: Results accumulation across subtests may be unreliable.

The `results` slice is declared in the parent test and appended to from subtests. In Go's `t.Run`, subtests run sequentially by default (unless `t.Parallel()` is called), so this works correctly here. However, if subtests were ever run in parallel, this would be a data race.

The current code is correct, but consider adding a comment noting this dependency:
```go
var results []BenchResult // accumulated sequentially across subtests

// load baseline from JSON...
```
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/benchmark_test.go` around lines 209 - 241, The results slice in TestBenchmarkSeqComparison is appended to from subtests and relies on sequential execution; add a brief comment above the declaration of results (the var results []BenchResult in TestBenchmarkSeqComparison) stating that results are accumulated sequentially across subtests and that running subtests in parallel would cause a data race (so do not call t.Parallel() in these subtests).
107-115: Redundant `defer cluster.Close()` call.

The `setupCluster` function already registers `t.Cleanup(cluster.Close)`, and since `Cluster.Close()` is idempotent, this `defer` is safe but unnecessary.

```diff
 func runBenchmark(t *testing.T, cfg BenchConfig, loadFn func(ctx context.Context, cluster *e2e.Cluster, cfg BenchConfig, metas map[string]e2e.AccountMeta) LoadResult) BenchResult {
 	t.Helper()
 	ctx := context.Background()
 	cluster := setupCluster(t, ctx, cfg)
-	defer cluster.Close()
 	return runBenchmarkWithCluster(t, ctx, cluster, cfg, loadFn)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/benchmark_test.go` around lines 107 - 115, The function runBenchmark contains an unnecessary defer cluster.Close() because setupCluster already registers t.Cleanup(cluster.Close); remove the redundant defer from runBenchmark to avoid duplicate cleanup calls—locate the cluster variable in runBenchmark, delete the line "defer cluster.Close()", and leave the call to runBenchmarkWithCluster(t, ctx, cluster, cfg, loadFn) intact; ensure Cluster.Close remains idempotent and no other cleanup logic is removed.

integration-tests/e2e/cluster/cluster.go (1)
1754-1760: Consider simplifying the base64 round-trip.

The code decodes `txBase64` to `txBytes` then immediately re-encodes it back to base64 for the request body. Since `txBase64` is already base64-encoded, you could use it directly:

```diff
-	txBytes, err := base64.StdEncoding.DecodeString(txBase64)
-	if err != nil {
-		return TxResult{}, fmt.Errorf("decode tx base64: %w", err)
-	}
-
-	reqBody := fmt.Sprintf(`{"jsonrpc":"2.0","id":1,"method":"broadcast_tx_sync","params":{"tx":"%s"}}`, base64.StdEncoding.EncodeToString(txBytes))
+	reqBody := fmt.Sprintf(`{"jsonrpc":"2.0","id":1,"method":"broadcast_tx_sync","params":{"tx":"%s"}}`, txBase64)
```

The current approach does provide implicit validation that the input is valid base64, so keeping it is defensible if that validation is desired.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 1754 - 1760, The code decodes txBase64 to txBytes then immediately re-encodes it into the request body (variables txBase64, txBytes, reqBody); simplify by using the original txBase64 directly in the JSON-RPC payload (i.e., insert txBase64 into fmt.Sprintf) or, if you want to retain validation, decode into txBytes but use the original txBase64 for reqBody instead of re-encoding; update error handling accordingly (remove unnecessary decode/encode or keep decode only for validation) and ensure reqBody uses the validated base64 string.
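The validate-then-reuse variant suggested in the prompt above can be sketched as follows. `buildBroadcastBody` is a hypothetical helper; the JSON-RPC shape mirrors the snippet in the review, not the full cluster API:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// buildBroadcastBody validates txBase64 by decoding it, then embeds the
// original string in the request body instead of re-encoding the bytes.
func buildBroadcastBody(txBase64 string) (string, error) {
	if _, err := base64.StdEncoding.DecodeString(txBase64); err != nil {
		return "", fmt.Errorf("decode tx base64: %w", err)
	}
	return fmt.Sprintf(`{"jsonrpc":"2.0","id":1,"method":"broadcast_tx_sync","params":{"tx":"%s"}}`, txBase64), nil
}

func main() {
	tx := base64.StdEncoding.EncodeToString([]byte("raw-tx"))
	body, err := buildBroadcastBody(tx)
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

This keeps the implicit validation the review wants to preserve while dropping the redundant encode step.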
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@integration-tests/e2e/benchmark/benchmark_test.go`:
- Around line 287-292: Remove redundant explicit defer calls to cluster.Close()
in Move exec benchmark tests because setupCluster already registers
t.Cleanup(cluster.Close). For each test using setupCluster (e.g., where
variables and functions like setupCluster, setupMoveExecLoad,
moveExecSequential, runBenchmarkWithCluster are used), delete the line defer
cluster.Close() and rely on setupCluster's t.Cleanup registration to close the
cluster; ensure no other callers require an explicit defer and keep the single
cleanup mechanism provided by setupCluster.
- Around line 209-241: The results slice in TestBenchmarkSeqComparison is
appended to from subtests and relies on sequential execution; add a brief
comment above the declaration of results (the var results []BenchResult in
TestBenchmarkSeqComparison) stating that results are accumulated sequentially
across subtests and that running subtests in parallel would cause a data race
(so do not call t.Parallel() in these subtests).
- Around line 107-115: The function runBenchmark contains an unnecessary defer
cluster.Close() because setupCluster already registers t.Cleanup(cluster.Close);
remove the redundant defer from runBenchmark to avoid duplicate cleanup
calls—locate the cluster variable in runBenchmark, delete the line "defer
cluster.Close()", and leave the call to runBenchmarkWithCluster(t, ctx, cluster,
cfg, loadFn) intact; ensure Cluster.Close remains idempotent and no other
cleanup logic is removed.
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 1754-1760: The code decodes txBase64 to txBytes then immediately
re-encodes it into the request body (variables txBase64, txBytes, reqBody);
simplify by using the original txBase64 directly in the JSON-RPC payload (i.e.,
insert txBase64 into fmt.Sprintf) or, if you want to retain validation, decode
into txBytes but use the original txBase64 for reqBody instead of re-encoding;
update error handling accordingly (remove unnecessary decode/encode or keep
decode only for validation) and ensure reqBody uses the validated base64 string.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 1f40aba7-def7-43ae-ada7-009175aa768c
📒 Files selected for processing (4)
integration-tests/e2e/benchmark/README.md
integration-tests/e2e/benchmark/benchmark_test.go
integration-tests/e2e/benchmark/report.go
integration-tests/e2e/cluster/cluster.go
Description
Closes: #XXXX
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
`!` in the type prefix if API or client breaking change

Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
I have...