
feat(e2e): add automated performance benchmarks with Playwright#17

Merged
gultyayev merged 5 commits into master from perf-test
Mar 6, 2026

@gultyayev
Owner

No description provided.

Introduces a Playwright-based performance testing infrastructure using
CDP for metrics collection. Four benchmark scenarios (scroll, drag within
list, drag between lists with autoscroll, dynamic-height scroll) run with
4x CPU throttling, 5 iterations each, and produce aggregated statistics
(mean, median, p95, stddev). Results can be compared against a baseline
to detect regressions above a configurable threshold.

Add --output flag to compare.ts so results can be written to a file.
The CI workflow now writes to $GITHUB_STEP_SUMMARY, making the
comparison table visible directly on the Actions run page.

Add perf:report script that formats raw benchmark metrics as a readable
markdown table. The CI workflow now posts results directly to the PR as
a comment (updated on re-runs) and writes to the GitHub Job Summary.
Removed baseline comparison from CI; the report shows current state.
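
For readers unfamiliar with CDP-based collection, here is a minimal sketch of what one measured iteration might look like. It is not the code from this PR. The names `CdpLike`, `readCounters`, and `measureIteration` are hypothetical. The CDP methods (`Performance.enable`, `Performance.getMetrics`, `Emulation.setCPUThrottlingRate`, `HeapProfiler.collectGarbage`) are real protocol commands, but which ones the PR actually uses is an assumption.

```typescript
// Minimal CDP surface used by this sketch (hypothetical interface).
// Playwright's CDPSession has a compatible send(); a cast may be needed
// because its real signature is typed per protocol method.
interface CdpLike {
  send(method: string, params?: Record<string, unknown>): Promise<any>;
}

interface Counters {
  layouts: number;
  styleRecalcs: number;
  heapBytes: number;
}

// Read the cumulative counters that Chromium reports via Performance.getMetrics.
async function readCounters(cdp: CdpLike): Promise<Counters> {
  const { metrics } = await cdp.send('Performance.getMetrics');
  const byName = new Map<string, number>();
  for (const m of metrics as Array<{ name: string; value: number }>) {
    byName.set(m.name, m.value);
  }
  return {
    layouts: byName.get('LayoutCount') ?? 0,
    styleRecalcs: byName.get('RecalcStyleCount') ?? 0,
    heapBytes: byName.get('JSHeapUsedSize') ?? 0,
  };
}

// Run one iteration: throttle the CPU, force a GC so heap deltas start from
// a comparable state, then report counter deltas around the action.
async function measureIteration(
  cdp: CdpLike,
  action: () => Promise<void>,
  cpuRate = 4, // 4x slowdown, matching the report header
): Promise<Counters> {
  await cdp.send('Performance.enable');
  await cdp.send('Emulation.setCPUThrottlingRate', { rate: cpuRate });
  await cdp.send('HeapProfiler.collectGarbage');
  const before = await readCounters(cdp);
  await action();
  const after = await readCounters(cdp);
  return {
    layouts: after.layouts - before.layouts,
    styleRecalcs: after.styleRecalcs - before.styleRecalcs,
    heapBytes: after.heapBytes - before.heapBytes,
  };
}
```

In a Playwright test running on Chromium, the session would typically come from `page.context().newCDPSession(page)`, and `action` would wrap the scroll or drag interaction being benchmarked.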
@github-actions

github-actions Bot commented Mar 6, 2026

Performance Benchmark Results

CPU throttling: 4x · Samples: 5 (1 warmup iteration excluded)

drag-between-lists-autoscroll-1000

| Metric | Max | Mean | Stddev |
| --- | --- | --- | --- |
| Total Blocking Time | 49 ms | 38.8 ms | ±11.3 ms |
| Long Tasks (>50ms) | 2 | 1.8 | ±0.4 |
| Layouts | 38 | 38 | ±0 |
| Style Recalcs | 70 | 69.2 | ±0.4 |
| Avg Frame Time | 16.8 ms | 16.7 ms | ±0 ms |
| Max Frame Gap | 27.5 ms | 26.3 ms | ±0.9 ms |
| Dropped Frames (>16.7ms) | 105 | 94.4 | ±6.3 |
| p99 Frame Time | 24 ms | 20.6 ms | ±1.9 ms |
| Heap Delta | 1302.9 KB | 1059.5 KB | ±330 KB |

drag-within-list-1000

| Metric | Max | Mean | Stddev |
| --- | --- | --- | --- |
| Total Blocking Time | 33 ms | 28 ms | ±6 ms |
| Long Tasks (>50ms) | 2 | 1.2 | ±0.4 |
| Layouts | 5 | 5 | ±0 |
| Style Recalcs | 46 | 44.4 | ±0.9 |
| Avg Frame Time | 16.9 ms | 16.7 ms | ±0.3 ms |
| Max Frame Gap | 29.3 ms | 26.8 ms | ±1.5 ms |
| Dropped Frames (>16.7ms) | 19 | 16.4 | ±1.5 |
| p99 Frame Time | 29.3 ms | 26.8 ms | ±1.5 ms |
| Heap Delta | 422.4 KB | 418.9 KB | ±2.7 KB |

dynamic-height-scroll

| Metric | Max | Mean | Stddev |
| --- | --- | --- | --- |
| Total Blocking Time | 225 ms | 225 ms | ±0 ms |
| Long Tasks (>50ms) | 3 | 3 | ±0 |
| Layouts | 80 | 77.2 | ±1.8 |
| Style Recalcs | 177 | 175.4 | ±1.9 |
| Avg Frame Time | 16.6 ms | 16.6 ms | ±0 ms |
| Max Frame Gap | 24.5 ms | 22.4 ms | ±1.3 ms |
| Dropped Frames (>16.7ms) | 64 | 59 | ±3.4 |
| p99 Frame Time | 24.2 ms | 22.1 ms | ±1.2 ms |
| Heap Delta | 534.5 KB | 430.7 KB | ±123.3 KB |

scroll-2000-items

| Metric | Max | Mean | Stddev |
| --- | --- | --- | --- |
| Total Blocking Time | 185 ms | 185 ms | ±0 ms |
| Long Tasks (>50ms) | 2 | 2 | ±0 |
| Layouts | 59 | 58.4 | ±0.5 |
| Style Recalcs | 59 | 58.4 | ±0.5 |
| Avg Frame Time | 16.6 ms | 16.6 ms | ±0 ms |
| Max Frame Gap | 25.3 ms | 23.9 ms | ±1.8 ms |
| Dropped Frames (>16.7ms) | 61 | 60.2 | ±0.4 |
| p99 Frame Time | 21.9 ms | 21.6 ms | ±0.4 ms |
| Heap Delta | 4779.6 KB | 3830.5 KB | ±1449.3 KB |

Performance Comparison (threshold: 25%)

| Scenario | Metric | Baseline p95 | Current p95 | Change |
| --- | --- | --- | --- | --- |
| drag-between-lists-autoscroll-1000 | totalBlockingTime | 61.0 | 49.0 | -19.7% |
| drag-between-lists-autoscroll-1000 | longTaskCount | 2.0 | 2.0 | +0.0% |
| drag-between-lists-autoscroll-1000 | layoutCount | 38.0 | 38.0 | +0.0% |
| drag-between-lists-autoscroll-1000 | recalcStyleCount | 71.0 | 70.0 | -1.4% |
| drag-between-lists-autoscroll-1000 | avgFrameTime | 16.8 | 16.8 | -0.0% |
| drag-between-lists-autoscroll-1000 | maxFrameGap | 29.8 | 27.5 | -7.7% |
| drag-between-lists-autoscroll-1000 | droppedFrames | 107.0 | 105.0 | -1.9% |
| drag-between-lists-autoscroll-1000 | p99FrameTime | 22.3 | 24.0 | +7.6% |
| drag-between-lists-autoscroll-1000 | jsHeapDelta | 1302.8 | 1302.9 | +0.0% |
| drag-within-list-1000 | totalBlockingTime | 46.0 | 33.0 | -28.3% |
| drag-within-list-1000 | longTaskCount | 2.0 | 2.0 | +0.0% |
| drag-within-list-1000 | layoutCount | 5.0 | 5.0 | +0.0% |
| drag-within-list-1000 | recalcStyleCount | 46.0 | 46.0 | +0.0% |
| drag-within-list-1000 | avgFrameTime | 17.0 | 16.9 | -0.4% |
| drag-within-list-1000 | maxFrameGap | 31.1 | 29.3 | -5.8% |
| drag-within-list-1000 | droppedFrames | 18.0 | 19.0 | +5.6% |
| drag-within-list-1000 | p99FrameTime | 31.1 | 29.3 | -5.8% |
| drag-within-list-1000 | jsHeapDelta | 425.9 | 422.4 | -0.8% |
| dynamic-height-scroll | totalBlockingTime | 219.0 | 225.0 | +2.7% |
| dynamic-height-scroll | longTaskCount | 3.0 | 3.0 | +0.0% |
| dynamic-height-scroll | layoutCount | 79.0 | 80.0 | +1.3% |
| dynamic-height-scroll | recalcStyleCount | 177.0 | 177.0 | +0.0% |
| dynamic-height-scroll | avgFrameTime | 16.6 | 16.6 | +0.1% |
| dynamic-height-scroll | maxFrameGap | 26.5 | 24.5 | -7.5% |
| dynamic-height-scroll | droppedFrames | 61.0 | 64.0 | +4.9% |
| dynamic-height-scroll | p99FrameTime | 25.5 | 24.2 | -5.1% |
| dynamic-height-scroll | jsHeapDelta | 569.3 | 534.5 | -6.1% |
| scroll-2000-items | totalBlockingTime | 196.0 | 185.0 | -5.6% |
| scroll-2000-items | longTaskCount | 2.0 | 2.0 | +0.0% |
| scroll-2000-items | layoutCount | 58.0 | 59.0 | +1.7% |
| scroll-2000-items | recalcStyleCount | 58.0 | 59.0 | +1.7% |
| scroll-2000-items | avgFrameTime | 16.6 | 16.6 | +0.2% |
| scroll-2000-items | maxFrameGap | 27.0 | 25.3 | -6.3% |
| scroll-2000-items | droppedFrames | 61.0 | 61.0 | +0.0% |
| scroll-2000-items | p99FrameTime | 24.7 | 21.9 | -11.3% |
| scroll-2000-items | jsHeapDelta | 4642.8 | 4779.6 | +2.9% |

No significant regressions detected.
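
To make the "Change" column and the 25% threshold concrete, the following is a rough sketch of how such a comparison could be computed. It is not taken from `compare.ts`. The names `percentChange`, `formatChange`, and `isRegression` are hypothetical, and it assumes every metric is lower-is-better, so only increases above the threshold count as regressions.

```typescript
// Relative change of `current` versus `baseline`, in percent.
// A zero baseline is treated as "no change" when both are zero, else +Infinity.
function percentChange(baseline: number, current: number): number {
  if (baseline === 0) return current === 0 ? 0 : Infinity;
  return ((current - baseline) / baseline) * 100;
}

// Format with an explicit sign and one decimal, e.g. "+7.6%" or "-19.7%".
function formatChange(pct: number): string {
  const sign = pct >= 0 ? '+' : '';
  return `${sign}${pct.toFixed(1)}%`;
}

// Assumption: all metrics are lower-is-better, so only an increase
// larger than the threshold is flagged.
function isRegression(baseline: number, current: number, thresholdPct = 25): boolean {
  return percentChange(baseline, current) > thresholdPct;
}

// Values from the totalBlockingTime row of drag-between-lists-autoscroll-1000.
console.log(formatChange(percentChange(61.0, 49.0))); // "-19.7%"
console.log(isRegression(61.0, 49.0)); // false
```

Under this assumption, the -28.3% change for drag-within-list-1000 totalBlockingTime exceeds the threshold in magnitude but is an improvement, which is consistent with the report flagging no regressions.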

- Use Bessel's correction (n-1) for sample stddev in statistics
- Remove pre-rounding before aggregation in all scenario files
- Fix test name to match actual drag target (item 7, not item 10)
- Extract shared extractScenarios into reusable module
- Add forced GC via CDP before each measurement iteration
- Add droppedFrames (>16.7ms) and p99FrameTime metrics
- Rename report column from p95 to Max (honest with 5 samples)
- Fix misleading warmup wording in report header
- Add waitForLoadState('networkidle') after page navigation
- Add artifact retention-days and TODO for perf:compare in CI
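
As an illustration of the statistics changes in the list above, here is a minimal sketch of sample aggregation using Bessel's correction (dividing by n-1 rather than n). The function names `mean`, `median`, `sampleStddev`, and `summarize` are hypothetical and may not match the PR's statistics module. Reporting `max` instead of a percentile reflects the point that p95 carries little meaning with 5 samples.

```typescript
function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function median(xs: number[]): number {
  const s = xs.slice().sort((a, b) => a - b); // copy so the input is not mutated
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Sample standard deviation with Bessel's correction (n - 1 in the denominator).
// Returns 0 for fewer than two samples, where the estimate is undefined.
function sampleStddev(xs: number[]): number {
  if (xs.length < 2) return 0;
  const m = mean(xs);
  const sumSq = xs.reduce((acc, x) => acc + (x - m) * (x - m), 0);
  return Math.sqrt(sumSq / (xs.length - 1));
}

// Aggregate raw, unrounded samples; rounding is left to the report formatter.
function summarize(samples: number[]) {
  if (samples.length === 0) throw new Error('no samples');
  return {
    max: Math.max(...samples),
    mean: mean(samples),
    median: median(samples),
    stddev: sampleStddev(samples),
  };
}
```

With only 5 samples the correction is noticeable: dividing by 4 instead of 5 raises the reported standard deviation by roughly 12%.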
Add baseline comparison step to CI workflow. The PR comment now
includes both raw benchmark results and the diff table against the
committed baseline. Steps use `if: always()` so the comment posts
even when a regression is detected.
@gultyayev gultyayev merged commit d773ae3 into master Mar 6, 2026
3 checks passed
@gultyayev gultyayev deleted the perf-test branch March 6, 2026 21:35