Skip to content

ai-summary: don't capture adjacent test on bare FAILED line#132

Merged
ppetrovicTT merged 1 commit into
mainfrom
ppetrovic/ai-summary-failed-test-newline
Jun 16, 2026
Merged

ai-summary: don't capture adjacent test on bare FAILED line#132
ppetrovicTT merged 1 commit into
mainfrom
ppetrovic/ai-summary-failed-test-newline

Conversation

@ppetrovicTT

Copy link
Copy Markdown
Contributor

The failed_test_patterns rule FAILED\s+(\S+::\S+) used \s, which matches newlines. pytest -vvv prints a bare FAILED status word on its own line; the regex then bound it to the next line's node id. When that neighbour was another test's result (e.g. an adjacent SKIPPED test), it got reported and counted as a failure.

Constrain to same-line whitespace ([ \t]+) so the node id must follow FAILED on the same line. The real short test summary info line (FAILED path::test - reason) still matches.

Repro from run 26948207149 (bh_lb_DeepSeek_PREFILL_PERF): summary said "2 failed" listing test_deepseek_v3_mla_perf_galaxy (SKIPPED) alongside the real test_deepseek_v3_mla_perf_loudbox failure.

The failed_test_patterns rule `FAILED\s+(\S+::\S+)` used `\s`, which
matches newlines. pytest -vvv prints a bare `FAILED` status word on its
own line; the regex then bound it to the *next* line's node id. When that
neighbour was another test's result (e.g. an adjacent SKIPPED test), it
got reported and counted as a failure.

Constrain to same-line whitespace (`[ \t]+`) so the node id must follow
`FAILED` on the same line. The real `short test summary info` line
(`FAILED path::test - reason`) still matches.

Repro from run 26948207149 (bh_lb_DeepSeek_PREFILL_PERF): summary said
"2 failed" listing test_deepseek_v3_mla_perf_galaxy (SKIPPED) alongside
the real test_deepseek_v3_mla_perf_loudbox failure.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 10, 2026 10:56
@ppetrovicTT ppetrovicTT requested a review from a team as a code owner June 10, 2026 10:56

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a false-positive failure extraction bug in the AI job summary tool where a regex could match across newlines and incorrectly bind a bare FAILED status line to the next line’s test node id (e.g., an adjacent SKIPPED test), inflating failure counts.

Changes:

  • Update failed_test_patterns to restrict whitespace after FAILED to same-line spaces/tabs ([ \t]+) instead of \s+ (which can match newlines).
  • Add a regression fixture log demonstrating the FAILED-then-adjacent-SKIPPED scenario.
  • Add a unit test that loads the real bundled config and asserts only the true failed test is captured.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
.github/actions/ai_summary/tool/ai_job_summary/tests/test_extraction.py Adds regression coverage to ensure failed-test extraction doesn’t jump to adjacent tests across newlines.
.github/actions/ai_summary/tool/ai_job_summary/tests/fixtures/log_samples/pytest_failed_adjacent_skip.txt New pytest output fixture reproducing the bare FAILED line followed by a SKIPPED test line.
.github/actions/ai_summary/tool/ai_job_summary/config/analysis.yaml Tightens the FAILED ... regex to prevent newline-crossing matches while still matching pytest short summary lines.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@ppetrovicTT ppetrovicTT merged commit 97811af into main Jun 16, 2026
5 checks passed
@ppetrovicTT ppetrovicTT deleted the ppetrovic/ai-summary-failed-test-newline branch June 16, 2026 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants