
ai-summary: ignore errors a test declared expected #133

Open
ppetrovicTT wants to merge 1 commit into main from ppetrovic/ai-summary-negative-test


@ppetrovicTT

Contributor

Lines inside an [EXPECTED_ERROR BEGIN message="..."]...[END] bracket (emitted by the expect_error pytest fixture) that match the declared message are masked before crash detection AND section extraction. A deliberately provoked error (e.g. a negative test's TT_FATAL) no longer triggers a false CRASH, while a different error in the same block — or anything outside a bracket — still does.

Lines inside an [EXPECTED_ERROR BEGIN <Type> message="..."]...[END] bracket
(emitted by the expect_error pytest fixture) that match the declared message are
masked before crash detection AND section extraction. A deliberately provoked
error (e.g. a negative test's TT_FATAL) no longer triggers a false CRASH, while
a different error in the same block — or anything outside a bracket — still does.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 10, 2026 10:57
@ppetrovicTT ppetrovicTT requested a review from a team as a code owner June 10, 2026 10:57

Copilot AI left a comment

Contributor


Pull request overview

This PR updates the AI log extraction pipeline to suppress developer-declared expected error lines (via EXPECTED_ERROR log brackets) before downstream crash detection and error section extraction, reducing false “CRASHED” classifications for negative tests.

Changes:

  • Added parsing + masking for [EXPECTED_ERROR BEGIN ... message=...] ... [EXPECTED_ERROR END] blocks so declared expected error lines don’t influence crash detection or extracted error sections.
  • Refactored crash detection into clearer buckets (hard crash / collection crash / killed / TT_FATAL+TT_THROW / selected Python exceptions).
  • Applied expected-error masking early in extract_log() so it affects all later processing.
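The bucketed crash detection summarized above can be pictured with a short sketch. This is only an illustration under stated assumptions, not the PR's code: `classify_crash` and `CRASH_BUCKETS` are hypothetical names, every regex below is a guessed stand-in for patterns the PR does not show here, and checking buckets in list order is likewise an assumption. The input is expected to be the log lines after expected-error masking, so blanked lines cannot match any bucket.

```python
import re

# Hypothetical bucket patterns, checked in order.
# The patterns actually used by the PR are not visible in this conversation.
CRASH_BUCKETS = [
    ("hard_crash", re.compile(r"Segmentation fault|core dumped")),
    ("collection_crash", re.compile(r"errors? during collection")),
    ("killed", re.compile(r"\bKilled\b|SIGKILL")),
    ("tt_fatal_or_throw", re.compile(r"\bTT_(FATAL|THROW)\b")),
    ("python_exception", re.compile(r"\b(MemoryError|RecursionError)\b")),
]


def classify_crash(masked_lines):
    """Return the name of the first bucket matching the log, or None."""
    text = "\n".join(masked_lines)
    for name, pattern in CRASH_BUCKETS:
        if pattern.search(text):
            return name
    return None
```

With this shape, a log whose only TT_FATAL line was masked to an empty string yields `None`, while an undeclared TT_FATAL left in place still lands in the `tt_fatal_or_throw` bucket.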


Comment on lines +283 to +285
"""Line ranges of properly closed [EXPECTED_TT_ERROR BEGIN match=...]...[END]
blocks, paired with the declared match pattern. An unclosed block is ignored,
so its contents are never masked."""
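For readers following the thread, the pairing rule in this docstring could be sketched roughly as follows. This is a minimal sketch, not the implementation under review: `find_expected_blocks`, `BEGIN_RE`, and `END_MARKER` are hypothetical names, and the marker syntax (a quoted `match=` value, a literal END marker) is assumed from the docstring text. Treating a second BEGIN as discarding the earlier open block is also an assumption.

```python
import re

# Assumed marker syntax, modelled on the docstring; the real format may differ.
BEGIN_RE = re.compile(r"\[EXPECTED_TT_ERROR BEGIN match=(['\"])(?P<pat>.*?)\1\]")
END_MARKER = "[EXPECTED_TT_ERROR END]"


def find_expected_blocks(lines):
    """Return (begin_idx, end_idx, pattern) for every properly closed block."""
    blocks = []
    open_block = None  # (begin_idx, pattern) of the block currently open, if any
    for idx, line in enumerate(lines):
        begin = BEGIN_RE.search(line)
        if begin:
            # A new BEGIN while one is open discards the earlier, unclosed block.
            open_block = (idx, begin.group("pat"))
        elif END_MARKER in line and open_block is not None:
            blocks.append((open_block[0], idx, open_block[1]))
            open_block = None
    # A block still open at the end of the log was never closed, so it is dropped.
    return blocks
```

Because only closed blocks are returned, a test that dies before reaching its END marker leaves its lines visible to crash detection.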
Comment on lines +300 to +304
"""Return a copy of `lines` with developer-declared expected errors blanked out.
For each closed [EXPECTED_TT_ERROR BEGIN match='<pat>']...[END] bracket, the two
marker lines and any in-between line matching <pat> are replaced with "" — so they
are seen by neither crash detection nor section extraction. Everything else — a
different line in the block, or anything outside a bracket — is untouched."""
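The masking behaviour this docstring describes might look like the sketch below. It is an assumption-laden illustration rather than the PR's `_mask_expected_errors`: the function name `mask_expected_errors` and the `blocks` argument (a list of `(begin_idx, end_idx, pattern)` tuples for closed brackets) are hypothetical, and the declared pattern is assumed to be a regular expression searched within each line.

```python
import re


def mask_expected_errors(lines, blocks):
    """Return a copy of `lines` with declared expected errors replaced by "".

    `blocks` holds (begin_idx, end_idx, pattern) tuples for closed brackets.
    """
    masked = list(lines)  # copy, so the caller's list is left untouched
    for begin_idx, end_idx, pattern in blocks:
        masked[begin_idx] = ""  # BEGIN marker
        masked[end_idx] = ""    # END marker
        for idx in range(begin_idx + 1, end_idx):
            # Assumption: the pattern is a regex searched within each line.
            if re.search(pattern, masked[idx]):
                masked[idx] = ""
    return masked
```

One side effect of blanking rather than deleting is that the line count is unchanged, so any indices computed on the original log still point at the same positions.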
Comment on lines +310 to +311
masked[start] = "" # [EXPECTED_TT_ERROR BEGIN ...] marker
masked[end] = "" # [EXPECTED_TT_ERROR END] marker
Comment on lines +432 to +434
# Strip developer-declared expected errors (expect_tt_error brackets) up front, so
# they're excluded from EVERYTHING downstream — crash detection and the error
# sections handed to the LLM alike.
Comment on lines +450 to +452
# TT_THROW/TT_FATAL means tt-metal crashed, which cascades to vLLM timeout and test
# failures. (full_text already has expect_tt_error-declared errors masked out above.)
#
Comment on lines +432 to +436
# Strip developer-declared expected errors (expect_tt_error brackets) up front, so
# they're excluded from EVERYTHING downstream — crash detection and the error
# sections handed to the LLM alike.
lines = _mask_expected_errors(lines)


3 participants