ai-summary: detect GitHub timeout-minutes kills via log completion marker by ppetrovicTT · Pull Request #134 · tenstorrent/tt-github-actions

ppetrovicTT · 2026-06-11T17:03:10Z

Problem

A step killed by GitHub timeout-minutes leaves no trace in the tee'd log — the ##[error]... has timed out line exists only in the runner log, which the tool never sees. The log just truncates mid-stream: no error pattern, no failed tests, no exit code. Result: the job classifies as a false 🟢 SUCCESS.

Real example: tt-metal run 27315488300 / job 80696095660 — bh_qb_DeepSeek_PREFILL hit the 10-min step timeout; its AI job summary reported SUCCESS.

In-container signal delivery is unreliable (the runner kills the host-side docker exec shim; pytest never receives SIGINT — verified: no KeyboardInterrupt banner in the final 14s of the log), and API reconciliation is race-prone. So the signal is made in-band: absence of a caller-written final log line.

Mechanism

The caller's test wrapper appends a completion marker as the log's final line:

[==log-finish-line==] exit_code=0

Token presence is the contract; the exit_code=N payload is optional (group 1) and is the only exit-code source for tee'd logs, which lack the GitHub Process completed with exit code line. The regex lives in analysis.yaml as log_complete_marker (default on); a caller can override it in the config it already passes to ai_summary/job.

Classification (marker absent ⇒ shell hard-killed):

Situation	Final status
clean log, marker present	as today; exit code from marker
clean log, marker absent, all dirs present	TIMEOUT (no LLM) — the false-green case
log shows crash / tests failed, marker absent	unchanged status; truncation reported as a possibly independent issue (`_job.log_complete: false`, report note, `INCOMPLETE LOG` section in LLM context)
a log dir missing, marker absent	`infra:partial_logs` — missing dir wins over marker absence

⚠️ Rollout note (default on)

With the marker default on, any wired workflow whose wrapper does not yet emit the line will see its clean jobs flip green→TIMEOUT. Adoption is staged in tt-metal (blackhole-e2e / t3000-e2e / galaxy-deepseek-prefill / demo-sp-release); other callers must pass "log_complete_marker": null until their wrappers emit it.

Testing

New test_log_complete.py (extraction, anchoring, exit-code parsing, status table, report notes, config) + TestTimeoutMarkerAbsent in test_cli.py (no-LLM TIMEOUT short-circuit, missing-dir precedence, marker-disabled).
Full suite: 234 passed.

Follow-ups: patterns→analysis.yaml refactor (next PR), tt-metal wrapper adoption.

🤖 Generated with Claude Code

Copilot

Pull request overview

This PR adds an optional “log completion marker” mechanism to detect cases where a step is hard-killed by GitHub timeout-minutes and the tee’d log truncates without any explicit timeout/error line, preventing false “SUCCESS” classifications.

Changes:

Introduces log_complete_marker config (default null/disabled) and propagates it through config loading and CLI into log extraction.
Extends extraction/status logic to treat a clean log missing the completion marker as TIMEOUT, and surfaces “INCOMPLETE LOG” / summary warnings when the marker is absent.
Adds a dedicated test suite covering marker extraction, status outcomes, report notes, and config default/overlay behavior.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
.github/actions/ai_summary/tool/ai_job_summary/extract.py	Adds `log_complete` field, parses optional completion marker, and uses marker absence to classify clean logs as TIMEOUT; adds “INCOMPLETE LOG” context section.
.github/actions/ai_summary/tool/ai_job_summary/summarize.py	Adds a warning line to the markdown summary when logs are incomplete (marker missing).
.github/actions/ai_summary/tool/ai_job_summary/config/analysis.yaml	Documents and defaults `log_complete_marker: null` in bundled config.
.github/actions/ai_summary/tool/ai_job_summary/config.py	Ensures `log_complete_marker` defaults to `None` when loading config.
.github/actions/ai_summary/tool/ai_job_summary/cli.py	Includes `log_complete_marker` in `test_patterns` passed to extraction and emits `log_complete` in JSON output.
.github/actions/ai_summary/job/README.md	Documents the new optional `log_complete_marker` input and its behavioral implications.
.github/actions/ai_summary/tool/ai_job_summary/tests/test_log_complete.py	Adds tests for marker extraction, status classification, report notes, and config behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    # Completion marker: the caller's wrapper appends it as the log's final
+    # line; absence means the shell was hard-killed (GitHub timeout-minutes),
+    # which leaves no trace in the tee'd log itself.
+    marker_pattern = (test_patterns or {}).get("log_complete_marker")
+    if marker_pattern:
+        if match := re.search(marker_pattern, full_text, re.MULTILINE):
+            result.log_complete = True
+            # group 1 (optional) = wrapped command's exit code
+            if match.groups() and match.group(1) is not None and match.group(1).isdigit():
+                result.exit_code = int(match.group(1))
+        else:
+            result.log_complete = False


+    def test_marker_must_anchor_line_start(self, tmp_path):
+        # embedded in other output (e.g. echoed cmd) must not count
+        lines = CLEAN_LINES + ["echo [==log-finish-line==] exit_code=0"]
+        e = _extract(tmp_path, lines)
+        assert e.log_complete is False


vmilosevic

@ppetrovicTT you can adjust codeowners for ai_summery path so you can merge faster

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

+    marker_pattern = (test_patterns or {}).get("log_complete_marker")
+    if marker_pattern:
+        if match := re.search(marker_pattern, full_text, re.MULTILINE):
+            result.log_complete = True
+            # group 1 (optional) = wrapped command's exit code
+            if match.groups() and match.group(1) is not None and match.group(1).isdigit():
+                result.exit_code = int(match.group(1))
+        else:
+            result.log_complete = False


+    # Marker absent on an otherwise clean log = GitHub timeout-minutes kill.
+    # Crash/failure branches above win: truncation is then reported as a
+    # possibly separate issue, not as the status.
+    if extracted_log.log_complete is False:
+        return JobStatus(False, "RED", "TIMEOUT")


+    if extracted_log and extracted_log.log_complete is False:
+        if job_status.status_text == "TIMEOUT":
+            md += "⚠️ Log ended without its completion marker — step killed by GitHub `timeout-minutes` (test ran too long or hung).\n"
+        else:
+            md += "⚠️ Log is incomplete — step killed by GitHub `timeout-minutes` after the failure below; possibly an independent hang in a later test.\n"


…rker A step killed by GitHub timeout-minutes leaves no trace in the tee'd log (the ##[error] line lives only in the runner log), so a timed-out job classified as a false SUCCESS — e.g. tt-metal run 27315488300/job/80696095660. The caller's test wrapper appends a completion marker as the final log line, '[==log-finish-line==] exit_code=N'. Token presence is the contract; the exit_code payload is optional (and the only exit-code source for tee'd logs, which lack the GitHub "Process completed" line). Marker absent => the shell was hard-killed: - clean log, all dirs present => TIMEOUT (no LLM) - log already shows crash/failure => that status wins, truncation surfaced as a possibly independent issue (report note + _job.log_complete=false + INCOMPLETE LOG section for the LLM) - a missing log dir still wins over marker absence (infra:partial_logs) The marker regex lives in analysis.yaml (log_complete_marker), default on; callers whose wrapper does not write it must override with null. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings June 11, 2026 17:03

ppetrovicTT requested a review from a team as a code owner June 11, 2026 17:03

Copilot started reviewing on behalf of ppetrovicTT June 11, 2026 17:03 View session

Copilot AI reviewed Jun 11, 2026

View reviewed changes

vmilosevic previously approved these changes Jun 15, 2026

View reviewed changes

ppetrovicTT dismissed vmilosevic’s stale review via 22213ed June 15, 2026 15:07

ppetrovicTT force-pushed the ppetrovic/ai-summary-log-finish-line branch from 2f64ba2 to 22213ed Compare June 15, 2026 15:07

This was referenced Jun 15, 2026

ai-summary: move crash/timeout detection regexes to analysis.yaml #135

Open

actions: add run-with-log composite action #136

Open

Copilot AI review requested due to automatic review settings June 17, 2026 08:36

ppetrovicTT force-pushed the ppetrovic/ai-summary-log-finish-line branch from 22213ed to 88d708e Compare June 17, 2026 08:36

ppetrovicTT changed the base branch from main to ppetrovic/ai-summary-detection-patterns-yaml June 17, 2026 08:36

Copilot AI reviewed Jun 17, 2026

View reviewed changes

ppetrovicTT force-pushed the ppetrovic/ai-summary-detection-patterns-yaml branch from b881057 to 2cd0ec4 Compare June 17, 2026 08:42

ppetrovicTT force-pushed the ppetrovic/ai-summary-log-finish-line branch from 88d708e to d072889 Compare June 17, 2026 08:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ai-summary: detect GitHub timeout-minutes kills via log completion marker#134

ai-summary: detect GitHub timeout-minutes kills via log completion marker#134
ppetrovicTT wants to merge 1 commit into
ppetrovic/ai-summary-detection-patterns-yamlfrom
ppetrovic/ai-summary-log-finish-line

ppetrovicTT commented Jun 11, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

vmilosevic left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

ppetrovicTT commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Mechanism

⚠️ Rollout note (default on)

Testing

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

vmilosevic left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ppetrovicTT commented Jun 11, 2026 •

edited

Loading