ai-summary: detect GitHub timeout-minutes kills via log completion marker#134
Open
ppetrovicTT wants to merge 1 commit into
Open
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR adds an optional “log completion marker” mechanism to detect cases where a step is hard-killed by GitHub timeout-minutes and the tee’d log truncates without any explicit timeout/error line, preventing false “SUCCESS” classifications.
Changes:
- Introduces
log_complete_markerconfig (defaultnull/disabled) and propagates it through config loading and CLI into log extraction. - Extends extraction/status logic to treat a clean log missing the completion marker as
TIMEOUT, and surfaces “INCOMPLETE LOG” / summary warnings when the marker is absent. - Adds a dedicated test suite covering marker extraction, status outcomes, report notes, and config default/overlay behavior.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| .github/actions/ai_summary/tool/ai_job_summary/extract.py | Adds log_complete field, parses optional completion marker, and uses marker absence to classify clean logs as TIMEOUT; adds “INCOMPLETE LOG” context section. |
| .github/actions/ai_summary/tool/ai_job_summary/summarize.py | Adds a warning line to the markdown summary when logs are incomplete (marker missing). |
| .github/actions/ai_summary/tool/ai_job_summary/config/analysis.yaml | Documents and defaults log_complete_marker: null in bundled config. |
| .github/actions/ai_summary/tool/ai_job_summary/config.py | Ensures log_complete_marker defaults to None when loading config. |
| .github/actions/ai_summary/tool/ai_job_summary/cli.py | Includes log_complete_marker in test_patterns passed to extraction and emits log_complete in JSON output. |
| .github/actions/ai_summary/job/README.md | Documents the new optional log_complete_marker input and its behavioral implications. |
| .github/actions/ai_summary/tool/ai_job_summary/tests/test_log_complete.py | Adds tests for marker extraction, status classification, report notes, and config behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+441
to
+452
| # Completion marker: the caller's wrapper appends it as the log's final | ||
| # line; absence means the shell was hard-killed (GitHub timeout-minutes), | ||
| # which leaves no trace in the tee'd log itself. | ||
| marker_pattern = (test_patterns or {}).get("log_complete_marker") | ||
| if marker_pattern: | ||
| if match := re.search(marker_pattern, full_text, re.MULTILINE): | ||
| result.log_complete = True | ||
| # group 1 (optional) = wrapped command's exit code | ||
| if match.groups() and match.group(1) is not None and match.group(1).isdigit(): | ||
| result.exit_code = int(match.group(1)) | ||
| else: | ||
| result.log_complete = False |
Comment on lines
+62
to
+66
| def test_marker_must_anchor_line_start(self, tmp_path): | ||
| # embedded in other output (e.g. echoed cmd) must not count | ||
| lines = CLEAN_LINES + ["echo [==log-finish-line==] exit_code=0"] | ||
| e = _extract(tmp_path, lines) | ||
| assert e.log_complete is False |
vmilosevic
previously approved these changes
Jun 15, 2026
vmilosevic
left a comment
Contributor
There was a problem hiding this comment.
@ppetrovicTT you can adjust codeowners for ai_summery path so you can merge faster
2f64ba2 to
22213ed
Compare
This was referenced Jun 15, 2026
22213ed to
88d708e
Compare
Comment on lines
+427
to
+435
| marker_pattern = (test_patterns or {}).get("log_complete_marker") | ||
| if marker_pattern: | ||
| if match := re.search(marker_pattern, full_text, re.MULTILINE): | ||
| result.log_complete = True | ||
| # group 1 (optional) = wrapped command's exit code | ||
| if match.groups() and match.group(1) is not None and match.group(1).isdigit(): | ||
| result.exit_code = int(match.group(1)) | ||
| else: | ||
| result.log_complete = False |
Comment on lines
+188
to
+192
| # Marker absent on an otherwise clean log = GitHub timeout-minutes kill. | ||
| # Crash/failure branches above win: truncation is then reported as a | ||
| # possibly separate issue, not as the status. | ||
| if extracted_log.log_complete is False: | ||
| return JobStatus(False, "RED", "TIMEOUT") |
Comment on lines
+389
to
+393
| if extracted_log and extracted_log.log_complete is False: | ||
| if job_status.status_text == "TIMEOUT": | ||
| md += "⚠️ Log ended without its completion marker — step killed by GitHub `timeout-minutes` (test ran too long or hung).\n" | ||
| else: | ||
| md += "⚠️ Log is incomplete — step killed by GitHub `timeout-minutes` after the failure below; possibly an independent hang in a later test.\n" |
b881057 to
2cd0ec4
Compare
…rker
A step killed by GitHub timeout-minutes leaves no trace in the tee'd log
(the ##[error] line lives only in the runner log), so a timed-out job
classified as a false SUCCESS — e.g. tt-metal run 27315488300/job/80696095660.
The caller's test wrapper appends a completion marker as the final log line,
'[==log-finish-line==] exit_code=N'. Token presence is the contract; the
exit_code payload is optional (and the only exit-code source for tee'd logs,
which lack the GitHub "Process completed" line). Marker absent => the shell
was hard-killed:
- clean log, all dirs present => TIMEOUT (no LLM)
- log already shows crash/failure => that status wins, truncation surfaced
as a possibly independent issue (report note + _job.log_complete=false +
INCOMPLETE LOG section for the LLM)
- a missing log dir still wins over marker absence (infra:partial_logs)
The marker regex lives in analysis.yaml (log_complete_marker), default on;
callers whose wrapper does not write it must override with null.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
88d708e to
d072889
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
A step killed by GitHub
timeout-minutesleaves no trace in the tee'd log — the##[error]... has timed outline exists only in the runner log, which the tool never sees. The log just truncates mid-stream: no error pattern, no failed tests, no exit code. Result: the job classifies as a false 🟢 SUCCESS.Real example: tt-metal run 27315488300 / job 80696095660 —
bh_qb_DeepSeek_PREFILLhit the 10-min step timeout; its AI job summary reported SUCCESS.In-container signal delivery is unreliable (the runner kills the host-side
docker execshim; pytest never receives SIGINT — verified: no KeyboardInterrupt banner in the final 14s of the log), and API reconciliation is race-prone. So the signal is made in-band: absence of a caller-written final log line.Mechanism
The caller's test wrapper appends a completion marker as the log's final line:
Token presence is the contract; the
exit_code=Npayload is optional (group 1) and is the only exit-code source for tee'd logs, which lack the GitHubProcess completed with exit codeline. The regex lives inanalysis.yamlaslog_complete_marker(default on); a caller can override it in the config it already passes toai_summary/job.Classification (marker absent ⇒ shell hard-killed):
_job.log_complete: false, report note,INCOMPLETE LOGsection in LLM context)infra:partial_logs— missing dir wins over marker absenceWith the marker default on, any wired workflow whose wrapper does not yet emit the line will see its clean jobs flip green→TIMEOUT. Adoption is staged in tt-metal (blackhole-e2e / t3000-e2e / galaxy-deepseek-prefill / demo-sp-release); other callers must pass
"log_complete_marker": nulluntil their wrappers emit it.Testing
test_log_complete.py(extraction, anchoring, exit-code parsing, status table, report notes, config) +TestTimeoutMarkerAbsentintest_cli.py(no-LLM TIMEOUT short-circuit, missing-dir precedence, marker-disabled).Follow-ups: patterns→
analysis.yamlrefactor (next PR), tt-metal wrapper adoption.🤖 Generated with Claude Code