Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 48 additions & 0 deletions .github/actions/run-with-log/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
# `run-with-log`

Run a bash script like a normal `run:` step, but capture its combined output to
a `.log` file and append a completion marker as the final line:

```
[==tt-log-finish-line==] exit_code=<N>
```

The marker is a generic end-of-run sentinel. Its **absence** means the shell was
killed before finishing — most usefully a GitHub `timeout-minutes` kill, which
leaves no trace in the log otherwise. Tooling that reads the logs (e.g. the
`ai_summary` parser) uses this to tell a timeout apart from a crash.

Replaces hand-rolled `tee` / `PIPESTATUS` blocks copied across workflows.

## Usage

```yaml
- name: ${{ matrix.test-group.name }}
timeout-minutes: ${{ matrix.test-group.timeout }}
uses: tenstorrent/tt-github-actions/.github/actions/run-with-log@main
with:
log-file: generated/test_logs/${{ matrix.test-group.name }}.log
run: |
pytest models/demos/... -xv
./some_other_command
```

## Inputs

| Name | Required | Default | Description |
|------|----------|---------|-------------|
| `run` | yes | — | Bash script, exactly as a `run:` step body. |
| `log-file` | yes | — | Path for the `.log`, relative to `working-directory`. Parent dirs are created. |
| `working-directory` | no | `/work` | Defaults to the Tenstorrent container workdir; override per call as on a `run:` step. Needed because composite steps don't inherit the job's `defaults.run.working-directory`. |

## What propagates

- **`if:`, `env:`, `timeout-minutes`** on the calling step — native; `timeout-minutes`
is what makes the marker meaningful (the kill skips it).
- **Exit code / pass-fail** — the script's real exit code is forwarded, so the
step fails when the script fails and `failure()` / `if:` downstream work.
- **The script itself** runs under `bash -eo pipefail` (same as a `run:` step).

**Does not propagate:** named `$GITHUB_OUTPUT` written inside the script — a
composite action can only expose declared outputs, not arbitrary ones. Steps
that must set outputs should stay plain `run:` steps.
47 changes: 47 additions & 0 deletions .github/actions/run-with-log/action.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
name: Run with log
description: >-
Run a bash script like a normal `run:` step, tee its combined output to a
.log file, and append a completion marker as the final line. The marker is a
generic end-of-run sentinel: its absence means the shell was killed before
finishing (e.g. GitHub timeout-minutes), which downstream tooling can detect.

inputs:
run:
description: "Bash script to run, exactly as a `run:` step body (multi-line)."
required: true
log-file:
description: "Path for the .log, relative to working-directory. Parent dirs are created."
required: true
working-directory:
description: >-
Directory to run in. Defaults to /work (Tenstorrent container jobs).
Override per call as you would on a run: step. Required because composite
steps do not inherit the job's defaults.run.working-directory.
required: false
default: "/work"

runs:
using: composite
steps:
- shell: bash
working-directory: ${{ inputs.working-directory }}
env:
# Via env so the script's quotes/specials reach the file verbatim.
RUN_WITH_LOG_SCRIPT: ${{ inputs.run }}
RUN_WITH_LOG_FILE: ${{ inputs.log-file }}
run: |
# errexit off for the wrapper so the marker is written even when the
# script fails — a failed run must stay a failure, not look like a kill.
set +e
mkdir -p "$(dirname "$RUN_WITH_LOG_FILE")"

# Run as its own bash file, matching a normal run step (errexit +
# pipefail), so the script keeps full fidelity and its own exit code.
script=$(mktemp)
printf '%s\n' "$RUN_WITH_LOG_SCRIPT" > "$script"
bash --noprofile --norc -eo pipefail "$script" 2>&1 | tee "$RUN_WITH_LOG_FILE"
rc=${PIPESTATUS[0]}
rm -f "$script"

printf '[==tt-log-finish-line==] exit_code=%s\n' "$rc" >> "$RUN_WITH_LOG_FILE"
exit $rc
86 changes: 86 additions & 0 deletions .github/workflows/test-run-with-log.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
name: "test: run-with-log"

# Smoke test for the run-with-log action: a completed run writes the marker;
# a timeout-minutes kill does not. Dispatch-only (the timeout case waits ~60s).
on:
workflow_dispatch:

jobs:
success-case:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: ./.github/actions/run-with-log
with:
working-directory: ${{ github.workspace }}
log-file: out/ok.log
run: |
echo hello
echo world
- name: Assert marker present, exit 0
run: |
tail -3 out/ok.log
tail -1 out/ok.log | grep -qx '\[==tt-log-finish-line==\] exit_code=0'
- name: Summarize (no LLM — clean log short-circuits)
uses: ./.github/actions/ai_summary/job
with:
config: '{"model":"none","workspace":"${{ github.workspace }}","input_dirs":["out"],"output_dir":"out/summaries"}'
api-key: ""
api-url: ""
job-name: smoke-success
- name: Assert classified SUCCESS
run: |
f=$(ls out/summaries/*.json); cat "${f%.json}.md"
python3 -c "import json; s=json.load(open('$f'))['_job']['status']; assert s=='SUCCESS', s"

failure-case:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: ./.github/actions/run-with-log
id: run
continue-on-error: true
with:
working-directory: ${{ github.workspace }}
log-file: out/fail.log
run: |
echo before
exit 7
- name: Assert failure forwarded, marker carries exit code
run: |
test "${{ steps.run.outcome }}" = failure
tail -1 out/fail.log | grep -qx '\[==tt-log-finish-line==\] exit_code=7'

timeout-case:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: ./.github/actions/run-with-log
id: run
continue-on-error: true
timeout-minutes: 1
with:
working-directory: ${{ github.workspace }}
log-file: out/timeout.log
run: |
echo started
sleep 120
- name: Assert killed, marker absent
run: |
test "${{ steps.run.outcome }}" != success
grep -q '^started$' out/timeout.log
if grep -q tt-log-finish-line out/timeout.log; then
echo "::error::marker present after timeout kill — bug"; exit 1
fi
echo "marker correctly absent (outcome=${{ steps.run.outcome }})"
- name: Summarize (no LLM — marker-absent short-circuits to TIMEOUT)
uses: ./.github/actions/ai_summary/job
with:
config: '{"model":"none","workspace":"${{ github.workspace }}","input_dirs":["out"],"output_dir":"out/summaries"}'
api-key: ""
api-url: ""
job-name: smoke-timeout
- name: Assert classified TIMEOUT
run: |
f=$(ls out/summaries/*.json); cat "${f%.json}.md"
python3 -c "import json; s=json.load(open('$f'))['_job']['status']; assert s=='TIMEOUT', s"
Loading