ci: two-tier test lanes — hermetic always-on for forks; live agent-smoke label-gated on self-hosted (INV-75, closes #238)#256

Open
kane-coding-agent[bot] wants to merge 4 commits into main from feat/ci-two-tier-lanes

Conversation

@kane-coding-agent kane-coding-agent Bot (Contributor) commented Jun 16, 2026

Summary

Split CI into two explicit tiers (issue #238, INV-75):

  • Tier 1, hermetic (hermetic-unit, hermetic-shellcheck): runs on every PR and push on ubuntu-latest with zero credentials. It covers unit tests, adapter conformance (INV-74), and the stub-mode smoke/metrics/error-envelope self-tests, plus ShellCheck and a new actionlint workflow lint. A fork PR or external contributor gets a fully green, meaningful CI run without any agent-CLI auth. These are the merge-required checks.
  • Tier 2, live (live-smoke): the live agent-smoke matrix (real CLIs) from #222 ("Add agent-smoke E2E: three-state smoke_agent lib + PR-gating matrix harness") on the self-hosted runner, gated by the maintainer-applied run-live-smoke label (pull_request labeled) OR a push to main. Advisory (non-required): a quota-walled CLI yields UNAVAILABLE without failing the job, per #222's rc contract. SMOKE evidence is posted to the job summary.

Security / threat model

  • A fork PR with no label NEVER schedules live-smoke — the if: has no unconditional branch, so untrusted PR code can't self-trigger the self-hosted tier. Applying the run-live-smoke label is the authorization act (label application requires write access → maintainer-only).
  • Uses plain pull_request (with the labeled type), never pull_request_target (which would run untrusted head code with the base repo's token/secrets — the classic foot-gun). The threat-model rationale is documented inline in the on: header.
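  • runs-on: ${{ vars.RUNNER_LABEL && fromJSON(vars.RUNNER_LABEL) || 'self-hosted' }}. This is the operator's lazy-ternary convention for the self-hosted pool: the JSON-encoded RUNNER_LABEL repo variable when it is set, otherwise the literal self-hosted label.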
  • No untrusted GitHub-context interpolation in any run: block.
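
The last bullet is the usual guard against script injection: untrusted context reaches the shell only as an environment variable, expanded inside double quotes, so it is treated as data. A minimal sketch of the difference follows. `PR_TITLE` and both function names are hypothetical stand-ins for illustration, not names taken from ci.yml.

```shell
#!/usr/bin/env bash
# Hypothetical untrusted value, e.g. a PR title chosen by a fork author.
PR_TITLE='x"; echo INJECTED; echo "'

# Safe: the value is expanded from the environment inside double quotes,
# so the shell never parses it as code.
print_title_safe() {
  printf 'title: %s\n' "$PR_TITLE"
}

# Unsafe (for contrast only): splicing the text into the script source,
# which is roughly what inlining a ${{ }} expression into run: amounts to.
print_title_unsafe() {
  eval "printf 'title: %s\n' \"$PR_TITLE\""
}

print_title_safe
```

In the safe variant the payload comes out as literal text; in the unsafe variant the embedded `echo INJECTED` runs as a command of its own.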

Branch-protection note (operator action after merge)

Mark the two hermetic-* jobs (Hermetic / Unit + conformance, Hermetic / ShellCheck + workflow lint) as required status checks. Leave live-smoke NOT required — it is advisory + gated on hardware/credentials only maintainers have, so requiring it would block every fork PR.

Changes

  • .github/workflows/ci.yml — tier split + the label-gated self-hosted live-smoke job + actionlint step.
  • skills/autonomous-dispatcher/scripts/setup-labels.sh — adds the run-live-smoke gate label (day-one bootstrap).
  • tests/unit/test-ci-two-tier-lanes.sh — pyyaml structural truth-table test (TC-CI-TIERS-010..051).
  • docs/pipeline/invariants.md: INV-75 (two-tier CI contract). Satisfies the pipeline-docs-gate (setup-labels.sh is a watched script).
  • CONTRIBUTING.md "What CI runs on your PR"; tests/conformance/README.md cross-link.
  • docs/designs/ci-two-tier-lanes.md, docs/test-cases/ci-two-tier-lanes.md.

Gate truth table

| Trigger | hermetic | live-smoke |
| --- | --- | --- |
| Fork PR, no label | ✅ | skipped |
| Fork PR, maintainer applies run-live-smoke | ✅ | ✅ (self-hosted) |
| Same-repo PR, no label | ✅ | skipped |
| Push to main | ✅ | ✅ (self-hosted) |
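
The live-smoke column reduces to two conditions joined by OR, with no fallback branch. The sketch below models that decision in shell so the rows can be checked by hand; the real gate is an `if:` expression in ci.yml. The function names and the `refs/pull/1/merge` ref are hypothetical, and the three arguments stand in for `github.event_name`, `github.event.label.name` and `github.ref`.

```shell
#!/usr/bin/env bash
# Model of the live-smoke gate. Arguments are stand-ins for
# github.event_name, github.event.label.name and github.ref.
live_smoke_scheduled() {
  local event="$1" label="$2" ref="$3"
  if [ "$event" = "pull_request" ] && [ "$label" = "run-live-smoke" ]; then
    return 0   # a maintainer applied the gate label
  fi
  if [ "$event" = "push" ] && [ "$ref" = "refs/heads/main" ]; then
    return 0   # push to main
  fi
  return 1     # no unconditional branch: everything else is skipped
}

describe() {
  if live_smoke_scheduled "$@"; then
    echo "live-smoke: runs"
  else
    echo "live-smoke: skipped"
  fi
}

describe pull_request ""               refs/pull/1/merge   # unlabeled PR, fork or same-repo
describe pull_request "run-live-smoke" refs/pull/1/merge   # labeled PR
describe push         ""               refs/heads/main     # push to main
```

Fork and same-repo PRs take the same path through the function, which is why the table needs the label rather than the PR origin to tell them apart.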

Design

  • Design canvas created (docs/designs/ci-two-tier-lanes.md)
  • Design approved (autonomous mode)

Test Plan

  • Test cases documented (docs/test-cases/ci-two-tier-lanes.md)
  • tests/unit/test-ci-two-tier-lanes.sh — 13/13 pass (structural truth-table)
  • actionlint clean on both workflows
  • ShellCheck clean on setup-labels.sh + the new test
  • adapter conformance suite 15/15 + agent-smoke stub self-test produce expected summaries
  • Code simplification review passed
  • PR review agent review passed (no Critical/High/Medium)
  • CI checks pass (hermetic-unit + hermetic-shellcheck + pipeline-docs-gate all green; live-smoke correctly SKIPPED on unlabeled same-repo PR)
  • E2E: one labeled dry run (apply run-live-smoke; live-smoke triggers on self-hosted, SMOKE summary present)

E2E (TC-CI-TIERS-040)

After CI is green, a maintainer applies the run-live-smoke label to this PR to trigger the live tier on the self-hosted runner and confirm the SMOKE summary artifact is present (UNAVAILABLE quota entries do not fail the job). Documented here as the gated dry run.

Closes #238

@kane-review-agent kane-review-agent Bot left a comment

Review reached a blocking FAIL verdict — see the Review findings: comment on issue #238 for the full list of blocking findings and remediation steps. This PR is sent back to development; reviewDecision is set to CHANGES_REQUESTED until the findings are addressed and a new review passes (INV-52).

kane-coding-agent Bot pushed a commit that referenced this pull request Jun 16, 2026
…n -ffdx no longer deletes it ([P1] review, INV-75)

PR #256 review [P1]: the `live-smoke` self-hosted job's `actions/checkout`
defaults to `clean: true` (`git clean -ffdx`), which deletes the gitignored,
machine-local `tests/e2e/e2e.conf` on the persistent self-hosted workspace
before the run step — a labeled live run would die on
`FATAL: matrix not found/readable` instead of proving the matrix.

Fix: resolve the live matrix config OUTSIDE the checkout. The `live-smoke` job
now reads it from the `RUNNER_SMOKE_CONF` repo variable, or a stable per-box
default `$HOME/.config/autonomous-dev-team/e2e.conf` (resolved in shell where
$HOME expands — `vars.*` only expands in the Actions template layer). The
resolved path is exported as `SMOKE_CONF` to `$GITHUB_ENV` (the harness honors
the SMOKE_CONF override), and a preflight step checks readability, emitting a
loud `::error::` + provisioning pointer rather than the opaque harness FATAL when
the operator has not provisioned it.
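
A minimal sketch of the resolution and preflight described above. It assumes `RUNNER_SMOKE_CONF` is wired into the step environment from the repo variable; the function names and the wording of the error message are hypothetical, not copied from ci.yml.

```shell
#!/usr/bin/env bash
# Resolve the matrix path outside the checkout.
resolve_smoke_conf() {
  # $HOME expands here, in the shell, not in the Actions template layer.
  printf '%s\n' "${RUNNER_SMOKE_CONF:-$HOME/.config/autonomous-dev-team/e2e.conf}"
}

preflight_smoke_conf() {
  local conf
  conf="$(resolve_smoke_conf)"
  if [ ! -r "$conf" ]; then
    # Workflow command: rendered as an error annotation on the job.
    echo "::error::live matrix config not readable: $conf (provision it on the runner or set RUNNER_SMOKE_CONF)"
    return 1
  fi
  # Later steps see SMOKE_CONF once the line is appended to the $GITHUB_ENV file.
  echo "SMOKE_CONF=$conf" >> "${GITHUB_ENV:-/dev/null}"
}
```

Because the resolved path lives outside the workspace, the `git clean -ffdx` that `actions/checkout` performs never touches it.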

- tests/unit/test-ci-two-tier-lanes.sh: TC-CI-TIERS-021/022/023 pin the
  out-of-checkout resolution, the $GITHUB_ENV export, and the readability
  preflight (023 scoped to the live-smoke job region).
- docs/pipeline/invariants.md: INV-75 sub-point 4 (config-outside-checkout
  contract) + Tested-by update.
- CONTRIBUTING.md: maintainer one-time-setup note; tests/e2e/e2e.conf.example:
  CI provisioning note.

No behavior change to the hermetic tier or the gate logic.

@kane-review-agent kane-review-agent Bot left a comment

Review reached a blocking FAIL verdict — see the Review findings: comment on issue #238 for the full list of blocking findings and remediation steps. This PR is sent back to development; reviewDecision is set to CHANGES_REQUESTED until the findings are addressed and a new review passes (INV-52).

kane-coding-agent Bot pushed a commit that referenced this pull request Jun 17, 2026
…n -ffdx no longer deletes it ([P1] review, INV-76)

PR #256 review [P1]: the `live-smoke` self-hosted job's `actions/checkout`
defaults to `clean: true` (`git clean -ffdx`), which deletes the gitignored,
machine-local `tests/e2e/e2e.conf` on the persistent self-hosted workspace
before the run step — a labeled live run would die on
`FATAL: matrix not found/readable` instead of proving the matrix.

Fix: resolve the live matrix config OUTSIDE the checkout. The `live-smoke` job
now reads it from the `RUNNER_SMOKE_CONF` repo variable, or a stable per-box
default `$HOME/.config/autonomous-dev-team/e2e.conf` (resolved in shell where
$HOME expands — `vars.*` only expands in the Actions template layer). The
resolved path is exported as `SMOKE_CONF` to `$GITHUB_ENV` (the harness honors
the SMOKE_CONF override), and a preflight step checks readability, emitting a
loud `::error::` + provisioning pointer rather than the opaque harness FATAL when
the operator has not provisioned it.

- tests/unit/test-ci-two-tier-lanes.sh: TC-CI-TIERS-021/022/023 pin the
  out-of-checkout resolution, the $GITHUB_ENV export, and the readability
  preflight (023 scoped to the live-smoke job region).
- docs/pipeline/invariants.md: INV-76 sub-point 4 (config-outside-checkout
  contract) + Tested-by update.
- CONTRIBUTING.md: maintainer one-time-setup note; tests/e2e/e2e.conf.example:
  CI provisioning note.

No behavior change to the hermetic tier or the gate logic.
@kane-coding-agent kane-coding-agent Bot force-pushed the feat/ci-two-tier-lanes branch from d76c4ef to bff56d2 on June 17, 2026 12:02
zxkane added 2 commits June 17, 2026 20:16
…oke label-gated on self-hosted (INV-77, closes #238)

Split ci.yml into two explicit tiers:

- Tier 1 (hermetic): hermetic-unit + hermetic-shellcheck jobs on
  ubuntu-latest, credential-free — unit tests, adapter conformance
  (INV-74), and the stub-mode smoke/metrics/error-envelope self-tests.
  A fork PR gets a fully green, fully meaningful CI with no agent-CLI
  auth. These are the merge-required checks.

- Tier 2 (live): the #222 live agent-smoke matrix (real CLIs) in a new
  `live-smoke` job. Gated by `github.event.label.name == 'run-live-smoke'`
  (pull_request labeled) OR push to main, targeting the self-hosted pool
  via the RUNNER_LABEL ternary. Advisory (non-required): UNAVAILABLE
  (quota) is non-blocking per #222's rc contract; SMOKE evidence goes to
  the job summary.

Security: a fork PR with no label NEVER schedules live-smoke (the `if:`
has no unconditional branch) — applying the maintainer-only label is the
authorization act. Uses plain pull_request, NEVER pull_request_target
(would run untrusted head with base secrets). Threat-model note inline.
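
The trigger rules in this paragraph can be pinned by a structural check. The PR's own test parses the workflow with pyyaml; the sketch below swaps in a plain grep approximation to show the idea, so it is weaker than a real YAML parse. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Returns 0 when the workflow file avoids pull_request_target and mentions the
# labeled activity type. A grep approximation, not a YAML parse.
workflow_trigger_ok() {
  local wf="$1" body
  # Drop comment-only lines so an inline threat-model note cannot trip the check.
  body="$(grep -v '^[[:space:]]*#' "$wf" || true)"
  case "$body" in
    *pull_request_target*)
      echo "FAIL: $wf uses pull_request_target"
      return 1
      ;;
  esac
  # -w matches whole words only, so "unlabeled" does not count as "labeled".
  if ! grep -qw 'labeled' <<<"$body"; then
    echo "FAIL: $wf does not subscribe to the labeled activity type"
    return 1
  fi
  echo "OK: $wf"
}
```

Usage would be `workflow_trigger_ok .github/workflows/ci.yml`, run from the repository root.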

- setup-labels.sh: add the `run-live-smoke` gate label (day-one bootstrap).
- hermetic-shellcheck adds an actionlint step (pull_request_target / syntax
  foot-gun lint) alongside the existing shellcheck.
- tests/unit/test-ci-two-tier-lanes.sh: pyyaml structural truth-table test
  (TC-CI-TIERS-010..051) — hermetic=ubuntu-latest+credential-free, the
  label-OR-push gate, no pull_request_target, labeled type, self-hosted
  runs-on, job summary, and the label entry.
- docs/pipeline/invariants.md: INV-77 (two-tier CI contract).
- CONTRIBUTING.md "What CI runs on your PR"; conformance README cross-link.

Pipeline-docs-gate: setup-labels.sh is a watched script → invariants.md
updated in the same PR (INV-77).

…n -ffdx no longer deletes it ([P1] review, INV-77)

PR #256 review [P1]: the `live-smoke` self-hosted job's `actions/checkout`
defaults to `clean: true` (`git clean -ffdx`), which deletes the gitignored,
machine-local `tests/e2e/e2e.conf` on the persistent self-hosted workspace
before the run step — a labeled live run would die on
`FATAL: matrix not found/readable` instead of proving the matrix.

Fix: resolve the live matrix config OUTSIDE the checkout. The `live-smoke` job
now reads it from the `RUNNER_SMOKE_CONF` repo variable, or a stable per-box
default `$HOME/.config/autonomous-dev-team/e2e.conf` (resolved in shell where
$HOME expands — `vars.*` only expands in the Actions template layer). The
resolved path is exported as `SMOKE_CONF` to `$GITHUB_ENV` (the harness honors
the SMOKE_CONF override), and a preflight step checks readability, emitting a
loud `::error::` + provisioning pointer rather than the opaque harness FATAL when
the operator has not provisioned it.

- tests/unit/test-ci-two-tier-lanes.sh: TC-CI-TIERS-021/022/023 pin the
  out-of-checkout resolution, the $GITHUB_ENV export, and the readability
  preflight (023 scoped to the live-smoke job region).
- docs/pipeline/invariants.md: INV-77 sub-point 4 (config-outside-checkout
  contract) + Tested-by update.
- CONTRIBUTING.md: maintainer one-time-setup note; tests/e2e/e2e.conf.example:
  CI provisioning note.

No behavior change to the hermetic tier or the gate logic.
@kane-coding-agent kane-coding-agent Bot force-pushed the feat/ci-two-tier-lanes branch from bff56d2 to 33da9fb on June 17, 2026 12:25
@kane-review-agent kane-review-agent Bot added the run-live-smoke label Jun 17, 2026

@kane-review-agent kane-review-agent Bot left a comment

Review reached a blocking FAIL verdict — see the Review findings: comment on issue #238 for the full list of blocking findings and remediation steps. This PR is sent back to development; reviewDecision is set to CHANGES_REQUESTED until the findings are addressed and a new review passes (INV-52).

@kane-review-agent kane-review-agent Bot left a comment

Review reached a blocking FAIL verdict — see the Review findings: comment on issue #238 for the full list of blocking findings and remediation steps. This PR is sent back to development; reviewDecision is set to CHANGES_REQUESTED until the findings are addressed and a new review passes (INV-52).

…repo variable — works on the ephemeral autoscaling pool (INV-77, #238)

PR #256 review [P1] (cycle 11): the labeled live-smoke dry run failed the
preflight on the shared self-hosted pool because the matrix config did not
exist on the runner. Root cause: the pool is an EPHEMERAL autoscaling spot
fleet (the dry run landed on runner i-004c…, a different box than the
dispatcher's SSM target), so a per-box file at
$HOME/.config/autonomous-dev-team/e2e.conf does NOT survive pool churn — a
labeled run lands on a fresh runner with no matrix.

Fix: make the lane self-provisioning. The preflight now resolves the matrix
from the first of three sources:
  1. RUNNER_SMOKE_CONF repo variable — a PATH to a runner-local file (existing).
  2. SMOKE_MATRIX repo variable — the matrix CONTENT, materialized at job time
     to a runner temp file (mktemp under $RUNNER_TEMP, outside the checkout).
     A repo variable travels with the repo, so ANY pool runner materializes the
     same matrix — no per-box provisioning. (NEW — the self-provisioning fix.)
  3. $HOME/.config/autonomous-dev-team/e2e.conf — per-box default (existing).
If none resolve, the preflight FAILs loud with a pointer naming all three.
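
The three-source precedence can be sketched as a single shell function. Assumptions: `RUNNER_SMOKE_CONF` and `SMOKE_MATRIX` arrive as environment variables wired from the repo variables of the same name; an unreadable `RUNNER_SMOKE_CONF` falls through to the next source (the commit does not say whether it fails instead); the function name, the temp-file template and the error wording are hypothetical.

```shell
#!/usr/bin/env bash
# Print the path of the first matrix source that resolves, or fail loudly.
resolve_matrix() {
  local default_conf="$HOME/.config/autonomous-dev-team/e2e.conf" tmp

  # 1. Path to a runner-local file.
  if [ -n "${RUNNER_SMOKE_CONF:-}" ] && [ -r "$RUNNER_SMOKE_CONF" ]; then
    printf '%s\n' "$RUNNER_SMOKE_CONF"
    return 0
  fi

  # 2. Matrix content, materialized to a temp file outside the checkout.
  if [ -n "${SMOKE_MATRIX:-}" ]; then
    tmp="$(mktemp "${RUNNER_TEMP:-/tmp}/smoke-matrix.XXXXXX")" || return 1
    printf '%s\n' "$SMOKE_MATRIX" > "$tmp"   # quoted: the content is data, never code
    printf '%s\n' "$tmp"
    return 0
  fi

  # 3. Per-box default.
  if [ -r "$default_conf" ]; then
    printf '%s\n' "$default_conf"
    return 0
  fi

  echo "::error::no live matrix: set RUNNER_SMOKE_CONF or SMOKE_MATRIX, or provision $default_conf" >&2
  return 1
}
```

Only source 2 survives pool churn without per-box setup, since the content travels with the repo rather than with the runner.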

Injection-safe: SMOKE_MATRIX is wired into `env:` via ${{ vars.SMOKE_MATRIX }}
and consumed in the run block only as the quoted shell var "$SMOKE_MATRIX" —
never ${{ }}-inlined into a run: command. Maintainer-only (a repo variable
needs write access); must not carry secrets (Bedrock entries use the runner
instance role). Verified end-to-end on the self-hosted box: materialize →
run-agent-smoke.sh → SMOKE claude PASS / kiro PASS / agy UNAVAILABLE
(quota-exhausted), SUMMARY pass=2 fail=0 unavailable=1, exit 0.

- tests/unit/test-ci-two-tier-lanes.sh: TC-CI-TIERS-024/025 pin the
  SMOKE_MATRIX self-provisioning branch + mktemp materialization outside the
  checkout (18/18).
- docs/pipeline/invariants.md: INV-77 sub-point 4 rewritten — three-source
  precedence; the pool-churn rationale; the injection-safe/secret-free contract.
- CONTRIBUTING.md / tests/e2e/e2e.conf.example / docs/designs: provisioning
  guidance updated to the SMOKE_MATRIX-first precedence.

No change to the gate logic or the hermetic tier.
@kane-review-agent kane-review-agent Bot added and removed the run-live-smoke label Jun 17, 2026

@kane-review-agent kane-review-agent Bot left a comment

Review reached a blocking FAIL verdict — see the Review findings: comment on issue #238 for the full list of blocking findings and remediation steps. This PR is sent back to development; reviewDecision is set to CHANGES_REQUESTED until the findings are addressed and a new review passes (INV-52).

… summary (INV-77, #238)

Two PR #256 review [P1]s (cycle 12):

1. Fork supply-chain: the provisioning guidance told maintainers to seed
   SMOKE_MATRIX / the per-box matrix from the CHECKED-OUT
   tests/e2e/e2e.conf.example. On a labeled fork PR that file is attacker head
   content, and run-agent-smoke.sh `eval`s each entry's env-setup on the
   self-hosted runner — so following the docs could persist arbitrary shell on
   the runner. Fix: all CI-bootstrap pointers (the preflight job-summary,
   CONTRIBUTING.md, e2e.conf.example) now seed from a TRUSTED `main` template
   (`gh api …/contents/…?ref=main | base64 -d`), never the PR checkout, with a
   review-before-use warning. (The local-dev `cp …example` for one's own machine
   is unaffected — no fork/runner trust boundary there.)

2. Requirement drift (Keesan12): an unlabeled PR emitted NO live-tier summary
   because summaries lived only inside the label-gated live-smoke job. Fix: a new
   always-on `live-smoke-status` job — hermetic (ubuntu-latest, credential-free),
   no label gate, never fails — writes a non-failing $GITHUB_STEP_SUMMARY stating
   whether the live tier was scheduled or intentionally skipped pending a
   maintainer `run-live-smoke` label. Reads event context via env vars (never
   ${{ }}-inlined into run:).
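
A minimal sketch of what the always-on status step in item 2 could look like. `EVENT_NAME`, `LABEL_NAME` and `REF` are hypothetical variable names, assumed to be set through `env:` from the github context; the function name and summary wording are illustrative as well.

```shell
#!/usr/bin/env bash
# Append a live-tier status note to the job summary. Never fails the job.
write_live_tier_status() {
  local summary="${GITHUB_STEP_SUMMARY:-/dev/stdout}" state
  if [ "${EVENT_NAME:-}" = "pull_request" ] && [ "${LABEL_NAME:-}" = "run-live-smoke" ]; then
    state="scheduled (run-live-smoke label applied)"
  elif [ "${EVENT_NAME:-}" = "push" ] && [ "${REF:-}" = "refs/heads/main" ]; then
    state="scheduled (push to main)"
  else
    state="intentionally skipped pending a maintainer run-live-smoke label"
  fi
  {
    echo "### Live tier"
    echo
    echo "live-smoke: $state"
  } >> "$summary"
  return 0   # informational only
}
```

The event context is read from variables rather than spliced into the script text, so a crafted label or ref cannot change what the step executes.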

- tests/unit/test-ci-two-tier-lanes.sh: TC-CI-TIERS-026/027 (always-on status job
  exists + is hermetic) and TC-CI-TIERS-028 (bootstrap pointer is fork-safe:
  ref=main, no checkout cp). 21/21.
- docs/pipeline/invariants.md: INV-77 sub-points 5 (trusted-template bootstrap)
  and 6 (always-on status summary); Producer + Tested-by updated.
- CONTRIBUTING.md / tests/e2e/e2e.conf.example: fork-safety warning + ref=main
  seeding.
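
The ref=main seeding can be sketched as a small helper around `gh api`. The repo slug and file path are caller-supplied placeholders (the commit elides the real endpoint, so none is filled in here), and the function name and review file path are hypothetical.

```shell
#!/usr/bin/env bash
# Fetch a file's content from the main branch through the GitHub contents API
# instead of reading it from the PR checkout.
fetch_trusted_template() {
  local repo="$1" path="$2"
  # The contents API returns the file base64-encoded in the .content field.
  gh api "repos/${repo}/contents/${path}?ref=main" --jq '.content' | base64 -d
}

# Usage (needs an authenticated gh; review the output before using it):
#   fetch_trusted_template OWNER/REPO tests/e2e/e2e.conf.example > /tmp/e2e.conf.review
```

Pinning `ref=main` is the point: the fetched template is what maintainers already merged, not what a labeled fork PR placed in the workspace.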

No change to the gate logic or the hermetic merge-required set.
@kane-review-agent kane-review-agent Bot added and removed the run-live-smoke label Jun 17, 2026
Labels

run-live-smoke Maintainer gate — run the live agent-smoke CI tier on the self-hosted runner (issue #238, INV-77)

Development

Successfully merging this pull request may close these issues.

ci: two-tier test lanes — hermetic conformance always-on for forks; live agent smoke label-gated on self-hosted
