| name | ship |
|---|---|
| preamble-tier | 4 |
| version | 1.0.0 |
| description | Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. Use when asked to "ship", "deploy", "push to main", "create a PR", "merge and push", or "get it deployed". Proactively invoke this skill (do NOT push/PR directly) when the user says code is ready, asks about deploying, wants to push code up, or asks to create a PR. (gstack) |
| allowed-tools | |
| sensitive | true |
| triggers | |
Caution
Do not touch — imported from gstack. Editing this file forfeits clean upgrades.
Generated by .github-gstack-intelligence/lifecycle/refresh.ts.
Source: garrytan/gstack @ ref main from ship/SKILL.md.tmpl.
This copy is adapted for GitHub-native execution and refresh-time extraction.
Re-run run-refresh-gstack to pull upstream gstack changes back into this repository.
- This is the extracted `/ship` skill prompt committed into the repository at refresh time.
- Inject GitHub workflow context directly in the invoking lifecycle code instead of relying on local preamble expansion.
- Replace interactive approval steps with issue or pull-request comments plus a follow-up GitHub event.
- Use repository-local reference files under `.github-gstack-intelligence/skills/references/` instead of `.github-gstack-intelligence/skills/...` paths.
Use the GitHub event payload, checked-out refs, and repository default branch to determine the review base branch.
You are running the /ship workflow. This is a non-interactive, fully automated workflow. Do NOT ask for confirmation at any step. The user said /ship which means DO IT. Run straight through and output the PR URL at the end.
Only stop for:
- On the base branch (abort)
- Merge conflicts that can't be auto-resolved (stop, show conflicts)
- In-branch test failures (pre-existing failures are triaged, not auto-blocking)
- Pre-landing review finds ASK items that need user judgment
- MINOR or MAJOR version bump needed (ask — see Step 12)
- Greptile review comments that need user decision (complex fixes, false positives)
- AI-assessed coverage below minimum threshold (hard gate with user override — see Step 7)
- Plan items NOT DONE with no user override (see Step 8)
- Plan verification failures (see Step 8.1)
- TODOS.md missing and user wants to create one (ask — see Step 14)
- TODOS.md disorganized and user wants to reorganize (ask — see Step 14)
Never stop for:
- Uncommitted changes (always include them)
- Version bump choice (auto-pick MICRO or PATCH — see Step 12)
- CHANGELOG content (auto-generate from diff)
- Commit message approval (auto-commit)
- Multi-file changesets (auto-split into bisectable commits)
- TODOS.md completed-item detection (auto-mark)
- Auto-fixable review findings (dead code, N+1, stale comments — fixed automatically)
- Test coverage gaps within target threshold (auto-generate and commit, or flag in PR body)
Re-run behavior (idempotency):
Re-running /ship means "run the whole checklist again." Every verification step
(tests, coverage audit, plan completion, pre-landing review, adversarial review,
VERSION/CHANGELOG check, TODOS, document-release) runs on every invocation.
Only actions are idempotent:
- Step 12: If VERSION already bumped, skip the bump but still read the version
- Step 17: If already pushed, skip the push command
- Step 19: If PR exists, update the body instead of creating a new PR
Never skip a verification step because a prior `/ship` run already performed it.
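The "skip the push if already pushed" rule can be checked by comparing the local HEAD with the remote-tracking ref. The following is a minimal sketch, not the gstack implementation: the helper name `needs_push` is hypothetical, and the wiring comment assumes the remote is named `origin`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide whether the Step 17 push is still needed.
# Takes the local commit SHA and the remote branch SHA (empty when the
# branch has never been pushed) and prints "push" or "skip".
needs_push() {
  local local_sha="$1" remote_sha="$2"
  if [ -n "$remote_sha" ] && [ "$local_sha" = "$remote_sha" ]; then
    echo "skip"
  else
    echo "push"
  fi
}

# Example wiring inside a repository (assumes the remote is "origin"):
#   branch=$(git rev-parse --abbrev-ref HEAD)
#   needs_push "$(git rev-parse HEAD)" \
#              "$(git rev-parse --verify --quiet "origin/$branch" || true)"
```

Only the action is skipped here; every verification step still runs on a re-run.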
- Check the current branch. If on the base branch or the repo's default branch, abort: "You're on the base branch. Ship from a feature branch."
- Run `git status` (never use `-uall`). Uncommitted changes are always included, so there is no need to ask.
- Run `git diff <base>...HEAD --stat` and `git log <base>..HEAD --oneline` to understand what's being shipped.
- Check review readiness:
Check for prior review results in .github-gstack-intelligence/state/results/ and GitHub PR review status.
If the Eng Review is NOT "CLEAR":
Print: "No prior eng review found — ship will run its own pre-landing review in Step 9."
Check diff size: git diff <base>...HEAD --stat | tail -1. If the diff is >200 lines, add: "Note: This is a large diff. Consider running /plan-eng-review or /autoplan for architecture-level review before shipping."
If CEO Review is missing, mention as informational ("CEO Review not run — recommended for product changes") but do NOT block.
For Design Review: list the changed files with `git diff --name-only <base> 2>/dev/null` and set `SCOPE_FRONTEND=true` if any of them are frontend files. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run: this PR changes frontend code. The lite design check will run automatically in Step 9, but consider running /design-review for a full visual audit post-implementation." Still never block.
Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9.
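The diff-size check reads the summary line that `git diff --stat | tail -1` prints. A minimal sketch of turning that line into a number follows; the helper names `diff_lines_changed` and `large_diff_note` are hypothetical, and "lines" is assumed to mean insertions plus deletions, which the text does not spell out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum insertions and deletions from a summary line
# such as " 3 files changed, 120 insertions(+), 45 deletions(-)".
# Either count may be missing from the line, so both default to 0.
diff_lines_changed() {
  local summary="$1" ins del
  ins=$(printf '%s\n' "$summary" | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || true)
  del=$(printf '%s\n' "$summary" | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || true)
  echo $(( ${ins:-0} + ${del:-0} ))
}

# Prints a note when the total exceeds the 200-line threshold.
large_diff_note() {
  if [ "$(diff_lines_changed "$1")" -gt 200 ]; then
    echo "Note: This is a large diff."
  fi
}

# Example wiring:
#   large_diff_note "$(git diff <base>...HEAD --stat | tail -1)"
```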
If the diff introduces a new standalone artifact (CLI binary, library package, tool) — not a web service with existing deployment — verify that a distribution pipeline exists.
- Check if the diff adds a new `cmd/` directory, `main.go`, or `bin/` entry point:

  ```bash
  git diff origin/<base> --name-only | grep -E '(cmd/.*/main\.go|bin/|Cargo\.toml|setup\.py|package\.json)' | head -5
  ```

- If new artifact detected, check for a release workflow:

  ```bash
  ls .github/workflows/ 2>/dev/null | grep -iE 'release|publish|dist'
  grep -qE 'release|publish|deploy' .gitlab-ci.yml 2>/dev/null && echo "GITLAB_CI_RELEASE"
  ```

- If no release pipeline exists and a new artifact was added: Use GitHub follow-up comment:
  - "This PR adds a new binary/tool but there's no CI/CD pipeline to build and publish it. Users won't be able to download the artifact after merge."
  - A) Add a release workflow now (CI/CD release pipeline: GitHub Actions or GitLab CI depending on platform)
  - B) Defer: add to TODOS.md
  - C) Not needed: this is internal/web-only, existing deployment covers it
- If release pipeline exists: Continue silently.
- If no new artifact detected: Skip silently.
Fetch and merge the base branch into the feature branch so tests run against the merged state:

```bash
git fetch origin <base> && git merge origin/<base> --no-edit
```

If there are merge conflicts: Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, STOP and show them.
If already up to date: Continue silently.
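One way to decide between auto-resolving and stopping is to classify the conflicted paths first. The sketch below is an illustration only: the helper name `classify_conflicts` is hypothetical, and matching on file name alone is an assumption, so a "simple" result still requires looking at the conflict hunks before resolving them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads conflicted paths on stdin (one per line).
# Prints "simple" if every path is one of the files named above as
# auto-resolvable, "complex" if any other file conflicts, and "none"
# if there are no conflicted paths at all.
classify_conflicts() {
  local path result="simple" seen=0
  while IFS= read -r path; do
    [ -z "$path" ] && continue
    seen=1
    case "$(basename "$path")" in
      VERSION|schema.rb|CHANGELOG|CHANGELOG.md) ;;  # candidates for auto-resolve
      *) result="complex" ;;
    esac
  done
  if [ "$seen" -eq 0 ]; then
    result="none"
  fi
  echo "$result"
}

# Example wiring after a failed merge (lists unmerged paths):
#   git diff --name-only --diff-filter=U | classify_conflicts
```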
Use the repository's existing test setup. If no test framework is detected, note it in the output and continue.
Do NOT run RAILS_ENV=test bin/rails db:migrate — bin/test-lane already calls
db:test:prepare internally, which loads the schema into the correct lane database.
Running bare test migrations without INSTANCE hits an orphan DB and corrupts structure.sql.
Run both test suites in parallel:
```bash
bin/test-lane 2>&1 | tee /tmp/ship_tests.txt &
npm run test 2>&1 | tee /tmp/ship_vitest.txt &
wait
```

After both complete, read the output files and check pass/fail.
If any test fails: Do NOT immediately stop. Apply the Test Failure Ownership Triage:
Triage test failures by checking whether they are pre-existing (present on the base branch) or introduced by the current changes. Pre-existing failures should be noted but not block the workflow.
After triage: If any in-branch failures remain unfixed, STOP. Do not proceed. If all failures were pre-existing and handled (fixed, TODOed, assigned, or skipped), continue to Step 6.
If all pass: Continue silently — just note the counts briefly.
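The ownership triage comes down to a set difference: failures seen on the feature branch minus failures also seen on the base branch. A minimal sketch, assuming the failing test names have already been extracted into two files, one name per line (how they are extracted depends on the test runner and is not specified here); the helper name `in_branch_failures` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given two files listing failing test names,
# print the failures that occur only on the feature branch.
in_branch_failures() {
  local branch_failures="$1" base_failures="$2"
  # comm -23 keeps lines unique to the first input; both inputs must be sorted.
  comm -23 <(sort -u "$branch_failures") <(sort -u "$base_failures")
}

# Example wiring (file names are placeholders):
#   in_branch_failures /tmp/branch_failures.txt /tmp/base_failures.txt
```

Any name this prints is an in-branch failure and blocks the workflow; names present in both files are pre-existing and are only noted.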
Evals are mandatory when prompt-related files change. Skip this step entirely if no prompt files are in the diff.
1. Check if the diff touches prompt-related files:

```bash
git diff origin/<base> --name-only
```

Match against these patterns (from CLAUDE.md):
- `app/services/*_prompt_builder.rb`
- `app/services/*_generation_service.rb`, `*_writer_service.rb`, `*_designer_service.rb`
- `app/services/*_evaluator.rb`, `*_scorer.rb`, `*_classifier_service.rb`, `*_analyzer.rb`
- `app/services/concerns/*voice*.rb`, `*writing*.rb`, `*prompt*.rb`, `*token*.rb`
- `app/services/chat_tools/*.rb`, `app/services/x_thread_tools/*.rb`
- `config/system_prompts/*.txt`
- `test/evals/**/*` (eval infrastructure changes affect all suites)
If no matches: Print "No prompt-related files changed — skipping evals." and continue to Step 9.
2. Identify affected eval suites:
Each eval runner (test/evals/*_eval_runner.rb) declares PROMPT_SOURCE_FILES listing which source files affect it. Grep these to find which suites match the changed files:
```bash
grep -l "changed_file_basename" test/evals/*_eval_runner.rb
```

Map runner → test file: `post_generation_eval_runner.rb` → `post_generation_eval_test.rb`.
Special cases:
- Changes to `test/evals/judges/*.rb`, `test/evals/support/*.rb`, or `test/evals/fixtures/` affect ALL suites that use those judges/support files. Check imports in the eval test files to determine which.
- Changes to `config/system_prompts/*.txt`: grep eval runners for the prompt filename to find affected suites.
- If unsure which suites are affected, run ALL suites that could plausibly be impacted. Over-testing is better than missing a regression.
3. Run affected suites at EVAL_JUDGE_TIER=full:
/ship is a pre-merge gate, so always use full tier (Sonnet structural + Opus persona judges).
```bash
EVAL_JUDGE_TIER=full EVAL_VERBOSE=1 bin/test-lane --eval test/evals/<suite>_eval_test.rb 2>&1 | tee /tmp/ship_evals.txt
```

If multiple suites need to run, run them sequentially (each needs a test lane). If the first suite fails, stop immediately; don't burn API cost on remaining suites.
4. Check results:
- If any eval fails: Show the failures, the cost dashboard, and STOP. Do not proceed.
- If all pass: Note pass counts and cost. Continue to Step 9.
5. Save eval output — include eval results and cost dashboard in the PR body (Step 19).
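The runner-to-test-file mapping in step 2 is a suffix swap, which shell parameter expansion handles directly. A small sketch, with the hypothetical helper name `runner_to_test_file`:

```shell
#!/usr/bin/env bash
# Map an eval runner path to its test file, following the naming pattern
# <suite>_eval_runner.rb -> <suite>_eval_test.rb.
runner_to_test_file() {
  local runner="$1"
  # ${var%suffix} strips the shortest matching suffix from the value.
  echo "${runner%_eval_runner.rb}_eval_test.rb"
}

# Example wiring with the grep from step 2:
#   grep -l "changed_file_basename" test/evals/*_eval_runner.rb |
#     while IFS= read -r runner; do runner_to_test_file "$runner"; done
```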
Tier reference (for context — /ship always uses full):
| Tier | When | Speed (cached) | Cost |
|---|---|---|---|
| `fast` (Haiku) | Dev iteration, smoke tests | ~5s (14x faster) | ~$0.07/run |
| `standard` (Sonnet) | Default dev, `bin/test-lane --eval` | ~17s (4x faster) | ~$0.37/run |
| `full` (Opus persona) | `/ship` and pre-merge | ~72s (baseline) | ~$1.27/run |
Dispatch this step as a subagent using the Agent tool with subagent_type: "general-purpose". The subagent runs the coverage audit in a fresh context window — the parent only sees the conclusion, not intermediate file reads. This is context-rot defense.
Subagent prompt: Pass the following instructions to the subagent, with <base> substituted with the base branch:
You are running a ship-workflow test coverage audit. Run `git diff <base>...HEAD` as needed. Do not commit or push; report only.

Use the checked-out repository diff and existing tests to reason about coverage; persist only the final GitHub-native findings.
After your analysis, output a single JSON object on the LAST LINE of your response (no other text after it):
{"coverage_pct":N,"gaps":N,"diagram":"<full markdown coverage diagram for PR body>","tests_added":["path",...]}
Parent processing:
- Read the subagent's final output. Parse the LAST line as JSON.
- Store `coverage_pct` (for Step 20 metrics), `gaps` (user summary), `tests_added` (for the commit).
- Embed `diagram` verbatim in the PR body's `## Test Coverage` section (Step 19).
- Print a one-line summary: `Coverage: {coverage_pct}%, {gaps} gaps. {tests_added.length} tests added.`
If the subagent fails, times out, or returns invalid JSON: Fall back to running the audit inline in the parent. Do not block /ship on subagent failure — partial results are better than none.
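The parent-side parsing can be sketched with `jq`, which the version step already relies on. This is an illustration under assumptions, not the actual parent logic: the helper names are hypothetical, and it takes the last non-empty line rather than the strict last line so that a trailing blank line does not trigger the fallback.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the subagent's full output on stdin and print
# the final JSON object only if it carries the fields the parent needs.
# Returns non-zero on missing or invalid JSON so the caller can fall back
# to running the audit inline.
parse_coverage_result() {
  local last
  last=$(grep -v '^[[:space:]]*$' | tail -n 1)
  printf '%s\n' "$last" | jq -e -c \
    'select(type == "object" and has("coverage_pct") and has("gaps") and has("tests_added"))' \
    2>/dev/null
}

# Hypothetical helper: format the one-line summary from the parsed JSON.
coverage_summary() {
  jq -r '"Coverage: \(.coverage_pct)%, \(.gaps) gaps. \(.tests_added | length) tests added."'
}

# Example wiring (the output file name is a placeholder):
#   if result=$(parse_coverage_result < /tmp/subagent_output.txt); then
#     printf '%s\n' "$result" | coverage_summary
#   else
#     echo "Subagent output unusable; running the coverage audit inline."
#   fi
```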
Dispatch this step as a subagent using the Agent tool with subagent_type: "general-purpose". The subagent reads the plan file and every referenced code file in its own fresh context. Parent gets only the conclusion.
Subagent prompt: Pass these instructions to the subagent:
You are running a ship-workflow plan completion audit. The base branch is `<base>`. Use `git diff <base>...HEAD` to see what shipped. Do not commit or push; report only.

Check whether the implementation matches the stated plan. If no plan artifact exists, use the PR body and issue context as the plan reference.
After your analysis, output a single JSON object on the LAST LINE of your response (no other text after it):
{"total_items":N,"done":N,"changed":N,"deferred":N,"unverifiable":N,"summary":"<markdown checklist for PR body>"}
Parent processing:
- Parse the LAST line of the subagent's output as JSON.
- Store `done`, `deferred`, `unverifiable` for Step 20 metrics; use `summary` in PR body.
- If `deferred > 0` or `unverifiable > 0` and no user override, present the items via the appropriate GitHub follow-up comment (see Gate Logic priority order above) before continuing.
- Embed `summary` in PR body's `## Plan Completion` section (Step 19). If `unverifiable > 0` and the user picked option A in the UNVERIFIABLE gate, also embed `## Plan Completion — Manual Verifications` listing each user-confirmed item.
If the subagent fails or returns invalid JSON: Fall back to running the audit inline (parent processes the same plan-extraction + classification logic). If the inline fallback also fails (e.g., plan file unreadable, parser error), do NOT silently pass — surface the failure as an explicit GitHub follow-up comment: "Plan Completion audit could not run ({reason}). Options: (A) Skip audit and ship anyway — record that the audit was skipped in PR body and Step 20 metrics; (B) Stop and fix the audit." Default and recommended option is (B). Silent fail-open is the failure shape that VAS-449 surfaced.
Verify that the implementation matches the stated plan by comparing the diff against any plan artifacts, issue descriptions, or PR bodies.
{{LEARNINGS_SEARCH:query=release ship version changelog merge pr}}
Monitor for scope drift by comparing the current diff against the original plan or PR description. Flag changes that extend beyond the stated scope.
Review the diff for structural issues that tests don't catch.
- Read `.github-gstack-intelligence/skills/references/review-checklist.md`. If the file cannot be read, STOP and report the error.
- Run `git diff origin/<base>` to get the full diff (scoped to feature changes against the freshly-fetched base branch).
- Apply the review checklist in two passes:
  - Pass 1 (CRITICAL): SQL & Data Safety, LLM Output Trust Boundary
  - Pass 2 (INFORMATIONAL): All remaining categories
Use the same confidence-gated reporting thresholds, but publish findings through GitHub comments and repository-local state instead of local CLI interactions.
If frontend files changed, use .github-gstack-intelligence/skills/references/review-design-checklist.md as the design-review-lite source.
Include any design findings alongside the code review findings. They follow the same Fix-First flow below.
In CI mode, a single reviewer handles all categories. If the diff is large (>500 lines) or touches security-sensitive files, note in the output that a second human review is recommended for the affected areas. Do not attempt to spawn parallel reviewers.
Before reporting findings, deduplicate: if the same file:line appears in multiple checklist categories, merge into a single finding with the highest severity. Group related findings (e.g., missing null check + missing test for that path) into one actionable item.
- Classify each finding from both the checklist pass and specialist review (Step 9.1-Step 9.2) as AUTO-FIX or ASK per the Fix-First Heuristic in checklist.md. Critical findings lean toward ASK; informational lean toward AUTO-FIX.
- Auto-fix all AUTO-FIX items. Apply each fix. Output one line per fix: `[AUTO-FIXED] [file:line] Problem → what you did`
- If ASK items remain, present them in ONE GitHub follow-up comment:
  - List each with number, severity, problem, recommended fix
  - Per-item options: A) Fix B) Skip
  - Overall RECOMMENDATION
  - If 3 or fewer ASK items, you may use individual GitHub follow-up comment calls instead
- After all fixes (auto + user-approved):
  - If ANY fixes were applied: commit fixed files by name (`git add <fixed-files> && git commit -m "fix: pre-landing review fixes"`), then STOP and tell the user to run `/ship` again to re-test.
  - If no fixes applied (all ASK items skipped, or no issues found): continue to Step 12.
- Output summary: `Pre-Landing Review: N issues — M auto-fixed, K asked (J fixed, L skipped)`

  If no issues found: `Pre-Landing Review: No issues found.`
- Persist the review result to the review log:

  ```
  .github-gstack-intelligence/state/results/review/review-log.json '{"skill":"review","timestamp":"TIMESTAMP","status":"STATUS","issues_found":N,"critical":N,"informational":N,"quality_score":SCORE,"specialists":SPECIALISTS_JSON,"findings":FINDINGS_JSON,"commit":"'"$(git rev-parse --short HEAD)"'","via":"ship"}'
  ```

  Substitute TIMESTAMP (ISO 8601), STATUS ("clean" if no issues, "issues_found" otherwise), and N values from the summary counts above. The `via:"ship"` distinguishes from standalone `/review` runs.
  - `quality_score` = the PR Quality Score computed in Step 9.2 (e.g., 7.5). If specialists were skipped (small diff), use `10.0`.
  - `specialists` = the per-specialist stats object compiled in Step 9.2. Each specialist that was considered gets an entry: `{"dispatched":true/false,"findings":N,"critical":N,"informational":N}` if dispatched, or `{"dispatched":false,"reason":"scope|gated"}` if skipped. Example: `{"testing":{"dispatched":true,"findings":2,"critical":0,"informational":2},"security":{"dispatched":false,"reason":"scope"}}`
  - `findings` = array of per-finding records. For each finding (from checklist pass and specialists), include: `{"fingerprint":"path:line:category","severity":"CRITICAL|INFORMATIONAL","action":"ACTION"}`. ACTION is `"auto-fixed"`, `"fixed"` (user approved), or `"skipped"` (user chose Skip).
Save the review output — it goes into the PR body in Step 19.
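The deduplication rule in the review step (the same file:line reported under several checklist categories merges into one finding at the highest severity) can be sketched with `awk`. The input format `file:line|SEVERITY|category` and the helper name `dedupe_findings` are assumptions made for this illustration; the review itself does not prescribe a format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads findings as "file:line|SEVERITY|category" on
# stdin and prints one line per file:line, keeping the highest severity
# (CRITICAL outranks INFORMATIONAL) and joining the categories with commas.
# Output order follows the first appearance of each file:line.
dedupe_findings() {
  awk -F'|' '
    function rank(s) { return (s == "CRITICAL") ? 2 : 1 }
    {
      key = $1
      if (!(key in sev)) {
        order[++n] = key; sev[key] = $2; cats[key] = $3
      } else {
        if (rank($2) > rank(sev[key])) sev[key] = $2
        cats[key] = cats[key] "," $3
      }
    }
    END {
      for (i = 1; i <= n; i++)
        printf "%s|%s|%s\n", order[i], sev[order[i]], cats[order[i]]
    }
  '
}
```

Grouping related findings on different lines (for example a missing null check and its missing test) still needs judgment and is not covered by this sketch.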
Dispatch the fetch + classification as a subagent using the Agent tool with subagent_type: "general-purpose". The subagent pulls every Greptile comment, runs the escalation detection algorithm, and classifies each comment. Parent receives a structured list and handles user interaction + file edits.
Subagent prompt:
You are classifying Greptile review comments for a /ship workflow. Read `.claude/skills/review/greptile-triage.md` and follow the fetch, filter, classify, and escalation detection steps. Do NOT fix code, do NOT reply to comments, do NOT commit; report only.

For each comment, assign: `classification` (`valid_actionable`, `already_fixed`, `false_positive`, `suppressed`), `escalation_tier` (1 or 2), the file:line or [top-level] tag, body summary, and permalink URL.

If no PR exists, `gh` fails, the API errors, or there are zero comments, output: `{"total":0,"comments":[]}` and stop.

Otherwise, output a single JSON object on the LAST LINE of your response:
{"total":N,"comments":[{"classification":"...","escalation_tier":N,"ref":"file:line","summary":"...","permalink":"url"},...]}
Parent processing:
Parse the LAST line as JSON.
If total is 0, skip this step silently. Continue to Step 12.
Otherwise, print: + {total} Greptile comments ({valid_actionable} valid, {already_fixed} already fixed, {false_positive} FP).
For each comment in comments:
VALID & ACTIONABLE: Use GitHub follow-up comment with:
- The comment (file:line or [top-level] + body summary + permalink URL)
- `RECOMMENDATION: Choose A because [one-line reason]`
- Options: A) Fix now, B) Acknowledge and ship anyway, C) It's a false positive
- If user chooses A: apply the fix, commit the fixed files (`git add <fixed-files> && git commit -m "fix: address Greptile review — <brief description>"`), reply using the Fix reply template from greptile-triage.md (include inline diff + explanation), and save to both per-project and global greptile-history (type: fix).
- If user chooses C: reply using the False Positive reply template from greptile-triage.md (include evidence + suggested re-rank), save to both per-project and global greptile-history (type: fp).
VALID BUT ALREADY FIXED: Reply using the Already Fixed reply template from greptile-triage.md — no GitHub follow-up comment needed:
- Include what was done and the fixing commit SHA
- Save to both per-project and global greptile-history (type: already-fixed)
FALSE POSITIVE: Use GitHub follow-up comment:
- Show the comment and why you think it's wrong (file:line or [top-level] + body summary + permalink URL)
- Options:
- A) Reply to Greptile explaining the false positive (recommended if clearly wrong)
- B) Fix it anyway (if trivial)
- C) Ignore silently
- If user chooses A: reply using the False Positive reply template from greptile-triage.md (include evidence + suggested re-rank), save to both per-project and global greptile-history (type: fp)
SUPPRESSED: Skip silently — these are known false positives from previous triage.
After all comments are resolved: If any fixes were applied, the tests from Step 5 are now stale. Re-run tests (Step 5) before continuing to Step 12. If no fixes were applied, continue to Step 12.
Before finalizing findings, do one adversarial pass that tries to disprove each claim using the checked-out code and current GitHub context.
Persist durable outcomes in .github-gstack-intelligence/state/results/ when the lifecycle layer is ready to store them.
The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.
Pick ONE keyword that names the headline feature you're shipping. The keyword should be a noun: the primary skill or module name, the central feature noun, or the binary you changed. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
Worked examples (ship-specific): good keywords are learnings-search, pacing, worktree-ship. Bad: the branch headline, v1.31.1.0, feat: token-or search.
```bash
.github-gstack-intelligence/skills/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
```

If any learnings come back, name which one applies to the version bump or CHANGELOG framing in one sentence. If none come back, continue without reference; the absence is itself useful information.
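The keyword constraint (alphanumeric or hyphen only) is easy to check before calling the search tool. A minimal sketch, with the hypothetical helper name `is_valid_keyword`:

```shell
#!/usr/bin/env bash
# Returns 0 if the keyword contains only letters, digits, or hyphens,
# and non-zero otherwise (including for an empty keyword).
is_valid_keyword() {
  [[ "$1" =~ ^[A-Za-z0-9-]+$ ]]
}

# Example: reject a candidate before it reaches the search command.
#   is_valid_keyword "$keyword" || echo "Simplify '$keyword' to its alphanumeric stem."
```

With the worked examples above, `learnings-search` and `worktree-ship` pass, while `v1.31.1.0` (dots) and `feat: token-or search` (colon and spaces) are rejected.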
Idempotency check: Before bumping, classify the state by comparing VERSION against the base branch AND against package.json's version field. Four states: FRESH (do bump), ALREADY_BUMPED (skip bump), DRIFT_STALE_PKG (sync pkg only, no re-bump), DRIFT_UNEXPECTED (stop and ask).
```bash
if ! git rev-parse --verify origin/<base> >/dev/null 2>&1; then
  echo "ERROR: Unable to resolve origin/<base>. Run 'git fetch origin' or verify the base branch exists."
  exit 1
fi
BASE_VERSION=$(git show origin/<base>:VERSION 2>/dev/null | tr -d '\r\n[:space:]' || echo "0.0.0.0")
CURRENT_VERSION=$(cat VERSION 2>/dev/null | tr -d '\r\n[:space:]' || echo "0.0.0.0")
[ -z "$BASE_VERSION" ] && BASE_VERSION="0.0.0.0"
[ -z "$CURRENT_VERSION" ] && CURRENT_VERSION="0.0.0.0"
PKG_VERSION=""
PKG_EXISTS=0
if [ -f package.json ]; then
  PKG_EXISTS=1
  if command -v node >/dev/null 2>&1; then
    PKG_VERSION=$(node -e 'const p=require("./package.json");process.stdout.write(p.version||"")' 2>/dev/null)
    PARSE_EXIT=$?
  elif command -v bun >/dev/null 2>&1; then
    PKG_VERSION=$(bun -e 'const p=require("./package.json");process.stdout.write(p.version||"")' 2>/dev/null)
    PARSE_EXIT=$?
  else
    echo "ERROR: package.json exists but neither node nor bun is available. Install one and re-run."
    exit 1
  fi
  if [ "$PARSE_EXIT" != "0" ]; then
    echo "ERROR: package.json is not valid JSON. Fix the file before re-running /ship."
    exit 1
  fi
fi
echo "BASE: $BASE_VERSION VERSION: $CURRENT_VERSION package.json: ${PKG_VERSION:-<none>}"
if [ "$CURRENT_VERSION" = "$BASE_VERSION" ]; then
  if [ "$PKG_EXISTS" = "1" ] && [ -n "$PKG_VERSION" ] && [ "$PKG_VERSION" != "$CURRENT_VERSION" ]; then
    echo "STATE: DRIFT_UNEXPECTED"
    echo "package.json version ($PKG_VERSION) disagrees with VERSION ($CURRENT_VERSION) while VERSION matches base."
    echo "This looks like a manual edit to package.json bypassing /ship. Reconcile manually, then re-run."
    exit 1
  fi
  echo "STATE: FRESH"
else
  if [ "$PKG_EXISTS" = "1" ] && [ -n "$PKG_VERSION" ] && [ "$PKG_VERSION" != "$CURRENT_VERSION" ]; then
    echo "STATE: DRIFT_STALE_PKG"
  else
    echo "STATE: ALREADY_BUMPED"
  fi
fi
```

Read the `STATE:` line and dispatch:
- FRESH → proceed with the bump action below (steps 1–4).
- ALREADY_BUMPED → skip the bump by default, BUT check for queue drift first: call `bin/gstack-next-version` with the implied bump level (derived from `CURRENT_VERSION` vs `BASE_VERSION`), compare its `.version` against `CURRENT_VERSION`. If they differ (queue moved since last ship), use GitHub follow-up comment: "VERSION drift detected: you claim v but next available is v (queue moved). A) Rebump to v and rewrite CHANGELOG header + PR title (recommended), B) Keep v (will be rejected by CI version-gate until resolved)." If A, treat this as FRESH with `NEW_VERSION=<new>` and run steps 1-4 (which will also trigger Step 13 CHANGELOG header rewrite and Step 19 PR title rewrite). If B, reuse `CURRENT_VERSION` and warn that CI will likely reject. If util is offline, warn and reuse `CURRENT_VERSION`.
- DRIFT_STALE_PKG → a prior `/ship` bumped `VERSION` but failed to update `package.json`. Run the sync-only repair block below (after step 4). Do NOT re-bump. Reuse `CURRENT_VERSION` for CHANGELOG and PR body. (Queue check still runs in ALREADY_BUMPED terms after repair.)
- DRIFT_UNEXPECTED → `/ship` has halted (exit 1). Resolve manually; /ship cannot tell which file is authoritative.
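When the queue utility is unavailable, the bump falls back to local arithmetic on `BASE_VERSION` at the chosen level. The sketch below shows one way to do that for the 4-digit `MAJOR.MINOR.PATCH.MICRO` format. The helper name `bump_version` is hypothetical, and resetting every digit to the right of the bumped one to 0 is an assumption: the text here does not state the reset rule.

```shell
#!/usr/bin/env bash
# Hypothetical helper for the offline fallback: bump a 4-part version at
# the given level (major, minor, patch, or micro).
# ASSUMPTION: digits to the right of the bumped one reset to 0.
bump_version() {
  local version="$1" level="$2" major minor patch micro
  IFS=. read -r major minor patch micro <<< "$version"
  case "$level" in
    major) major=$((major + 1)); minor=0; patch=0; micro=0 ;;
    minor) minor=$((minor + 1)); patch=0; micro=0 ;;
    patch) patch=$((patch + 1)); micro=0 ;;
    micro) micro=$((micro + 1)) ;;
    *) echo "unknown bump level: $level" >&2; return 1 ;;
  esac
  echo "$major.$minor.$patch.$micro"
}

# Example wiring with the variables used in this step:
#   NEW_VERSION=$(bump_version "$BASE_VERSION" "$BUMP_LEVEL")
```

The result should still pass the `MAJOR.MINOR.PATCH.MICRO` pattern check before it is written to `VERSION`.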
1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`)
2. Auto-decide the bump level based on the diff:
   - Count lines changed (`git diff origin/<base>...HEAD --stat | tail -1`)
   - Check for feature signals: new route/page files (e.g. `app/*/page.tsx`, `pages/*.ts`), new DB migration/schema files, new test files alongside new source files, or branch name starting with `feat/`
   - MICRO (4th digit): < 50 lines changed, trivial tweaks, typos, config
   - PATCH (3rd digit): 50+ lines changed, no feature signals detected
   - MINOR (2nd digit): ASK the user if ANY feature signal is detected, OR 500+ lines changed, OR new modules/packages added
   - MAJOR (1st digit): ASK the user; only for milestones or breaking changes

   Save the chosen level as `BUMP_LEVEL` (one of `major`, `minor`, `patch`, `micro`). This is the user-intended level. The next step decides placement: the level stays the same even if queue-aware allocation has to advance past a claimed slot.
3. Queue-aware version pick (workspace-aware ship, v1.6.4.0+). Call `bin/gstack-next-version` to see what's already claimed by open PRs + active sibling Conductor worktrees, then render the queue state to the user:

   ```bash
   QUEUE_JSON=$(bun run bin/gstack-next-version \
     --base <base> \
     --bump "$BUMP_LEVEL" \
     --current-version "$BASE_VERSION" 2>/dev/null || echo '{"offline":true}')
   NEW_VERSION=$(echo "$QUEUE_JSON" | jq -r '.version // empty')
   CLAIMED_COUNT=$(echo "$QUEUE_JSON" | jq -r '.claimed | length')
   ACTIVE_SIBLING_COUNT=$(echo "$QUEUE_JSON" | jq -r '.active_siblings | length')
   OFFLINE=$(echo "$QUEUE_JSON" | jq -r '.offline // false')
   REASON=$(echo "$QUEUE_JSON" | jq -r '.reason // ""')
   ```

   - If `OFFLINE=true` or the util fails (auth expired, no `gh`/`glab`, network): fall back to local `BUMP_LEVEL` arithmetic (bump `BASE_VERSION` at the chosen level). Print `⚠ workspace-aware ship offline — using local bump only`. Continue.
   - If `CLAIMED_COUNT > 0`: render the queue table to the user so they can see landing order at a glance:

     ```
     Queue on <base> (vBASE_VERSION):
       #<pr> <branch> → v<version> [⚠ collision with #<other>]
     Active sibling workspaces (WIP, not yet PR'd):
       <path> → v<version> (committed Nh ago)
     Your branch will claim: vNEW_VERSION (<reason>)
     ```

   - If `ACTIVE_SIBLING_COUNT > 0` and any active sibling's VERSION is `>= NEW_VERSION`, use GitHub follow-up comment: "Sibling workspace has v committed h ago but hasn't PR'd yet. Wait for them to ship first, or advance past? A) Advance past (recommended for unrelated work), B) Abort /ship and sync up with sibling first."
   - Validate `NEW_VERSION` matches `MAJOR.MINOR.PATCH.MICRO`. If util returns an empty or malformed version, fall back to local bump.
4. Validate `NEW_VERSION` and write it to both `VERSION` and `package.json`. This block runs only when `STATE: FRESH`.
```bash
if ! printf '%s' "$NEW_VERSION" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'; then
  echo "ERROR: NEW_VERSION ($NEW_VERSION) does not match MAJOR.MINOR.PATCH.MICRO pattern. Aborting."
  exit 1
fi
echo "$NEW_VERSION" > VERSION
if [ -f package.json ]; then
  if command -v node >/dev/null 2>&1; then
    node -e 'const fs=require("fs"),p=require("./package.json");p.version=process.argv[1];fs.writeFileSync("package.json",JSON.stringify(p,null,2)+"\n")' "$NEW_VERSION" || {
      echo "ERROR: failed to update package.json. VERSION was written but package.json is now stale. Fix and re-run — the new idempotency check will detect the drift."
      exit 1
    }
  elif command -v bun >/dev/null 2>&1; then
    bun -e 'const fs=require("fs"),p=require("./package.json");p.version=process.argv[1];fs.writeFileSync("package.json",JSON.stringify(p,null,2)+"\n")' "$NEW_VERSION" || {
      echo "ERROR: failed to update package.json. VERSION was written but package.json is now stale."
      exit 1
    }
  else
    echo "ERROR: package.json exists but neither node nor bun is available."
    exit 1
  fi
fi
```

DRIFT_STALE_PKG repair path: runs when idempotency reports `STATE: DRIFT_STALE_PKG`. No re-bump; sync `package.json.version` to the current `VERSION` and continue. Reuse `CURRENT_VERSION` for CHANGELOG and PR body.
```bash
REPAIR_VERSION=$(cat VERSION | tr -d '\r\n[:space:]')
if ! printf '%s' "$REPAIR_VERSION" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'; then
  echo "ERROR: VERSION file contents ($REPAIR_VERSION) do not match MAJOR.MINOR.PATCH.MICRO pattern. Refusing to propagate invalid semver into package.json. Fix VERSION manually, then re-run /ship."
  exit 1
fi
if command -v node >/dev/null 2>&1; then
  node -e 'const fs=require("fs"),p=require("./package.json");p.version=process.argv[1];fs.writeFileSync("package.json",JSON.stringify(p,null,2)+"\n")' "$REPAIR_VERSION" || {
    echo "ERROR: drift repair failed — could not update package.json."
    exit 1
  }
else
  bun -e 'const fs=require("fs"),p=require("./package.json");p.version=process.argv[1];fs.writeFileSync("package.json",JSON.stringify(p,null,2)+"\n")' "$REPAIR_VERSION" || {
    echo "ERROR: drift repair failed."
    exit 1
  }
fi
echo "Drift repaired: package.json synced to $REPAIR_VERSION. No version bump performed."
```

Generate CHANGELOG entries from the diff and commit history. Use the repository's existing CHANGELOG format if one exists.
Cross-reference the project's TODOS.md against the changes being shipped. Mark completed items automatically; prompt only if the file is missing or disorganized.
Read .claude/skills/review/TODOS-format.md for the canonical format reference.
1. Check if TODOS.md exists in the repository root.
If TODOS.md does not exist: Use GitHub follow-up comment:
- Message: "GStack recommends maintaining a TODOS.md organized by skill/component, then priority (P0 at top through P4, then Completed at bottom). See TODOS-format.md for the full format. Would you like to create one?"
- Options: A) Create it now, B) Skip for now
- If A: Create `TODOS.md` with a skeleton (# TODOS heading + ## Completed section). Continue to step 3.
- If B: Skip the rest of Step 14. Continue to Step 15.
2. Check structure and organization:
Read TODOS.md and verify it follows the recommended structure:
- Items grouped under `## <Skill/Component>` headings
- Each item has `**Priority:**` field with P0-P4 value
- A `## Completed` section at the bottom
If disorganized (missing priority fields, no component groupings, no Completed section): Use GitHub follow-up comment:
- Message: "TODOS.md doesn't follow the recommended structure (skill/component groupings, P0-P4 priority, Completed section). Would you like to reorganize it?"
- Options: A) Reorganize now (recommended), B) Leave as-is
- If A: Reorganize in-place following TODOS-format.md. Preserve all content — only restructure, never delete items.
- If B: Continue to step 3 without restructuring.
3. Detect completed TODOs:
This step is fully automatic — no user interaction.
Use the diff and commit history already gathered in earlier steps:
- git diff <base>...HEAD (full diff against the base branch)
- git log <base>..HEAD --oneline (all commits being shipped)
For each TODO item, check if the changes in this PR complete it by:
- Matching commit messages against the TODO title and description
- Checking if files referenced in the TODO appear in the diff
- Checking if the TODO's described work matches the functional changes
Be conservative: Only mark a TODO as completed if there is clear evidence in the diff. If uncertain, leave it alone.
4. Move completed items to the ## Completed section at the bottom. Append: **Completed:** vX.Y.Z (YYYY-MM-DD)
5. Output summary:
- TODOS.md: N items marked complete (item1, item2, ...). M items remaining.
- Or: TODOS.md: No completed items detected. M items remaining.
- Or: TODOS.md: Created. / TODOS.md: Reorganized.
6. Defensive: If TODOS.md cannot be written (permission error, disk full), warn the user and continue. Never stop the ship workflow for a TODOS failure.
Save this summary — it goes into the PR body in Step 19.
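The structure check in step 2 and the file-overlap signal in step 3 can be approximated with plain grep and shell pattern matching. The following is a minimal sketch, not part of gstack: `todos_is_organized` and `todo_files_in_diff` are hypothetical helper names, and the structure check is deliberately coarse (it confirms the headings and at least one priority field exist, not that every item is well formed).

```shell
# Hypothetical helper: succeed only if the TODOS.md text on stdin has a
# "## Completed" heading, at least one other "## <Component>" heading, and
# at least one "**Priority:** P0".."P4" field. This is a coarse check only.
todos_is_organized() {
  todos=$(cat)
  printf '%s\n' "$todos" | grep -q '^## Completed' || return 1
  printf '%s\n' "$todos" | grep -v '^## Completed' | grep -q '^## ' || return 1
  printf '%s\n' "$todos" | grep -Eq '\*\*Priority:\*\* *P[0-4]' || return 1
}

# Hypothetical helper: print each changed file (arg 2, newline-separated)
# whose path is mentioned verbatim in the TODO text (arg 1).
todo_files_in_diff() {
  todo_text=$1
  printf '%s\n' "$2" | while IFS= read -r path; do
    [ -n "$path" ] || continue
    case "$todo_text" in
      *"$path"*) printf '%s\n' "$path" ;;
    esac
  done
}
```

A file match alone is weak evidence. Under the conservative rule above, it should only count toward completion alongside a matching commit message or a clear functional change.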
If CHECKPOINT_MODE is "continuous", the branch likely contains WIP: commits
from auto-checkpointing. These must be squashed INTO the corresponding logical
commits before the bisectable-grouping logic in Step 15.1 runs. Non-WIP commits
on the branch (earlier landed work) must be preserved.
Detection:
WIP_COUNT=$(git log <base>..HEAD --oneline --grep="^WIP:" 2>/dev/null | wc -l | tr -d ' ')
echo "WIP_COMMITS: $WIP_COUNT"
If WIP_COUNT is 0: skip this sub-step entirely.
If WIP_COUNT > 0, collect the WIP context first so it survives the squash:
# Export [gstack-context] blocks from all WIP commits on this branch.
# This file becomes input to the CHANGELOG entry and may inform PR body context.
# Create the same directory the export is written to.
STATE_DIR="$(git rev-parse --show-toplevel)/.github-gstack-intelligence/state/local"
mkdir -p "$STATE_DIR"
git log <base>..HEAD --grep="^WIP:" --format="%H%n%B%n---END---" > \
"$STATE_DIR/wip-context-before-squash.md" 2>/dev/null || true
Non-destructive squash strategy:
git reset --soft <merge-base> WOULD uncommit everything including non-WIP commits.
DO NOT DO THAT. Instead, use git rebase scoped to filter WIP commits only.
Option 1 (preferred, if there are non-WIP commits mixed in):
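# Interactive rebase, driven non-interactively, with automated WIP squashing.
# GIT_SEQUENCE_EDITOR rewrites the todo list so every WIP commit after the first
# entry becomes 'fixup' (drop its message, fold changes into the prior commit).
# A WIP commit in the first position has nothing to fold into and stays a 'pick'.
GIT_SEQUENCE_EDITOR="sed -i.bak -E '2,\$ s/^pick ([0-9a-f]+) (# )?WIP:/fixup \1 \2WIP:/'" \
git rebase -i "$(git merge-base HEAD origin/<base>)" \
-X ours 2>/dev/null || {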
echo "Rebase conflict. Aborting: git rebase --abort"
git rebase --abort
echo "STATUS: BLOCKED — manual WIP squash required"
exit 1
}
Option 2 (simpler, if the branch is ALL WIP commits so far, with no landed work):
# Branch contains only WIP commits. Reset-soft is safe here because there's
# nothing non-WIP to preserve. Verify first.
NON_WIP=$(git log <base>..HEAD --oneline --invert-grep --grep="^WIP:" 2>/dev/null | wc -l | tr -d ' ')
if [ "$NON_WIP" -eq 0 ]; then
git reset --soft $(git merge-base HEAD origin/<base>)
echo "WIP-only branch, reset-soft to merge base. Step 15.1 will create clean commits."
fi
Decide at runtime which option applies. If unsure, prefer stopping and asking the user via GitHub follow-up comment rather than destroying non-WIP commits.
Anti-footgun rules:
- NEVER run a blind git reset --soft if there are non-WIP commits. Codex flagged this as destructive: it would uncommit real landed work and turn the push step into a non-fast-forward push for anyone who already pushed.
- Only proceed to Step 15.1 after WIP commits are successfully squashed/absorbed or the branch has been verified to contain only WIP work.
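The runtime decision between skipping, Option 1, and Option 2 reduces to the two commit counts computed above. A minimal sketch of that decision, assuming the counts are already available as integers; `choose_squash_strategy` and its output labels are hypothetical names used only for illustration.

```shell
# Hypothetical helper: pick a squash strategy from the two commit counts.
#   $1 = number of WIP commits on the branch
#   $2 = number of non-WIP commits on the branch
choose_squash_strategy() {
  wip_count=$1
  non_wip_count=$2
  if [ "$wip_count" -eq 0 ]; then
    echo "skip"          # nothing to squash
  elif [ "$non_wip_count" -eq 0 ]; then
    echo "reset-soft"    # Option 2: WIP-only branch, reset is safe
  else
    echo "rebase-fixup"  # Option 1: non-WIP commits must be preserved
  fi
}
```

A caller would pass the values gathered earlier, for example `choose_squash_strategy "$WIP_COUNT" "$NON_WIP"`, and stop for a follow-up comment if either count could not be determined.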
Goal: Create small, logical commits that work well with git bisect and help LLMs understand what changed.
1. Analyze the diff and group changes into logical commits. Each commit should represent one coherent change: not one file, but one logical unit.
2. Commit ordering (earlier commits first):
- Infrastructure: migrations, config changes, route additions
- Models & services: new models, services, concerns (with their tests)
- Controllers & views: controllers, views, JS/React components (with their tests)
- VERSION + CHANGELOG + TODOS.md: always in the final commit
3. Rules for splitting:
- A model and its test file go in the same commit
- A service and its test file go in the same commit
- A controller, its views, and its test go in the same commit
- Migrations are their own commit (or grouped with the model they support)
- Config/route changes can group with the feature they enable
- If the total diff is small (< 50 lines across < 4 files), a single commit is fine
4. Each commit must be independently valid: no broken imports, no references to code that doesn't exist yet. Order commits so dependencies come first.
5. Compose each commit message:
- First line: <type>: <summary> (type = feat/fix/chore/refactor/docs)
- Body: brief description of what this commit contains
- Only the final commit (VERSION + CHANGELOG) gets the version tag and co-author trailer:
git commit -m "$(cat <<'EOF'
chore: bump version and changelog (vX.Y.Z.W)

Co-authored-by: github-gstack-intelligence[bot] <github-gstack-intelligence[bot]@users.noreply.github.com>
EOF
)"
IRON LAW: NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE.
Before pushing, re-verify if code changed during Steps 4-6:
1. Test verification: If ANY code changed after Step 5's test run (for example, fixes from review findings; CHANGELOG edits don't count), re-run the test suite. Paste fresh output. Stale output from Step 5 is NOT acceptable.
2. Build verification: If the project has a build step, run it. Paste output.
3. Rationalization prevention:
- "Should work now" → RUN IT.
- "I'm confident" → Confidence is not evidence.
- "I already tested earlier" → Code changed since then. Test again.
- "It's a trivial change" → Trivial changes break production.
If tests fail here: STOP. Do not push. Fix the issue and return to Step 5.
Claiming work is complete without verification is dishonesty, not efficiency.
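Whether a re-run is needed can be decided mechanically from the list of paths changed since the Step 5 test run. A minimal sketch, assuming the commit that was tested has been recorded in a variable (here the hypothetical `TESTED_SHA`) and that the changelog lives at `CHANGELOG.md` in the repository root; `code_changed_since_tests` is an illustrative name, not an existing helper.

```shell
# Hypothetical helper: read changed paths on stdin (one per line) and succeed
# if anything other than CHANGELOG.md changed, meaning tests must be re-run.
code_changed_since_tests() {
  grep -vxF 'CHANGELOG.md' | grep -q .
}

# Usage sketch (TESTED_SHA is an assumed variable holding HEAD at test time).
# `git diff <commit>` compares that commit with the working tree, so edits to
# tracked files that are not yet committed are included:
#   if git diff --name-only "$TESTED_SHA" | code_changed_since_tests; then
#     echo "RETEST_NEEDED"
#   fi
```

The check errs toward re-running: any path other than the changelog, including documentation, triggers a fresh test run.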
Idempotency check: Check if the branch is already pushed and up to date.
git fetch origin <branch-name> 2>/dev/null
LOCAL=$(git rev-parse HEAD)
REMOTE=$(git rev-parse origin/<branch-name> 2>/dev/null || echo "none")
echo "LOCAL: $LOCAL REMOTE: $REMOTE"
[ "$LOCAL" = "$REMOTE" ] && echo "ALREADY_PUSHED" || echo "PUSH_NEEDED"
If ALREADY_PUSHED, skip the push but continue to Step 18. Otherwise push with upstream tracking:
git push -u origin <branch-name>
You are NOT done. The code is pushed but documentation sync and PR creation are mandatory final steps. Continue to Step 18.
Dispatch /document-release as a subagent using the Agent tool with subagent_type: "general-purpose". The subagent gets a fresh context window — zero rot from the preceding 17 steps. It also runs the full /document-release workflow (with CHANGELOG clobber protection, doc exclusions, risky-change gates, named staging, race-safe PR body editing) rather than a weaker reimplementation.
Sequencing: This step runs AFTER Step 17 (Push) and BEFORE Step 19 (Create PR). The PR is created once from final HEAD with the ## Documentation section baked into the initial body. No create-then-re-edit dance.
Subagent prompt:
You are executing the /document-release workflow after a code push. Read the full skill file ${HOME}/.claude/skills/gstack/document-release/SKILL.md and execute its complete workflow end-to-end, including CHANGELOG clobber protection, doc exclusions, risky-change gates, and named staging. Do NOT attempt to edit the PR body: no PR exists yet. Branch: <branch>, base: <base>.
After completing the workflow, output a single JSON object on the LAST LINE of your response (no other text after it):
{"files_updated":["README.md","CLAUDE.md",...],"commit_sha":"abc1234","pushed":true,"documentation_section":"<markdown block for PR body's ## Documentation section>"}
If no documentation files needed updating, output:
{"files_updated":[],"commit_sha":null,"pushed":false,"documentation_section":null}
Parent processing:
- Parse the LAST line of the subagent's output as JSON.
- Store documentation_section. Step 19 embeds it in the PR body (or omits the section if null).
- If files_updated is non-empty, print: Documentation synced: {files_updated.length} files updated, committed as {commit_sha}.
- If files_updated is empty, print: Documentation is current: no updates needed.
If the subagent fails or returns invalid JSON: Print a warning and proceed to Step 19 without a ## Documentation section. Do not block /ship on subagent failure. The user can run /document-release manually after the PR lands.
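The parent-side parsing and its failure fallback can be sketched in a few lines. This is an illustration under assumptions, not gstack's implementation: it assumes the subagent's full output is available on stdin and that `jq` is installed (the workflow already uses it for GitLab). The helper names `last_nonempty_line` and `documentation_section` are hypothetical. Skipping trailing blank lines is an extra tolerance added here, not something the contract above requires.

```shell
# Hypothetical helper: print the last non-empty line of the text on stdin
# (the subagent's JSON object, if it followed the contract).
last_nonempty_line() {
  awk 'NF { line = $0 } END { if (line != "") print line }'
}

# Hypothetical helper: print the documentation_section value, or nothing when
# the line is not valid JSON, the field is null, or jq is unavailable.
# It never returns a failure status, so /ship is not blocked by a bad reply.
documentation_section() {
  last_nonempty_line | jq -r '.documentation_section // empty' 2>/dev/null || true
}
```

An empty result covers both "no docs updated" and "subagent failed". In either case Step 19 simply omits the ## Documentation section.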
Idempotency check: Check if a PR/MR already exists for this branch.
If GitHub:
gh pr view --json url,number,state -q 'if .state == "OPEN" then "PR #\(.number): \(.url)" else "NO_PR" end' 2>/dev/null || echo "NO_PR"
If GitLab:
glab mr view -F json 2>/dev/null | jq -r 'if .state == "opened" then "MR_EXISTS" else "NO_MR" end' 2>/dev/null || echo "NO_MR"
If an open PR/MR already exists: update the PR body using gh pr edit --body "..." (GitHub) or glab mr update -d "..." (GitLab). Always regenerate the PR body from scratch using this run's fresh results (test output, coverage audit, review findings, adversarial review, TODOS summary, documentation_section from Step 18). Never reuse stale PR body content from a prior run.
Always update the PR title to start with v$NEW_VERSION. PR titles use the workspace-aware format v<NEW_VERSION> <type>: <summary> — version ALWAYS first, no exceptions, no "custom title kept intentionally" escape hatch. The shared helper bin/gstack-pr-title-rewrite.sh is the single source of truth for the rule.
- Read the current title: CURRENT=$(gh pr view --json title -q .title) (or glab mr view -F json | jq -r .title).
- Compute the corrected title: NEW_TITLE=$(.github-gstack-intelligence/skills/bin/gstack-pr-title-rewrite.sh "$NEW_VERSION" "$CURRENT"). The helper handles three cases: title already correct (no-op), title has a different v<X.Y.Z.W> prefix (replace it), or title has no version prefix (prepend one).
- If NEW_TITLE differs from CURRENT, run gh pr edit --title "$NEW_TITLE" (or glab mr update -t "$NEW_TITLE").
- Self-check: re-fetch the title and assert it starts with v$NEW_VERSION. If it does not, retry the edit once. If still wrong, surface the failure to the user.
This keeps the title truthful when Step 12's queue-drift detection rebumps a stale version, and forces the format on PRs that were created without it.
Print the existing URL and continue to Step 20.
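The three title cases described above (already correct, different version prefix, no prefix) can be illustrated with a small function. This is a hypothetical sketch of the rule, not the contents of gstack-pr-title-rewrite.sh, whose implementation is not shown here; it assumes the prefix is always a four-part v<X.Y.Z.W> version followed by whitespace.

```shell
# Hypothetical sketch of the title rule: strip any leading four-part version
# prefix, then prepend the new one. A title that is already correct comes
# back unchanged.
#   $1 = new version without the leading "v", $2 = current PR/MR title
rewrite_title() {
  new_version=$1
  current=$2
  rest=$(printf '%s' "$current" | sed -E 's/^v[0-9]+(\.[0-9]+){3}[[:space:]]+//')
  printf 'v%s %s\n' "$new_version" "$rest"
}
```

For example, `rewrite_title 1.2.3.5 "v1.2.3.4 feat: add retries"` yields `v1.2.3.5 feat: add retries`, and a title with no prefix simply gains one.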
If no PR/MR exists: create a pull request (GitHub) or merge request (GitLab) using the platform detected in Step 0.
The PR/MR body should contain these sections:
## Summary
<Summarize ALL changes being shipped. Run `git log <base>..HEAD --oneline` to enumerate
every commit. Exclude the VERSION/CHANGELOG metadata commit (that's this PR's bookkeeping,
not a substantive change). Group the remaining commits into logical sections (e.g.,
"**Performance**", "**Dead Code Removal**", "**Infrastructure**"). Every substantive commit
must appear in at least one section. If a commit's work isn't reflected in the summary,
you missed it.>
## Test Coverage
<coverage diagram from Step 7, or "All new code paths have test coverage.">
<If Step 7 ran: "Tests: {before} → {after} (+{delta} new)">
## Pre-Landing Review
<findings from Step 9 code review, or "No issues found.">
## Design Review
<If design review ran: "Design Review (lite): N findings — M auto-fixed, K skipped. AI Slop: clean/N issues.">
<If no frontend files changed: "No frontend files changed — design review skipped.">
## Eval Results
<If evals ran: suite names, pass/fail counts, cost dashboard summary. If skipped: "No prompt-related files changed — evals skipped.">
## Greptile Review
<If Greptile comments were found: bullet list with [FIXED] / [FALSE POSITIVE] / [ALREADY FIXED] tag + one-line summary per comment>
<If no Greptile comments found: "No Greptile comments.">
<If no PR existed during Step 10: omit this section entirely>
## Scope Drift
<If scope drift ran: "Scope Check: CLEAN" or list of drift/creep findings>
<If no scope drift: omit this section>
## Plan Completion
<If plan file found: completion checklist summary from Step 8>
<If no plan file: "No plan file detected.">
<If plan items deferred: list deferred items>
## Verification Results
<If verification ran: summary from Step 8.1 (N PASS, M FAIL, K SKIPPED)>
<If skipped: reason (no plan, no server, no verification section)>
<If not applicable: omit this section>
## TODOS
<If items marked complete: bullet list of completed items with version>
<If no items completed: "No TODO items completed in this PR.">
<If TODOS.md created or reorganized: note that>
<If TODOS.md doesn't exist and user skipped: omit this section>
## Documentation
<Embed the `documentation_section` string returned by Step 18's subagent here, verbatim.>
<If Step 18 returned `documentation_section: null` (no docs updated), omit this section entirely.>
## Test plan
- [x] All Rails tests pass (N runs, 0 failures)
- [x] All Vitest tests pass (N tests)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
If GitHub:
# PR title MUST start with v$NEW_VERSION — enforced on every run, no exceptions.
# (See Step 19 idempotency block + bin/gstack-pr-title-rewrite.sh for the rule.)
gh pr create --base <base> --title "v$NEW_VERSION <type>: <summary>" --body "$(cat <<'EOF'
<PR body from above>
EOF
)"
If GitLab:
# MR title MUST start with v$NEW_VERSION — enforced on every run, no exceptions.
# (See Step 19 idempotency block + bin/gstack-pr-title-rewrite.sh for the rule.)
glab mr create -b <base> -t "v$NEW_VERSION <type>: <summary>" -d "$(cat <<'EOF'
<MR body from above>
EOF
)"
If neither CLI is available: Print the branch name and remote URL, and instruct the user to create the PR/MR manually via the web UI. Do not stop: the code is pushed and ready.
Output the PR/MR URL — then proceed to Step 20.
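The "neither CLI is available" fallback can be sketched as a check with `command -v` followed by a printed hint. This is an illustrative sketch only: `detect_pr_cli` and `print_manual_pr_hint` are hypothetical names, and preferring gh over glab when both are installed is an assumption here. The workflow itself selects the platform in Step 0.

```shell
# Hypothetical helper: report which PR/MR CLI is installed.
# Preferring gh when both exist is an assumption of this sketch.
detect_pr_cli() {
  if command -v gh >/dev/null 2>&1; then
    echo "gh"
  elif command -v glab >/dev/null 2>&1; then
    echo "glab"
  else
    echo "none"
  fi
}

# Hypothetical helper: print what the user needs to open the PR/MR by hand.
#   $1 = branch name, $2 = remote URL
print_manual_pr_hint() {
  printf 'Branch %s is pushed to %s. Create the PR/MR manually in the web UI.\n' "$1" "$2"
}
```

A caller could pass the output of `git remote get-url origin` as the second argument when `detect_pr_cli` prints `none`.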
Log coverage and plan completion data so /retro can track trends:
SLUG="$(basename "$(git rev-parse --show-toplevel 2>/dev/null || pwd)")" && mkdir -p ".github-gstack-intelligence/state/results/$SLUG"
BRANCH="$(git rev-parse --abbrev-ref HEAD | tr '/' '-')"  # "/" is not safe in a file name
Append to .github-gstack-intelligence/state/results/$SLUG/$BRANCH-reviews.jsonl:
echo '{"skill":"ship","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","coverage_pct":COVERAGE_PCT,"plan_items_total":PLAN_TOTAL,"plan_items_done":PLAN_DONE,"verification_result":"VERIFY_RESULT","version":"VERSION","branch":"BRANCH"}' >> ".github-gstack-intelligence/state/results/$SLUG/$BRANCH-reviews.jsonl"
Substitute from earlier steps:
- COVERAGE_PCT: coverage percentage from Step 7 diagram (integer, or -1 if undetermined)
- PLAN_TOTAL: total plan items extracted in Step 8 (0 if no plan file)
- PLAN_DONE: count of DONE + CHANGED items from Step 8 (0 if no plan file)
- VERIFY_RESULT: "pass", "fail", or "skipped" from Step 8.1
- VERSION: from the VERSION file
- BRANCH: current branch name
This step is automatic — never skip it, never ask for confirmation.
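Hand-substituting six placeholders into a quoted JSON string is easy to get wrong, so the record can instead be built from arguments with printf. A minimal sketch, assuming the values contain no characters that need JSON escaping (quotes, backslashes); `ship_metrics_line` is a hypothetical name, and the timestamp is passed in so the output is reproducible.

```shell
# Hypothetical helper: print one JSONL record for the ship run.
# Args: timestamp coverage_pct plan_total plan_done verify_result version branch
# Assumes no argument contains characters that require JSON escaping.
ship_metrics_line() {
  printf '{"skill":"ship","timestamp":"%s","coverage_pct":%d,"plan_items_total":%d,"plan_items_done":%d,"verification_result":"%s","version":"%s","branch":"%s"}\n' \
    "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}
```

The numeric fields use `%d`, so an undetermined coverage value of -1 is written as a bare number rather than a string.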
- Never skip tests. If tests fail, stop.
- Never skip the pre-landing review. If checklist.md is unreadable, stop.
- Never force push. Use regular git push only.
- Never ask for trivial confirmations (e.g., "ready to push?", "create PR?"). DO stop for: version bumps (MINOR/MAJOR), pre-landing review findings (ASK items), and Codex structured review [P1] findings (large diffs only).
- Always use the 4-digit version format from the VERSION file.
- Date format in CHANGELOG: YYYY-MM-DD
- Split commits for bisectability: each commit = one logical change.
- TODOS.md completion detection must be conservative. Only mark items as completed when the diff clearly shows the work is done.
- Use Greptile reply templates from greptile-triage.md. Every reply includes evidence (inline diff, code references, re-rank suggestion). Never post vague replies.
- Never push without fresh verification evidence. If code changed after Step 5 tests, re-run before pushing.
- Step 7 generates coverage tests. They must pass before committing. Never commit failing tests.
- The goal is: user says /ship, next thing they see is the review + PR URL + auto-synced docs.
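Two of the rules above, the 4-digit version format and the YYYY-MM-DD changelog date, are mechanical enough to check before committing. A minimal sketch with hypothetical helper names; both functions validate shape only (digit groups and separators), not whether the date is a real calendar day.

```shell
# Hypothetical helper: succeed if $1 looks like a four-part version (W.X.Y.Z).
is_four_part_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+){3}$'
}

# Hypothetical helper: succeed if $1 has the YYYY-MM-DD shape.
is_changelog_date() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
}
```

For instance, `is_four_part_version "$(cat VERSION)"` would fail on a three-part value such as 1.2.3, which signals that the VERSION file does not follow the expected format.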