Unify gapp setup and gapp ci setup under a single CLI surface
Problem
The bootstrap flow for a new project today requires two local commands run in a specific order, enforced only by an error message:
gapp init # local — write gapp.yaml
gapp setup --project PROJECT_ID # local, as Owner — APIs, bucket, label
gapp ci setup # local, as Owner — WIF, deploy SA, workflow
If a user runs gapp ci setup before gapp setup, setup_ci() fails at the project-resolution step (gapp/admin/sdk/ci.py:614-617) with the error:
No GCP project found for '': ...
Run 'gapp setup --project ' first.
That error message is the only mechanism enforcing the ordering. The two commands have no code-level coupling — gapp ci setup does not invoke setup() internally.
What this design gets right
- Separation of concerns is clean: gapp setup provisions the GCP foundation (APIs, bucket, labels, build SA perms). gapp ci setup provisions the CI bootstrap (WIF pool/provider, deploy SA, IAM bindings, GitHub workflow file). These are genuinely different concerns.
- Idempotency works in both directions. gapp setup re-runs harmlessly on every CI deploy. gapp ci setup can be re-run to add a binding or refresh the workflow file.
- CI's gapp setup runs as the deploy SA with intentionally narrow IAM. enable_api() silently no-ops on PERMISSION_DENIED because the deploy SA doesn't have serviceusage.serviceUsageAdmin (broad and dangerous). The local-Owner-first pattern is the deliberate way to keep CI's blast radius small while still letting the framework own API enablement.
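The PERMISSION_DENIED behavior in the last bullet can be sketched as follows. This is a minimal illustration, not the actual enable_api() from gapp: the signature, the injected run parameter, and the assumption that the function shells out to gcloud services enable are all hypothetical.

```python
import subprocess
from typing import Callable

# Hypothetical alias: anything with the same call shape as subprocess.run.
Runner = Callable[..., subprocess.CompletedProcess]


def enable_api(project_id: str, api: str, run: Runner = subprocess.run) -> bool:
    """Try to enable one API. Return True if enabled, False if skipped.

    Assumption: the real function shells out to gcloud. Only the
    PERMISSION_DENIED handling described above is modelled here.
    """
    cmd = ["gcloud", "services", "enable", api, "--project", project_id]
    result = run(cmd, capture_output=True, text=True)
    stderr = result.stderr or ""
    if result.returncode == 0:
        return True
    if "PERMISSION_DENIED" in stderr:
        # The narrow-IAM deploy SA lands here in CI: skip quietly and rely
        # on an earlier local Owner run having enabled the API.
        return False
    # Any other failure is unexpected and stays loud.
    raise RuntimeError(f"enabling {api} failed: {stderr.strip()}")
```

The point of the pattern is that the same code path serves both callers: an Owner run enables the API, while a CI run treats the denied call as a no-op instead of failing the deploy.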
Where it gets uncomfortable
The "labels must exist before ci setup runs" dependency is documented in the error message. But there are hidden dependencies the message doesn't surface:
- API enablement: gapp ci setup calls _get_project_number() which runs gcloud projects describe, which requires cloudresourcemanager.googleapis.com to be enabled on the target project. This API is only enabled when gapp setup runs (it's in the foundation API list). If gapp ci setup is run on a project where the API hasn't been enabled, the failure is opaque (gcloud returns a non-zero exit, and the wrapping subprocess exception doesn't preserve stderr at the call site).
- Bucket for terraform state: gapp setup creates the GCS bucket that gapp deploy later uses for terraform state. gapp ci setup doesn't need it itself, but the subsequent CI deploy will fail without it.
- Build SA permissions: ensure_build_permissions() runs during gapp setup to grant the Cloud Build SA the perms it needs. CI deploys need those grants in place.
So the implicit contract is bigger than "labels first" — it's really "all the foundation gapp setup provides, first." The user has no way to discover this from the error message alone; they just have to know to run gapp setup first.
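To make the opacity in the API-enablement case concrete, here is a rough sketch of what keeping stderr at the call site could look like. It is an illustration only: GcloudError, the run parameter, and this shape of get_project_number() are hypothetical and are not taken from gapp/admin/sdk/ci.py.

```python
import subprocess


class GcloudError(RuntimeError):
    """Hypothetical error type that keeps gcloud's stderr in its message."""


def get_project_number(project_id: str, run=subprocess.run) -> str:
    """Return the project number, surfacing gcloud's stderr on failure."""
    cmd = [
        "gcloud", "projects", "describe", project_id,
        "--format=value(projectNumber)",
    ]
    try:
        result = run(cmd, capture_output=True, text=True, check=True)
    except subprocess.CalledProcessError as exc:
        stderr = (exc.stderr or "").strip()
        raise GcloudError(
            f"'gcloud projects describe {project_id}' failed "
            f"(exit {exc.returncode}): {stderr or 'no stderr captured'}. "
            "cloudresourcemanager.googleapis.com may not be enabled; "
            "it is part of the foundation that gapp setup provisions."
        ) from exc
    return result.stdout.strip()
```

Even with a clearer message like this, the user still has to discover the dependency by hitting the failure, which is the cost the rest of this proposal is about.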
Proposed direction
Collapse the two commands into a single CLI surface, with CI provisioning as an opt-in scope:
gapp setup --project PROJECT_ID # foundation only (current `gapp setup` behavior)
gapp setup --project PROJECT_ID --ci # foundation + CI bootstrap
The --ci path runs everything setup does today, then layers the CI-specific provisioning (WIF, deploy SA, IAM, workflow) on top. Internally the logic stays modular (functions can still be named setup_foundation / setup_ci and called in sequence), but the user-facing surface becomes one command with one clear ordering: there's no second command to forget, and no error message playing the role of contract enforcer.
This also addresses the hidden-dependency leak: any prerequisite the CI bootstrap needs (current or future) is automatically satisfied because foundation setup always runs first under --ci.
Why --ci as a flag rather than a positional or subcommand
- A subcommand (gapp setup ci) reads as a sub-operation of setup, which is fine, but it still encourages the mental model of "two steps." A flag reads as "same operation, broader scope."
- A positional arg conflates with --project semantics and is less discoverable.
- Other flags (--env, --force) already exist; --ci slots into the same pattern.
Open question: whether --ci should require explicit opt-in or default-on-when-detected (e.g., if a CI repo is configured via gapp ci init). Default-off is safer.
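As an illustration of the default-off option, the flag could be declared along these lines. The sketch assumes an argparse-based parser purely for the example; the CLI framework gapp actually uses is not stated here, and build_parser and parse are hypothetical names.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser for the unified command (illustrative only)."""
    parser = argparse.ArgumentParser(prog="gapp")
    commands = parser.add_subparsers(dest="command", required=True)

    setup = commands.add_parser("setup", help="provision the GCP foundation")
    setup.add_argument("--project", metavar="PROJECT_ID", default=None)
    # store_true gives the default-off behavior: CI provisioning only runs
    # when the caller asks for it explicitly.
    setup.add_argument(
        "--ci",
        action="store_true",
        help="also provision the CI bootstrap",
    )
    return parser


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)
```

A default-on-when-detected variant would replace the store_true default with a lookup of the configured CI repo, which is the part left open above.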
Tradeoffs
- Breaking CLI change. Any existing caller of gapp ci setup (scripts, docs, agent skills) needs to migrate to gapp setup --ci. The error message from the deprecated gapp ci setup command can guide users for one major-version cycle, then be removed.
- Documentation and skill updates. All references to gapp ci setup in docs, skill args: descriptions, agent-context files, and the README would need updating.
- Refactoring scope. setup_ci() in gapp/admin/sdk/ci.py calls into GappSDK for project resolution and re-uses naming conventions. The actual extraction into a setup(..., ci=False) form on GappSDK is straightforward: the function bodies already exist and can be composed.
- Not version-pegged. This proposal isn't tied to any specific gapp release. The right home depends on what other breaking changes are coalescing — could ship standalone in the next major, or batched with other CLI-surface cleanups when there's a forcing function.
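The migration path for existing callers could be handled by a thin forwarding shim. The following is a sketch only: ci_setup_alias, run_setup, and notify are hypothetical names, and how the real CLI dispatches commands is an assumption.

```python
import sys
from typing import Callable, List


def _print_to_stderr(message: str) -> None:
    print(message, file=sys.stderr)


def ci_setup_alias(
    argv: List[str],
    run_setup: Callable[[List[str]], int],
    notify: Callable[[str], None] = _print_to_stderr,
) -> int:
    """Handle 'gapp ci setup ARGS' by forwarding to 'gapp setup ARGS --ci'."""
    notify("warning: 'gapp ci setup' is deprecated; use 'gapp setup --ci' instead.")
    forwarded = list(argv)  # copy so the caller's list is left untouched
    if "--ci" not in forwarded:
        forwarded.append("--ci")
    return run_setup(forwarded)
```

Scripts that still call the old command keep working during the deprecation window, and each run reminds the caller of the replacement.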
Alternative considered: leave the split, improve the error message
Add the API enablement and bucket dependencies to the setup_ci() error message, so users hitting it learn the full scope of what gapp setup provides. This is a smaller change but doesn't address the underlying "two commands, implicit ordering" usability cost. A new user still has to read the error, do the right thing, then come back. The unified command removes the need to discover the dependency at all.
Work breakdown
- Add --ci flag to gapp setup CLI (gapp/admin/cli/main.py)
- Refactor setup_ci() in gapp/admin/sdk/ci.py so the steps after project resolution become callable as a discrete _provision_ci_resources(project_id, ...) function
- In GappSDK.setup(), after the existing foundation work, conditionally call _provision_ci_resources() when ci=True
- Keep gapp ci setup as a deprecated alias for one major-version cycle: print a deprecation notice and forward to gapp setup --ci. Remove in the major after that.
- Revisit the gapp/admin/sdk/ci.py:614-617 error message: the "run gapp setup --project X first" guidance is no longer needed under the unified command. Decide whether to keep it for the deprecation-alias path.
- Update README.md lifecycle examples
- Update CONTRIBUTING.md if it documents the bootstrap order
- Update the gapp:deploy skill (its SKILL.md describes the bootstrap flow and references gapp ci setup as a separate step)
- Update any other references to gapp ci setup
- Test the gapp setup --ci path against the existing _DEPLOY_SA_ROLES provisioning + WIF binding fixtures
- Test that the gapp ci setup alias still works (during the deprecation window) and forwards correctly