Skip to content

resolve_solution explicit-name path drops repo_path; deploy crashes #42

@krisrowe

Description

@krisrowe

Defect

gapp deploy --solution <name> from inside <name>'s checkout
crashes with:

File ".../gapp/admin/sdk/core.py", line 848, in deploy
    if paths := get_paths(load_manifest(repo_path)):
File ".../gapp/admin/sdk/manifest.py", line 24, in load_manifest
    manifest_path = repo_path / "gapp.yaml"
TypeError: unsupported operand type(s) for /: 'NoneType' and 'str'

Reproduces on v3.0.10 and current main HEAD when running:

cd <solution-repo-checkout>
gapp deploy --solution <name> --ref main

Workaround: drop --solution from the deploy invocation. With no
explicit name, resolve_solution() walks git root + gapp.yaml
and populates repo_path correctly. CI workflows that pass
--solution defensively (the previous .github/workflows/deploy.yml
in this repo did) hit the bug; bare gapp deploy (cwd inference)
does not.

Additional manifestation: gapp secrets get --solution <name>

The same root cause surfaces (with a friendlier error message
rather than a TypeError) when retrieving a secret by name from
outside the solution's checkout:

$ gapp secrets get signing-key --solution gworkspace-access
  Error: No repo path found for this solution.

get_secret()_find_secret() calls
GappSDK().resolve_solution_with_project(solution) — same
explicit-name short-circuit, repo_path=None, raises at
gapp/admin/sdk/secrets.py:437. From inside the solution repo
with no --solution, the cwd-inference path populates
repo_path and the command works.

The proposed fix below resolves both manifestations.

Root cause

Same pattern as #39 (secrets commands seeing project_id=None),
different missing field. GappSDK.resolve_solution() in
gapp/admin/sdk/core.py:249-276 short-circuits when an explicit
name is given:

def resolve_solution(self, name: str | None = None, strict: bool = True):
    if name:
        return {"name": name, "project_id": None, "repo_path": None}
    git_root = self._get_git_root()
    if git_root and (git_root / "gapp.yaml").is_file():
        ...
        return {"name": ..., "project_id": None, "repo_path": str(git_root)}
    return None

The explicit-name path returns repo_path=None unconditionally —
no filesystem discovery, even when the caller is inside a git
tree whose gapp.yaml has the matching solution name. The
inferred-name path DOES populate repo_path. So gapp deploy
works iff the user omits --solution, and breaks iff they pass
it. The CLI / workflow author can't tell the two cases apart
without reading the SDK source.

deploy() at line 848 then unconditionally calls
get_paths(load_manifest(repo_path)) without guarding for None,
yielding the TypeError instead of an actionable message.

Pre-v-3 context.resolve_full_context() populated repo_path
from the cwd whether or not an explicit name was given (it ran
the local-checkout discovery as a separate step that fed into
the returned dict). The v-3 GappSDK consolidation deleted
context.py and moved its responsibilities into
resolve_solution, but the "always try to attach a local
checkout" step was dropped.

This is the third instance of the same v-3 consolidation pattern
this week:

  • secrets commands always see project_id=None — never call resolve_project_for_solution() #39 (closed): secrets commands never saw project_id because
    resolve_solution returns project_id=None always
  • Pre-v-3 setup._enable_api had PERMISSION_DENIED tolerance for
    CI deploys; dropped in v-3 consolidation, restored in
    08b93d0e7f6664efa65977d836262ad5ca3c4e3d ("fix(gcp): tolerate
    PERMISSION_DENIED on services enable in CI deploys")
  • Pre-v-3 context.resolve_full_context() populated
    github_repo for setup_ci; dropped in v-3, restored in
    5ced53d88fdb6f9c09ca553555b02316a28d0979 ("fix(ci): read
    github_repo from local origin remote in setup_ci")

Plus this ticket. Suggests a focused v-3-consolidation-regression
sweep might be worth doing — see "Suggested follow-up" below.

Proposed fix

Populate repo_path on the explicit-name path when the caller is
inside a matching checkout. Mismatched checkout (operator in a
sibling solution's tree) → leave repo_path=None, consistent
with the historical "explicit name = no filesystem assumption"
contract.

def resolve_solution(self, name=None, strict=True):
    git_root = self._get_git_root()
    local_name = None
    local_path = None
    if git_root and (git_root / "gapp.yaml").is_file():
        manifest = load_manifest(git_root, strict=strict)
        local_name = get_solution_name(manifest, git_root)
        local_path = str(git_root)

    if name:
        repo_path = local_path if local_name == name else None
        return {"name": name, "project_id": None, "repo_path": repo_path}

    if local_name:
        return {"name": local_name, "project_id": None, "repo_path": local_path}
    return None

Plus a guard in deploy() at line 848 to raise an actionable
error instead of crashing on a TypeError when repo_path is
None (e.g., explicit name + no matching checkout):

if repo_path is None:
    raise RuntimeError(
        f"Cannot deploy '{solution_name}': no local checkout available. "
        f"Run `gapp deploy` from inside the solution repo's working tree, "
        f"or invoke it with --solution from inside that tree (its "
        f"gapp.yaml's `name` must match)."
    )

Work breakdown

  • Patch resolve_solution to populate repo_path from cwd
    when explicit name matches local gapp.yaml.
  • Add the None-guard in deploy() for the still-no-checkout
    case so the error is actionable.
  • Unit tests: explicit-name + matching cwd populates
    repo_path; explicit-name + mismatched cwd leaves None;
    explicit-name + no git tree leaves None; inferred-name
    behavior unchanged.
  • Update .github/workflows/deploy.yml to re-pass
    --solution in the gapp commands once this is fixed
    (currently dropped as a workaround — see workflow comment).
    Or leave the cwd-inference workflow as-is and document.
  • Sweep callers of resolve_solution for other
    "expects-populated-field-that-isn't" cases. Known so far:
    setup_ci's github_repo (already fixed), secrets'
    project_id (secrets commands always see project_id=None — never call resolve_project_for_solution() #39, closed), this one. Audit
    gapp/admin/sdk/ for sol_ctx.get("...") calls and
    verify each one's expected key is actually populated.

Suggested follow-up — v-3 consolidation regression sweep

Three regressions of the same shape have surfaced in this repo
over a few days (this issue + the two fix commits cited above).
The shape is: "v-3 GappSDK consolidation centralized SDK calls
into core.py and cloud/gcp.py but dropped graceful pre-v-3
behaviors that lived in the deleted modules (context.py,
setup.py)." Each is small to fix in isolation; collectively
they bite operators with confusing crashes that look unrelated.

Consider a dedicated audit: diff the deleted v-2 modules
against their v-3 counterparts and surface every pre-v-3 graceful
behavior that didn't make it across. Filing as a separate issue
or rolling into this one — operator preference.

Notes

This was found during a downstream deploy flow that exercises
the full CI path. The workflow workaround (drop --solution,
let cwd inference run) gets deploys working today without
needing this fix to land. The workflow comment cites this issue
so a future contributor will know why the explicit flag was
removed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions