Skip to content

fix: share AuthResolver across install to prevent duplicate auth popups#424

Open
frblondin wants to merge 6 commits intomicrosoft:mainfrom
frblondin:fix/shared-auth-resolver-prevent-duplicate-popups
Open

fix: share AuthResolver across install to prevent duplicate auth popups#424
frblondin wants to merge 6 commits intomicrosoft:mainfrom
frblondin:fix/shared-auth-resolver-prevent-duplicate-popups

Conversation

@frblondin
Copy link
Contributor

#394 which was meant to be a continuation of #389, but the second commit was missed from the pull request. This pull request is a reimplementation of this change.

Problem

When running apm install against a plugin stored in a private Enterprise GitHub organization, APM triggered one authentication popup per network operation. For a plugin that contains a single skill file and a single MCP server definition, the number of credential-helper invocations could reach three in the same command invocation:

  1. Repository validation - git ls-remote (or GitHub API) to confirm the plugin repository exists and is accessible.
  2. Plugin manifest download - fetching plugin.json from the repository.
  3. Skill file download - fetching the skill SKILL.md asset.

Each of those calls created its own AuthResolver instance with an empty cache, so the OS credential helper was invoked fresh every time - producing three separate authentication popups in rapid succession. Installing 10 similar plugins would show 30 GitHub authentication popups.

Minimal reproducible scenario - an organization plugin corp-org/acme-docs-plugin:

{
  "name": "acme-docs",
  "version": "1.0.0",
  "description": "Internal documentation assistant for Copilot.",
  "repository": "corp-org/acme-docs-plugin",
  "skills": ["./skills/acme-docs"],
  "mcpServers": {
    "com.corp-org/acme-docs": {
      "command": "npx",
      "args": ["--registry", "https://registry.npmjs.org", "@corp-org/acme-docs-assistant"]
    }
  }
}
---
name: acme-docs
description: Search and manage internal documentation via the acme-docs MCP server.
---
# Acme Docs Skill
Guidance for working with the internal documentation portal via MCP tools.

Running apm install corp-org/acme-docs-plugin against a GitHub Enterprise host producing:

[*] Validating 1 package...
[popup 1] Enter credentials for https://github.example.com (validating repo)
[>] Installing 1 new package..
[popup 2] Enter credentials for https://github.example.com (downloading plugin.json)
[popup 3] Enter credentials for https://github.example.com (downloading skills/acme-docs/SKILL.md)

Root cause

Two independent issues compounded each other:

  1. No resolver sharing - _validate_package_exists and _install_apm_dependencies each instantiated their own AuthResolver(), so no cached token was ever reused across calls within the same apm install run.

  2. Lock inversion in AuthResolver.resolve() - the cache lookup and the cache fill were in separate with self._lock blocks, so two concurrent threads resolving the same (host, org) key would both miss the cache and each trigger their own credential-helper lookup before either result was written back.

Fix

  • A single AuthResolver instance is created at the top of the install command and passed down through _validate_and_add_packages_to_apm_yml, _validate_package_exists, and _install_apm_dependencies. All three network operations now share the same cache.
  • AuthResolver.resolve() has been tightened to a single with self._lock block: the cache check, the credential resolution, and the cache write all happen inside one critical section, eliminating the race.

With the fix, the same apm install corp-org/acme-docs-plugin scenario produces exactly one popup; all subsequent calls hit the in-memory cache.

Tests

  • test_auth.py - verifies the lock consolidation and that a second concurrent call for the same key never triggers a second token resolution.
  • test_github_downloader.py - verifies that a shared AuthResolver passed to GitHubPackageDownloader has its cached token reused across multiple download calls.

Create a single AuthResolver at the top of the install command and
thread it through _validate_and_add_packages_to_apm_yml,
_validate_package_exists, and _install_apm_dependencies so that all
validation and download steps within one CLI invocation share the same
credential cache.

Also fixes a lock inversion in AuthResolver.resolve() that allowed
two concurrent callers for the same (host, org) key to each trigger
their own credential-helper lookup before either result was cached.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to prevent repeated OS credential-helper prompts during apm install by sharing a single AuthResolver instance across install phases and tightening AuthResolver.resolve() caching under concurrency.

Changes:

  • Pass a shared AuthResolver through install validation and dependency download flows to reuse the in-memory auth cache.
  • Make AuthResolver.resolve() populate/cache within a single lock to avoid concurrent duplicate credential resolutions.
  • Add unit tests covering concurrent singleflight caching and ensuring downloader construction does not eagerly invoke credential helper resolution.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/apm_cli/commands/install.py Wires a shared AuthResolver through validation and install paths.
src/apm_cli/core/auth.py Consolidates resolve cache check/fill under one lock for concurrency safety.
src/apm_cli/deps/github_downloader.py Avoids credential resolution at downloader construction time; keeps auth lazy/per-dependency.
tests/unit/test_auth.py Adds concurrency test asserting singleflight cache behavior.
tests/test_github_downloader.py Adds regression test asserting constructor does not call git credential helper.
dist.old/apm-windows-x86_64.sha256 Adds a checksum artifact file.

Copy link
Collaborator

@danielmeppiel danielmeppiel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expert Panel Review — Security, Architecture, UX

Thanks for the contribution @frblondin! The core direction is architecturally sound — threading a single AuthResolver through the install pipeline and consolidating the TOCTOU lock race are both correct fixes. However, the implementation has gaps that would leave users still seeing multiple auth popups and could break credential-helper-only users.

Reviewed by three specialized perspectives: Auth/Security, Python Architecture, and CLI UX.


What is good

  • Lock consolidation in resolve() is correct and essential. The old split-lock had a textbook TOCTOU race — N threads each miss the cache between two with self._lock blocks, each invoke git credential fill, causing N popups. The single-lock fix is bounded by the existing 60s subprocess timeout (APM_GIT_CREDENTIAL_TIMEOUT). No deadlock possible (single lock, no nesting).
  • Parameter threading pattern (auth_resolver=None with fallback) follows existing conventions in GitHubPackageDownloader.__init__.
  • Tests are well-targeted — concurrent singleflight test and no-eager-credential-helper test cover the key scenarios.
  • Import safetyauth.py depends only on stdlib + internal utils, so moving it outside APM_DEPS_AVAILABLE is safe.

Critical — must fix before merge

1. Incomplete cache sharing — 3 of 4 resolver creation sites are not connected

The PR threads the resolver through _validate_and_add_packages_to_apm_yml but misses three other sites. A user installing a private GHE package would still see 3 auth popups instead of the promised 1:

Site Location Shared?
_validate_package_exists L225 auth_resolver param Yes
_validate_package_exists L271 GitHubPackageDownloader()no auth_resolver= No
_install_apm_dependencies L1070 GitHubPackageDownloader()no auth_resolver= No
Verbose auth logging L1431 _auth_resolver = AuthResolver() No

Fix: Pass auth_resolver=auth_resolver to all GitHubPackageDownloader() calls. For L1431, reuse downloader.auth_resolver instead of creating a fresh instance.

2. Lazy init in GitHubPackageDownloader breaks credential-helper-only users

The constructor no longer calls self.auth_resolver.resolve(default_host()), which means self.github_token, self.has_github_token, and self.has_ado_token are now set from env vars only (token_manager.get_token_for_purpose). Users who rely solely on gh auth login (credential helper, no env vars) will have self.github_token = None and self.has_github_token = False even though credentials are available.

This affects at least:

  • _build_repo_url() L516 — self.github_token fallback silently produces unauthenticated URL
  • _clone_with_fallback() L595 — dep_token falls back to None
  • Error handling L667 — elif not self.has_github_token: may produce wrong error messages

Recommended fix: Keep _setup_git_environment() eager in __init__ but remove only the resolve() calls from it. The env-only token lookup is fine for setting initial state. Per-dependency auth resolution already happens in _clone_with_fallback() L590 via self.auth_resolver.resolve_for_dep(dep_ref). Alternatively, use @property lazy init so these attributes are populated on first access.

3. GHES/GHE _build_repo_url without per-dep auth (also flagged by Copilot inline review)

At L296, the non-github.com validation path builds a URL via GitHubPackageDownloader._build_repo_url(...) without resolving per-dependency auth first. Since the constructor no longer resolves credentials eagerly, package_url will be tokenless for GHES/GHE hosts, and git ls-remote will trigger an OS credential popup — reintroducing the exact problem this PR targets.

Fix: Resolve auth for this dep_ref via the shared auth_resolver before building the URL:

ctx = auth_resolver.resolve_for_dep(dep_ref)
# then pass token=ctx.token to _build_repo_url or use try_with_fallback

Medium — should fix in this PR

4. Remove dist.old/apm-windows-x86_64.sha256

This is an accidental build artifact. Commit 2 empties it but the file is still tracked. It should be fully removed (git rm) — nothing in the codebase references dist.old/, and dist/ is already in .gitignore.

5. Import placement — verify final state

Commit 1 puts from ..core.auth import AuthResolver inside the try: APM_DEPS_AVAILABLE block. Commit 3 moves it outside. The final state is correct, but please verify that the intent is clear: AuthResolver must be unconditionally importable (it has no optional deps), so it should live with the other top-level imports, not inside the APM_DEPS_AVAILABLE guard.


Low — nice to have

6. Document the lock hold trade-off

A brief comment in resolve() explaining why the lock is held during credential resolution would help future maintainers:

# Hold lock during entire resolution to prevent duplicate credential-helper
# popups when parallel downloads resolve the same (host, org) concurrently.
# Bounded by APM_GIT_CREDENTIAL_TIMEOUT (default 60s). After first resolution,
# all subsequent calls for the same key are O(1) cache hits.

7. _default_github_ctx is dead code

self._default_github_ctx is assigned but never read anywhere in the codebase. Consider removing it entirely to reduce confusion.


Summary

Finding Severity Status
3 of 4 resolver sites not connected Critical Must fix
Lazy init breaks credential-helper users Critical Must fix
GHES _build_repo_url without auth Critical Must fix
dist.old artifact tracked Medium Should fix
Import placement clarity Medium Should fix
Lock hold documentation Low Nice to have
Dead _default_github_ctx field Low Nice to have

The architectural direction is right. Once the three critical items are addressed, this will be a solid fix. Happy to re-review after changes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants