Skip to content

feat(workflow-executor): use stored OAuth credentials for MCP steps [PRD-367]#1665

Open
hercemer42 wants to merge 18 commits into
feat/prd-367-pr1-executor-oauth-credentialsfrom
feat/prd-367-pr2-executor-oauth-runtime
Open

feat(workflow-executor): use stored OAuth credentials for MCP steps [PRD-367]#1665
hercemer42 wants to merge 18 commits into
feat/prd-367-pr1-executor-oauth-credentialsfrom
feat/prd-367-pr2-executor-oauth-runtime

Conversation

@hercemer42

@hercemer42 hercemer42 commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

What & why

Executor-side runtime use of stored OAuth credentials for OAuth2-protected MCP servers (PRD-367, PR2). At an oauth2 MCP step the executor looks up the stored credential by (user, server), runs a refresh-token grant against the stored token_endpoint behind an expiry-skew cache, injects the Bearer token before connecting, retries once on an upstream 401 (covering both list-tools and the tool call), and pauses the step awaiting-input with awaitingInputReason: needs-oauth-reauth when there is no usable credential or the refresh is rejected. Bearer/none steps are unchanged.

Stacked PR / deploy ordering

  • Based on feat/prd-367-pr1-executor-oauth-credentials (PR feat(workflow-executor): add OAuth credential store + deposit endpoint (PRD-367 PR1) #1619), not main: PR2 depends on PR1's credential store / encryption / deposit endpoint, which is not yet merged. Review the diff against that branch.
  • Behaviour stays dormant until the Forest server serves authType + accepts the typed awaitingInputReason (PR1.5) and the frontend ships (PR3) — deploy the orchestrator first. Safe to deploy alone: the oauth2 path only activates once authType=oauth2 is served; bearer/none is unchanged.

Notable choices (large-PR annotation)

  • Serialization (option B): in-process mutex + DB re-read + single retry on invalid_grant, not a Postgres row lock — the executor and token endpoints can sit behind a client VPN, so no DB lock is held across the refresh HTTP call. Row lock is the documented hardening path if a strict reuse-detection provider plus real cross-process contention appears.
  • Executor isolation: the executor gains only a re-loadable tool source (reloadWithFreshAuth) plus a typed OAuthReauthRequiredError -> awaiting-input mapping; all token/credential/HTTP logic stays behind RemoteToolFetcher and the new OAuth token service.
  • Shared @forestadmin/ai-proxy change (consumed by the agent too) is additive-only (new loadToolsWithFailures / loadRemoteToolsWithFailures + mcp-auth-error export) plus behaviour-preserving delegation; authType is stripped in the McpClient constructor like id.
  • In-memory executor (dev-only) raises a ConfigurationError for oauth2 steps (no credential store wired).

Tests

294 tests across the touched suites (ai-proxy 59 + workflow-executor 235); build, lint, tsc clean; >=90% line/stmt coverage on changed files.

Known limitation

A re-auth at tool-call time leaves the step at idempotencyPhase=executing, so it needs a manual retry after re-auth; the common list-tools-time re-auth (pre-executor) resumes cleanly.

Part of PRD-367.

🤖 Generated with Claude Code

Note

Add stored OAuth credential support for MCP steps in workflow executor

  • Introduces OAuthTokenService to retrieve, cache, and refresh OAuth access tokens per (userId, mcpServerId), with per-key serialization via a new KeyedMutex and rotation-aware retry logic.
  • Adds refreshAccessToken and assertSafeTokenEndpoint to perform the OAuth2 refresh-token grant and validate token endpoints against SSRF risks before any network call.
  • Updates RemoteToolFetcher.fetch to inject Bearer tokens for OAuth2 MCP servers, detect auth failures, and retry once with a forced refresh before surfacing OAuthReauthRequiredError.
  • Updates McpStepExecutor and StepExecutorFactory so auth failures during tool invocation produce an awaiting-input outcome with reason needs-oauth-reauth instead of failing the step.
  • Adds a GET /list-mcp-tools HTTP endpoint for design-time tool listing, with typed 409/503 responses for reauth and missing encryption key cases.
  • POST /mcp-oauth-credentials now validates tokenEndpoint safety at deposit time; DELETE now evicts cached tokens immediately.
  • Risk: Runner and ExecutorHttpServer constructor signatures have changed to require OAuth-related dependencies.

Macroscope summarized 81b5607.

@linear-code

linear-code Bot commented Jun 16, 2026

Copy link
Copy Markdown

PRD-367

PRD-624

Comment thread packages/workflow-executor/src/oauth/token-service.ts Outdated
Comment thread packages/ai-proxy/src/mcp-auth-error.ts Outdated
Comment thread packages/ai-proxy/src/mcp-auth-error.ts Outdated
hercemer42 added a commit that referenced this pull request Jun 16, 2026
On the invalid_grant concurrent-rotation retry path the write-back used the pre-retry credential for the non-token fields; thread the credential whose token produced the grant through so a concurrent re-deposit is not partially reverted. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hercemer42 added a commit that referenced this pull request Jun 16, 2026
A 403 is a permission/scope failure that a token refresh or re-consent cannot resolve, so it no longer triggers the refresh + re-auth flow (which looped) and instead surfaces as an ordinary failure. The spec specifies retry on 401. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hercemer42 added a commit that referenced this pull request Jun 16, 2026
On the invalid_grant concurrent-rotation retry path the write-back used the pre-retry credential for the non-token fields; thread the credential whose token produced the grant through so a concurrent re-deposit is not partially reverted. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hercemer42 added a commit that referenced this pull request Jun 16, 2026
A 403 is a permission/scope failure that a token refresh or re-consent cannot resolve, so it no longer triggers the refresh + re-auth flow (which looped) and instead surfaces as an ordinary failure. The spec specifies retry on 401. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hercemer42 hercemer42 force-pushed the feat/prd-367-pr2-executor-oauth-runtime branch from 8d441d2 to 0e05efe Compare June 16, 2026 16:04
@qltysh

qltysh Bot commented Jun 16, 2026

Copy link
Copy Markdown

12 new issues

Tool Category Rule Count
qlty Structure Function with many returns (count = 9): create 5
qlty Structure Function with many parameters (count = 4): constructor 4
qlty Structure Function with high complexity (count = 12): invokeWithReauthRetry 3

Comment thread packages/workflow-executor/src/oauth/refresh-grant.ts
@qltysh

qltysh Bot commented Jun 16, 2026

Copy link
Copy Markdown

Qlty


Coverage Impact

⬆️ Merging this pull request will increase total coverage on feat/prd-367-pr1-executor-oauth-credentials by 0.04%.

Modified Files with Diff Coverage (22)

RatingFile% DiffUncovered Line #s
Coverage rating: A Coverage rating: A
packages/workflow-executor/src/runner.ts100.0%
Coverage rating: A Coverage rating: A
packages/workflow-executor/src/executors/step-executor-factory.ts100.0%
Coverage rating: A Coverage rating: A
packages/workflow-executor/src/build-workflow-executor.ts100.0%
Coverage rating: A Coverage rating: A
packages/workflow-executor/src/remote-tool-fetcher.ts100.0%
Coverage rating: A Coverage rating: A
packages/workflow-executor/src/http/executor-http-server.ts95.5%348
Coverage rating: B Coverage rating: B
packages/workflow-executor/src/adapters/server-ai-adapter.ts100.0%
Coverage rating: A Coverage rating: A
packages/workflow-executor/src/types/validated/step-outcome.ts100.0%
Coverage rating: A Coverage rating: A
packages/ai-proxy/src/mcp-client.ts100.0%
Coverage rating: C Coverage rating: C
packages/workflow-executor/src/adapters/ai-client-adapter.ts100.0%
Coverage rating: A Coverage rating: A
packages/ai-proxy/src/ai-client.ts100.0%
Coverage rating: A Coverage rating: A
packages/workflow-executor/src/errors.ts100.0%
Coverage rating: A Coverage rating: A
packages/workflow-executor/src/executors/mcp-step-executor.ts100.0%
Coverage rating: A Coverage rating: A
...ow-executor/src/adapters/step-outcome-to-update-step-mapper.ts100.0%
Coverage rating: A Coverage rating: A
...s/workflow-executor/src/adapters/always-error-ai-model-port.ts100.0%
Coverage rating: A Coverage rating: A
packages/workflow-executor/src/defaults.ts100.0%
Coverage rating: A Coverage rating: A
packages/ai-proxy/src/index.ts100.0%
Coverage rating: A Coverage rating: A
packages/workflow-executor/src/http/mcp-oauth-credentials.ts100.0%
New Coverage rating: A
packages/workflow-executor/src/oauth/refresh-grant.ts100.0%
New Coverage rating: A
packages/workflow-executor/src/oauth/token-service.ts98.2%153
New Coverage rating: A
packages/workflow-executor/src/oauth/token-endpoint-url.ts100.0%
New Coverage rating: A
packages/ai-proxy/src/mcp-auth-error.ts92.3%25-27
New Coverage rating: A
packages/workflow-executor/src/oauth/keyed-mutex.ts100.0%
Total98.6%
🤖 Increase coverage with AI coding...
In the `feat/prd-367-pr2-executor-oauth-runtime` branch, add test coverage for this new code:

- `packages/ai-proxy/src/mcp-auth-error.ts` -- Line 25-27
- `packages/workflow-executor/src/http/executor-http-server.ts` -- Line 348
- `packages/workflow-executor/src/oauth/token-service.ts` -- Line 153

🚦 See full report on Qlty Cloud »

🛟 Help
  • Diff Coverage: Coverage for added or modified lines of code (excludes deleted files). Learn more.

  • Total Coverage: Coverage for the whole repository, calculated as the sum of all File Coverage. Learn more.

  • File Coverage: Covered Lines divided by Covered Lines plus Missed Lines. (Excludes non-executable lines including blank lines and comments.)

    • Indirect Changes: Changes to File Coverage for files that were not modified in this PR. Learn more.

Comment thread packages/workflow-executor/src/oauth/token-service.ts
hercemer42 added a commit that referenced this pull request Jun 17, 2026
On the invalid_grant concurrent-rotation retry path the write-back used the pre-retry credential for the non-token fields; thread the credential whose token produced the grant through so a concurrent re-deposit is not partially reverted. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hercemer42 added a commit that referenced this pull request Jun 17, 2026
A 403 is a permission/scope failure that a token refresh or re-consent cannot resolve, so it no longer triggers the refresh + re-auth flow (which looped) and instead surfaces as an ordinary failure. The spec specifies retry on 401. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hercemer42 hercemer42 force-pushed the feat/prd-367-pr2-executor-oauth-runtime branch from d4555ee to cf71d69 Compare June 17, 2026 07:41
hercemer42 added a commit that referenced this pull request Jun 18, 2026
On the invalid_grant concurrent-rotation retry path the write-back used the pre-retry credential for the non-token fields; thread the credential whose token produced the grant through so a concurrent re-deposit is not partially reverted. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hercemer42 added a commit that referenced this pull request Jun 18, 2026
A 403 is a permission/scope failure that a token refresh or re-consent cannot resolve, so it no longer triggers the refresh + re-auth flow (which looped) and instead surfaces as an ordinary failure. The spec specifies retry on 401. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hercemer42 hercemer42 force-pushed the feat/prd-367-pr2-executor-oauth-runtime branch from cf71d69 to 55309c7 Compare June 18, 2026 13:28
@hercemer42 hercemer42 force-pushed the feat/prd-367-pr1-executor-oauth-credentials branch from d42828e to 30b063c Compare June 23, 2026 09:49
hercemer42 added a commit that referenced this pull request Jun 23, 2026
On the invalid_grant concurrent-rotation retry path the write-back used the pre-retry credential for the non-token fields; thread the credential whose token produced the grant through so a concurrent re-deposit is not partially reverted. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hercemer42 added a commit that referenced this pull request Jun 23, 2026
A 403 is a permission/scope failure that a token refresh or re-consent cannot resolve, so it no longer triggers the refresh + re-auth flow (which looped) and instead surfaces as an ordinary failure. The spec specifies retry on 401. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hercemer42 hercemer42 force-pushed the feat/prd-367-pr2-executor-oauth-runtime branch from 55309c7 to 842d468 Compare June 23, 2026 15:09
@hercemer42

Copy link
Copy Markdown
Contributor Author

Agent code review — Claude Opus 4.8 (2026-06-18)

Reviewed the folded-in commits 7bf9357b (key-rotation re-consent) and 842d4680 (GET /list-mcp-tools) via the pr-review-toolkit code-reviewer; the runtime/max-run path was covered in the earlier review pass. No issues found.

Verified:

  • list-mcp-toolsuser_id is taken from the validated JWT (never the request); the mcpServerId query is validated (400 if absent). Responses leak no secrets (200 → {name, description} per tool; 409 → {awaitingInputReason, mcpServerId}). The typed needs-oauth-reauth 409 is set on ctx directly so it survives the error middleware (which would otherwise remap OAuthReauthRequiredError → 400).
  • Key-rotation classification (toGrantParams) — the try wraps only the decrypt, so a missing key stays terminal (ExecutorEncryptionKeyMissingError rethrown) and a key-present decrypt failure → needs-oauth-reauth; the grant call sits outside the try, so invalid_grant/transient/5xx errors cannot be misclassified into a re-consent loop.
  • Domain error classes (no raw Error), no PRD refs in comments, tool serialization type-safe.

@hercemer42 hercemer42 force-pushed the feat/prd-367-pr1-executor-oauth-credentials branch from e121a1a to 270dbcc Compare June 24, 2026 08:40
hercemer42 added a commit that referenced this pull request Jun 24, 2026
On the invalid_grant concurrent-rotation retry path the write-back used the pre-retry credential for the non-token fields; thread the credential whose token produced the grant through so a concurrent re-deposit is not partially reverted. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hercemer42 added a commit that referenced this pull request Jun 24, 2026
A 403 is a permission/scope failure that a token refresh or re-consent cannot resolve, so it no longer triggers the refresh + re-auth flow (which looped) and instead surfaces as an ordinary failure. The spec specifies retry on 401. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hercemer42 hercemer42 force-pushed the feat/prd-367-pr2-executor-oauth-runtime branch from 11aba95 to 1f58924 Compare June 24, 2026 08:51
Comment thread packages/workflow-executor/src/oauth/token-service.ts
@hercemer42 hercemer42 force-pushed the feat/prd-367-pr1-executor-oauth-credentials branch from 270dbcc to 2974d29 Compare June 25, 2026 08:42
hercemer42 and others added 8 commits June 29, 2026 20:18
On the invalid_grant concurrent-rotation retry path the write-back used the pre-retry credential for the non-token fields; thread the credential whose token produced the grant through so a concurrent re-deposit is not partially reverted. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A 403 is a permission/scope failure that a token refresh or re-consent cannot resolve, so it no longer triggers the refresh + re-auth flow (which looped) and instead surfaces as an ordinary failure. The spec specifies retry on 401. Addresses review on #1665.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cutor [PRD-367]

PR1 wired an in-memory credential store + deposit endpoint into buildInMemoryExecutor, so the previous "in-memory raises ConfigurationError for oauth2 steps" behavior was inconsistent: a credential could be deposited but never used. Wire an OAuthTokenService into the in-memory runner (sharing the same store instance the deposit endpoint writes to) so oauth2 steps work end-to-end in dev, matching the database executor.

The token service is now a required RunnerConfig/RemoteToolFetcher collaborator (both executors provide it), so the unreachable ConfigurationError guard and its fetcher test are removed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tials port [PRD-367]

The PR1 rebase moved the credentials store interface + types from stores/mcp-oauth-credentials-store to ports/mcp-oauth-credentials-store (the store file now holds only the Database/InMemory implementations). Import McpOAuthCredentialsStore and StoredMcpOAuthCredential from the new port path so the package compiles against the rebased PR1 base.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tion [PRD-367]

PR1 dropped enc_key_version from the credential store (no version-aware decrypt path), so the rotation write-back no longer carries encKeyVersion. Per PRD-367 key-rotation handling, a decrypt failure with the encryption key PRESENT (auth-tag mismatch from a since-rotated/hard-swapped key) is recoverable: toGrantParams now classifies it as OAuthReauthRequiredError (needs-oauth-reauth) so re-consent re-deposits under the new key. A missing key (ExecutorEncryptionKeyMissingError) stays terminal — re-consent cannot help and a re-deposit would 503.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ting [PRD-367]

New GET /list-mcp-tools?mcpServerId= route on the executor HTTP server for the orchestrator-engine MCP-server details page: resolves the caller's vault credential (user_id from the validated JWT, never the request), refreshes, injects the Bearer, and returns the server's tool definitions — reusing RemoteToolFetcher, no new fetch/refresh logic. A missing/unrefreshable credential returns a typed needs-oauth-reauth (409), not a generic error or empty list. Wired into both the database and in-memory executors so oauth2 tool listing works in dev too.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nses [PRD-367]

A literal JSON null (or other non-object) body from the token endpoint overwrote the {} parse default, so the subsequent payload.error / payload.access_token reads threw a TypeError instead of the typed OAuthRefreshError. Keep the {} default for non-object bodies so the status checks still surface a typed OAuthRefreshError.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nnect [PRD-367]

On DELETE /mcp-oauth-credentials/:mcpServerId, evict the in-process cached access
token for (user, server) so a disconnect takes effect immediately — otherwise the
executor keeps serving the cached token until it expires even though the credential
row is gone (surfaced by end-to-end testing).

Wires OAuthTokenService into both executor builders + the HTTP server (optional on
the options so credential-free tests need not construct one), and adds
OAuthTokenService.evict with coverage.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@hercemer42 hercemer42 force-pushed the feat/prd-367-pr2-executor-oauth-runtime branch from 63123b6 to b6f316e Compare June 29, 2026 18:35
Comment thread packages/workflow-executor/src/executors/mcp-step-executor.ts
Comment thread packages/workflow-executor/src/oauth/token-service.ts
Comment thread packages/ai-proxy/src/mcp-auth-error.ts
@hercemer42 hercemer42 marked this pull request as ready for review June 29, 2026 18:45
hercemer42 and others added 2 commits June 29, 2026 21:11
- list-mcp-tools: map ExecutorEncryptionKeyMissingError to the typed 503
  { code } the deposit endpoint already returns, so the details page shows
  the admin message instead of a generic error.
- token-service: encrypt the rotated refresh token inside the best-effort
  write-back try, so an encrypt failure can't fail an otherwise-valid
  getAccessToken.
- ai-proxy isMcpAuthError: an explicit status is authoritative — a 403 is
  not refreshable even when its message says "unauthorized"; fall back to
  the message only when no status is present.
- Trim comments to why-not-how.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The re-read recovery path only handles a peer instance's rotation, not our
own failed write-back (the stored token is unchanged, so re-read sees no
rotation and forces re-auth). Describe the actual fallback: re-authentication.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread packages/workflow-executor/src/oauth/refresh-grant.ts
Comment thread packages/workflow-executor/src/oauth/token-service.ts
…RD-624]

The token endpoint is where the executor POSTs the refresh grant (with
client credentials), but it was stored as an unconstrained string, so an
authenticated caller could aim the executor at an internal address (SSRF,
incl. cloud metadata).

Add assertSafeTokenEndpoint, enforced at deposit (400) and again before the
refresh POST (defence in depth for pre-existing rows):
- scheme must be http(s); https required in production (http stays allowed
  off-prod for the local OAuth sim);
- link-local / cloud-metadata (169.254.0.0/16, fe80::/10) blocked everywhere;
- loopback blocked in production;
- RFC1918 private ranges stay allowed — private OAuth providers are supported.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread packages/workflow-executor/src/oauth/token-endpoint-url.ts
hercemer42 and others added 2 commits June 29, 2026 22:07
…d [PRD-624]

The URL parser normalizes `::ffff:127.0.0.1` to the hex form `::ffff:7f00:1`,
which the IPv6 branch classified as neither loopback nor link-local — so a
mapped loopback/metadata address slipped past the SSRF block in production.
Extract the embedded IPv4 from `::ffff:` addresses and run it through the
IPv4 loopback/link-local checks.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…omment [PRD-624]

Match the codebase convention (run-to-available-step-mapper.ts).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread packages/workflow-executor/src/oauth/token-endpoint-url.ts Outdated
Comment thread packages/workflow-executor/src/oauth/token-endpoint-url.ts Outdated
Drop the local-test-tool and ticket references; explain the why (SSRF
rationale, the deliberate RFC1918 allowance, why no DNS resolution) rather
than enumerating the rules.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…nt guard [PRD-624]

0.0.0.0 (and ::) connects to loopback on Linux, so it was an SSRF bypass the
guard accepted. Classify the unspecified range (0.0.0.0/8, ::) and reject it
on every environment — it is never a valid token endpoint.

Surfaced by evaluating ipaddr.js, which flags it as `unspecified`; the WHATWG
URL parser already normalizes the decimal/hex/short-form IP encodings to
dotted form, so those were already covered.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread packages/workflow-executor/src/oauth/token-endpoint-url.ts Outdated
…oken guard [PRD-624]

Only the bare `localhost` was treated as loopback, so `localhost.` and the
`.localhost` TLD (RFC 6761) — both resolving to loopback — slipped past the
guard in production. Match any host whose last label is `localhost` (with or
without a trailing dot), without over-blocking real domains like
auth.localhost.example.com.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread packages/workflow-executor/src/oauth/token-endpoint-url.ts
…t guard [PRD-624]

Regression test proving `127.0.0.1.` / `169.254.169.254.` are rejected: the
guard runs `new URL()` first, which normalizes the trailing dot before
classification, so the trailing-dot form is not a bypass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@Scra3 Scra3 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of the OAuth2 MCP executor runtime (PRD-367 PR2). One blocking security issue (SSRF via redirect on the token endpoint) plus a few non-blocking hardening points. Note: the tool-call re-auth idempotency-phase issue is already fixed in the stacked #1724, so it's not raised here.

let response: Awaited<ReturnType<typeof fetch>>;

try {
response = await fetch(params.tokenEndpoint, { method: 'POST', headers, body });

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (blocking): fetch() follows redirects by default, so a deposited token endpoint that passes assertSafeTokenEndpoint but 3xx-redirects to 169.254.169.254/loopback re-sends the POST body to the internal host — leaking the refresh_token (and the client_secret in client_secret_post mode) and parsing the internal response as a token; pass redirect: 'manual' and treat any 3xx as an OAuthRefreshError.

await store.delete(userId, ctx.params.mcpServerId);
// Evict any in-process cached access token so the disconnect is immediate, not deferred until
// the cached token expires (the row is gone, but the runtime would otherwise still serve it).
this.options.oauthTokenService?.evict(userId, ctx.params.mcpServerId);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (non-blocking): the POST deposit handler has no symmetric evict(), so re-depositing a credential (reconnect without a prior DELETE) keeps serving the previously cached access token until it expires — mirror this evict on the deposit path.

mcpServerId,
);

this.cache.set(key, {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (non-blocking): evict() runs outside the KeyedMutex, so a disconnect landing during an in-flight refresh no-ops on the empty cache and this cache.set then re-populates a just-deleted credential (served until expiry); it is also process-local, so on a multi-instance deployment other instances never drop the token — consider an eviction epoch checked before cache.set.

throw new InvalidTokenEndpointError('it points at a link-local / metadata address');
}

if (isProduction()) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: the loopback block and the https requirement are gated on NODE_ENV === 'production', so a real deployment with NODE_ENV unset or misspelled silently allows loopback targets and cleartext http — prefer failing closed and opting into the relaxation for dev/test explicitly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants