Skip to content

fix(register): make client_id TTL configurable, default 7d#28

Merged
babs merged 1 commit intomasterfrom
fix/configurable-client-registration-ttl
Apr 27, 2026
Merged

fix(register): make client_id TTL configurable, default 7d#28
babs merged 1 commit intomasterfrom
fix/configurable-client-registration-ttl

Conversation

@babs
Copy link
Copy Markdown
Owner

@babs babs commented Apr 27, 2026

Summary

The sealed client_id minted by POST /register previously used a hardcoded 24h TTL. Refresh tokens last 7 days, so any MCP client running longer than 24h failed its first refresh-token rotation with:

```
OAuth token refresh failed: Server returned error response:
invalid_client: client registration expired
```

even with a still-valid refresh token. Most MCP clients (codex_rmcp_client, Claude Code, Cursor, …) treat DCR as one-shot at startup, so they surface the error to the user instead of re-registering.

Fix

  • Default TTL bumped to 7 days, matching `refreshTokenTTL` so the sealed `client_id` always covers a full refresh-token cycle.
  • New `CLIENT_REGISTRATION_TTL` env var lets operators raise or lower the envelope. Go duration syntax (`168h`, `720h`, …); capped at 90d at startup — wider windows extend the residual-reuse threat of an exfiltrated `client_id` with no operational upside.
  • `Register` signature gains a `clientTTL` parameter. All call sites (`main.go` + tests + `e2e_test.go`) pass either the wired `cfg.ClientRegistrationTTL` or `handlers.DefaultClientTTL`.

Tests

  • Regression: `TestTokenRefresh_ClientLifecycleAroundTTL` covers three boundary points:
    • 48h-old (the bug repro — was failing pre-fix)
    • 6d23h-old (just inside the 7d default envelope, must succeed)
    • 1m past expiry (must reject with `invalid_client`)
  • Config validation (5 tests): default, custom value, non-positive reject, above-cap reject, bad-format reject (catches the `"7d"` footgun — `time.ParseDuration` doesn't accept the `d` suffix).

Rolling-deploy note

The TTL is sealed into each `client_id` at registration time. Bumping the env var only affects newly-issued `client_id`s; existing registrations stay on whatever TTL was in effect when they were minted. Currently-affected MCP clients have to re-register once (most clients trigger DCR on a fresh connection — a client restart suffices).

Documented in the new `docs/runbooks/client-registration-expired.md`, with operator playbook + the explicit "do NOT" list (no infinite TTL, no in-place extension).

Security trade-off

A longer `client_id` lifetime means a longer window for an exfiltrated `client_id` to be reused. Two mitigations bound that:

  1. 90d hard cap at config load. An operator can't accidentally configure indefinite reuse.
  2. `client_id` is unauthenticated metadata — it identifies the registration but doesn't authenticate. Reuse still requires driving the user through OIDC + consent (or holding a paired `refresh_token` for the rotation path).

A stateful "auto-extend on use" pattern (so an active client never expires while idle ones still age out) is a meaningful design change, not just a TTL bump — kept on the local backlog for later.

Test plan

  • CI green (`test`, `lint`, `keycloak-e2e`, `fuzz-smoke`, `manifest-prod`, `govulncheck`, `build`).
  • Manual: deploy with default `CLIENT_REGISTRATION_TTL` unset, register a fresh client, observe `client_id_expires_at - client_id_issued_at ≈ 604800` (7 × 86400) in the `/register` response.
  • Operator-side acknowledgement: existing MCP clients hit by the bug need to re-register once after the deploy. Restart the client.

The sealed client_id minted by POST /register previously used
a hardcoded 24h TTL. Refresh tokens last 7 days, so any MCP
client running longer than 24h failed its first refresh-token
rotation with `invalid_client: client registration expired`,
even with a still-valid refresh token. Most MCP clients
(codex_rmcp_client, Claude Code, Cursor, …) treat DCR as
one-shot at startup and surface the error to the user
instead of re-registering.

Fix:

  - Default TTL bumped to 7 days, matching refreshTokenTTL so
    the sealed client_id always covers a full refresh-token
    cycle.

  - New CLIENT_REGISTRATION_TTL env var lets operators raise
    or lower the envelope. Go duration syntax; capped at 90d
    at startup (wider windows extend the residual-reuse threat
    of an exfiltrated client_id with no operational upside).

  - Register handler signature gains a clientTTL parameter.
    All call sites (main.go + tests) pass either the wired
    cfg.ClientRegistrationTTL or DefaultClientTTL.

  - Regression test (TestTokenRefresh_ClientLifecycleAroundTTL)
    covers three boundary points: 48h-old (the bug repro),
    6d23h-old (just inside the 7d envelope), and 1m past
    expiry (must reject).

  - 5 config tests for the env var: default, custom, non-
    positive reject, above-cap reject, bad-format reject (the
    "7d" case — time.ParseDuration doesn't accept the d
    suffix; documented).

Rolling-deploy note: the TTL is sealed into each client_id at
registration time. Bumping the env var only affects newly-
issued client_ids; existing registrations stay on whatever
TTL was in effect when they were minted. Documented in the
new docs/runbooks/client-registration-expired.md.
@babs babs merged commit 2597d5f into master Apr 27, 2026
7 checks passed
@babs babs deleted the fix/configurable-client-registration-ttl branch April 27, 2026 23:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant