feat(replay): single-use claim on consent + callback state by babs · Pull Request #22 · babs/mcp-auth-proxy

babs · 2026-04-27T21:33:30Z

Summary

Closes T1.2 + T2.1 from misc/next-steps.md. Wires the existing replay store into two sealed-blob redemption sites that previously relied on TTL alone:

Consent token (POST /consent) — single-use claim on a per-render jti. Each GET /authorize render mints a fresh jti, so back-button = re-consent (a new claim slot) rather than a stuck-state error.
Callback state (GET /callback) — single-use claim on a per-session sid before the upstream OIDC exchange. A replayed /callback URL never fans out to the IdP token endpoint and never produces audit-log noise.

Eliminates the only "deliberate trade-off" footnote in specs.md (the consent-replay paragraph at the old line 337) and ships the audit-noise / IdP-fan-out fix the runbook called out for /callback.

Behavior when the replay store is wired (`REDIS_URL` set)

Site	Replay detected	Store error
`POST /consent`	400 `invalid_request` + `error_code=consent_replay` + `mcp_auth_replay_detected_total{kind="consent"}++`	503 `server_error` + `error_code=replay_store_unavailable` + `mcp_auth_access_denied_total{reason="replay_store_unavailable"}++`
`GET /callback`	400 `invalid_request` + `error_code=callback_state_replay` + `mcp_auth_replay_detected_total{kind="callback_state"}++`	(same shape)

Reuses the existing mcp_auth_access_denied_total{reason="replay_store_unavailable"} counter from /token so one alert rule covers every claim site. All four replay_store_error log lines now carry an op field (claim_authz_code / claim_refresh_family / claim_consent / claim_callback_state) for site-level correlation.

Backwards compatibility

nil replay store → stateless fallback (configured opt-out, matches /token).
Empty JTI / SessionID on tokens sealed by older binaries → stateless fallback. Lets a rolling deploy avoid 503-ing active users for the duration of consentTTL (5 min) / sessionTTL (10 min).

Deliberate non-claim

The /callback IdP-error redirect path (RFC 6749 §4.1.2.1) does NOT claim — the redirect is idempotent (no IdP fan-out, no token issuance) and claiming would burn the state on a transient IdP error and force re-auth on every legit retry. Documented inline in handlers/callback.go.

Test plan

CI green (test, lint, keycloak-e2e, fuzz-smoke, manifest-prod, govulncheck, build).
10 new handler tests in handlers/single_use_replay_test.go:
- TestConsent_SingleUse_ReplayDetected — second POST → 400 consent_replay, replay metric +1, approved counter does NOT double-count.
- TestConsent_SingleUse_ClaimsBeforeDeny — Deny is also one-shot.
- TestConsent_SingleUse_NilStoreFallback — nil store keeps the prior stateless behavior.
- TestConsent_SingleUse_LegacyEmptyJTI — token with empty JTI succeeds (in-flight rollout).
- TestConsent_SingleUse_StoreErrorFailClosed — 503 + replay_store_unavailable metric.
- TestCallback_SingleUse_ReplayDetected — second /callback → 400 callback_state_replay, replay metric +1.
- TestCallback_SingleUse_LegacyEmptySID — empty sid does NOT hit the replay branch.
- TestCallback_SingleUse_StoreErrorFailClosed — 503 + metric, verifyFunc not called.
- TestAuthorize_RenderConsent_PopulatesJTI — GET /authorize (consent fork) seals a non-empty jti.
- TestAuthorize_SilentRedirect_PopulatesSessionID — silent path seals a non-empty sid.
No existing test broke (full suite + -race runs locally).

Docs

specs.md — /consent paragraph reworked from "Why no replay-store integration" → "Single-use replay defense"; /callback step list grew step 3 ("Single-use claim on sid"); subsequent steps renumbered.
README.md — mcp_auth_replay_detected_total{kind} documentation extended with consent and callback_state.

Closes the only documented security trade-off in specs.md (consent_token replay) and removes IdP-fan-out + audit noise on retried /callback. When a replay store is wired (REDIS_URL): - GET /authorize seals a fresh JTI on every consent render and a fresh SessionID on every silent-redirect session. Per-render JTI keeps back-button safe (a re-render gets a new claim slot) while still single-use within one render. - POST /consent ClaimOnce on JTI before either approve or deny — a captured consent_token cannot be replayed for either decision. - GET /callback ClaimOnce on session.SessionID before the upstream OIDC exchange — a replayed /callback URL never fans out to the IdP token endpoint and never produces audit-log noise. - Replay detected → 400 invalid_request with error_code consent_replay / callback_state_replay, increments mcp_auth_replay_detected_total{kind=consent|callback_state}. - Store unreachable → 503 with error_code replay_store_unavailable, increments mcp_auth_access_denied_total{reason="replay_store_unavailable"} (reuses the existing /token counter so one alert rule covers every claim site). - nil store → stateless fallback (configured opt-out, mirrors /token). - Empty JTI / SessionID on tokens sealed by older binaries falls through to stateless behavior so an in-flight rollout doesn't 503 active users for the duration of consentTTL / sessionTTL. - The /callback IdP-error redirect path deliberately does NOT claim — the redirect is idempotent (no IdP fan-out, no token issuance) and claiming would force re-auth on every legit transient-error retry. - All four replay_store_error log sites now carry an `op` field (claim_authz_code / claim_refresh_family / claim_consent / claim_callback_state) for site-level log correlation. specs.md and README updated. 10 new handler tests cover replay detection, deny-replayed, nil-store fallback, legacy empty JTI/SID, store-error fail-closed, and JTI/SID population by /authorize for both forks.

Closes T2.3 from misc/next-steps.md. Adds a configurable grace window during which a refresh-rotation claim collision is treated as a benign concurrent submit (parallel-tab refresh, slow-network double-submit) and surfaces as 429 + error_code refresh_concurrent_submit + Retry-After: 2 instead of revoking the family. Outside the window the strict pre-T2.3 behavior applies — every collision still revokes the lineage (RFC 6749 §10.4 / OAuth 2.1 §6.1). REFRESH_RACE_GRACE_SEC tunes the window: default 2s, clamped to [0, 10] at startup. The 10s ceiling is a security cap (wider windows are statistically attacker-shaped); 0 disables the grace and keeps the strict pre-T2.3 contract for operators who prefer it. Implementation: - replay.Store.ClaimOrCheckFamily gains a graceWindow parameter and a third return bool `racing`. Atomicity preserved: the Lua script on Redis and the single-mutex section on MemoryStore both classify {revoked, racing, claimed, fresh} in one linearizable step, so the new branch cannot deadlock or double-decide under concurrent load. - The Lua script now stores epoch-ms timestamps (passed by Go so the proxy is authoritative about its clock) as claim values; legacy "1" placeholder values from pre-T2.3 binaries fall through to the strict revoke path during a rolling deploy. Documented in the script docstring + threat-model row 5. - MemoryStore gets a memoryEntry struct (expiresAt + setAt) so the racing branch can compare against the original claim time without polluting Mark semantics. - 429 on /token is a deliberate deviation from RFC 6749 §5.2 (which defines 400/401 for token-endpoint errors). Most OAuth client libraries treat 429+Retry-After as "back off and retry", which is exactly the desired behavior. Documented in specs.md. - Tests: 2 handler-level (racing happy path + zero disables grace), 2 per store (racing within grace + past-grace revokes), 6 config (default + zero + custom + non-int + negative + above-ceiling). - All four replay_store_error log sites already carry the `op` field from PR #22; this change preserves it.

babs merged commit ad18290 into master Apr 27, 2026
7 checks passed

babs deleted the feat/single-use-sealed-blobs branch April 27, 2026 21:40

babs mentioned this pull request Apr 27, 2026

feat(replay): refresh-rotation race grace window #24

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(replay): single-use claim on consent + callback state#22

feat(replay): single-use claim on consent + callback state#22
babs merged 1 commit intomasterfrom
feat/single-use-sealed-blobs

babs commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

babs commented Apr 27, 2026

Summary

Behavior when the replay store is wired (REDIS_URL set)

Backwards compatibility

Deliberate non-claim

Test plan

Docs

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Behavior when the replay store is wired (`REDIS_URL` set)