Skip to content

Sec #6: Prevent capability shadowing#517

Merged
kgprs merged 3 commits into
mainfrom
codex/prevent-capability-shadowing
Jun 22, 2026
Merged

Sec #6: Prevent capability shadowing#517
kgprs merged 3 commits into
mainfrom
codex/prevent-capability-shadowing

Conversation

@kgprs

@kgprs kgprs commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Summary

  • reject duplicated prompt names, resource URIs, and resource template URI templates before SDK registration can replace entries
  • apply the checks during startup and dynamic server reloads, with a final under-lock revalidation immediately before dynamic capability registration
  • reserve the dynamic gateway prompt name mcp-discover when dynamic tools are enabled

Behavior note

Prompt policy filtering now also runs on the dynamic-reload path before prompt collision validation. Previously that path filtered tools only, so denied prompts could still remain in dynamic available capabilities.

Root cause

PR #513 protected exposed tool names, but the MCP SDK also replaces prompts, resources, and resource templates when the same prompt name, resource URI, or resource template URI template is registered. The gateway did not validate those identifiers, so later capabilities could shadow earlier registrations.

Validation

  • go test ./pkg/gateway
  • go test ./pkg/gateway -run 'TestValidateExternalCapabilityNameCollisions|TestUpdateServerCapabilitiesRevalidatesCapabilityCollisionsUnderLock'
  • go test ./... fails in existing integration areas in this local environment: TestIntegrationWithElicitation and TestIntegrationShortLivedContainerCloses hit MCP initialize EOFs, and TestIntegrationCallToolClickhouse reports Docker unknown flag: --gateway-arg.

@kgprs kgprs marked this pull request as ready for review June 22, 2026 18:16
@kgprs kgprs requested a review from a team as a code owner June 22, 2026 18:16

@arielho-docker arielho-docker left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness pass

Read both reload paths, the validator, the capability types, and all four call sites. The validator logic is sound and fail-closed — both validations run before any state mutation in reloadServerCapabilities, so a collision returns early with the server untouched, and callers log/return or surface to the user. Self-exclusion (registeredCapabilityNameIndexes(serverName)), policy-before-collision ordering, reserved-prompt gating on DynamicTools, deterministic error ordering, and cross-type namespace independence all check out.

Two findings inline:

  • High: validate (in reloadServerCapabilities) and register (in updateServerCapabilities) run under two separate lock acquisitions, and the new prompt/resource/template index keys off serverCapabilities, which is only updated in the second — so concurrent reloads of distinct servers introducing the same new identifier can both pass and the later one silently shadows. Cheapest fix: re-validate inside updateServerCapabilities.
  • Low: the validator trims identifiers for comparison while the SDK keys on the raw string (over-rejection, errs safe).

Also worth noting in the description: routing prompt-policy filtering through the dynamic-reload path is a real behavior change (that path previously filtered only tools), not just a refactor.

🤖 Reviewed with Claude Code

Comment thread pkg/gateway/reload.go
if err := validateExternalToolNameCollisions(allowedServerCaps.Tools, existingToolRegistrations); err != nil {
return nil, err
}
if err := validateExternalCapabilityNameCollisions(allowedServerCaps, g.registeredCapabilityNameIndexes(serverName), g.DynamicTools); err != nil {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High — validation and registration are split across two lock acquisitions, so this check can still be raced.

This validation runs while reloadServerCapabilities holds capabilitiesMu (acquired at L403, released on return). But the actual AddPrompt/AddResource/AddResourceTemplate happens later in updateServerCapabilities, under a separate lock — see every caller: run.go:584591, run.go:720726, mcpadd.go:235254, activateprofile.go:298307. The lock is released in between, and updateServerCapabilities does not re-validate before registering.

Concrete race — two concurrent dynamic reloads of servers A and B both introducing prompt "summarize" (or the same resource URI / template):

  1. A validates against serverCapabilities (no B) → passes; releases lock; A has not written serverCapabilities yet.
  2. B validates against serverCapabilities (no A — A only wrote serverAvailableCapabilities) → passes.
  3. A's updateServerCapabilities registers "summarize", sets serverCapabilities[A].
  4. B's updateServerCapabilities registers "summarize"silently shadows A — exactly what this PR is meant to prevent.

Window is narrow (concurrent reloads of distinct servers introducing the identical new identifier — e.g. two list_changed firing together, or mcp-add racing a notification), and the validate/register split is inherited from #513. But see the note on registeredCapabilityNameIndexes below — the new checks key off state updated later than the tool check does, so they have a strictly wider window.

Cheapest fix: re-validate inside updateServerCapabilities under its lock, immediately before the Add* calls. (Alternatives: record a pending claim into serverCapabilities/a side index within this lock, or hold one contiguous lock across validate→register.)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 1cddacf8 by re-validating inside updateServerCapabilities while capabilitiesMu is held, immediately before removals/adds. That closes the split validate/register window for concurrent dynamic reloads because the second registrar now checks against any server that committed serverCapabilities after the earlier reloadServerCapabilities pass.

I also added TestUpdateServerCapabilitiesRevalidatesCapabilityCollisionsUnderLock for this path.

Comment thread pkg/gateway/reload.go
ResourceTemplates: make(map[string]string),
}

for serverName, caps := range g.serverCapabilities {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This index is built from g.serverCapabilities, which is written only in updateServerCapabilities (L585) — after the validating lock in reloadServerCapabilities is released. That's what widens the race in the comment on L426: the tool collision check reads g.toolRegistrations, which reloadServerCapabilities updates inside its own lock (L434–446), so a concurrent tool reload sees pending tools. The prompt/resource/template tracking has no such in-lock update, so a concurrent reload validates against a serverCapabilities snapshot that is missing the other reload's pending capabilities.

If you adopt the re-validate-in-updateServerCapabilities fix this becomes moot; otherwise consider recording the claim here (under this lock) so a concurrent validation observes it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed by the same updateServerCapabilities revalidation in 1cddacf8. The earlier registeredCapabilityNameIndexes call is still useful for fast fail-before-claim behavior, but the decisive check now also runs at the registration boundary under the lock, after any competing reload has had a chance to update serverCapabilities.

Comment thread pkg/gateway/tool_names.go Outdated
seen := make(map[string]capabilityIdentityRegistration, len(registrations))

for _, registration := range sortedCapabilityIdentityRegistrations(registrations) {
identifier := strings.TrimSpace(registration.identifier)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — the validator trims for comparison, but the SDK keys on the raw identifier.

identifier is the trimmed value, and it's used for the reserved / seen / existing lookups — but registration (AddPrompt/AddResource/…) uses the raw, untrimmed name as the SDK key. So the validator's notion of "equal" differs from the SDK's:

  • "x" and " x" are flagged as colliding even though the SDK treats them as distinct keys (false positive — rejects a valid config).
  • " mcp-discover " is rejected as reserved although it wouldn't actually shadow the gateway's mcp-discover.

It errs toward rejecting, so it's not a security hole — but it should key on the exact value the SDK registers. Keep the empty-after-trim check as a defensive guard, but do the reserved/seen/existing comparisons on the raw registration.identifier (or trim consistently at registration time too).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 1cddacf8. The validator still rejects identifiers that are empty after trimming, but reserved/seen/existing lookups now use the raw registration.identifier, matching the key the SDK registers.

Added TestValidateExternalCapabilityNameCollisionsUsesRawIdentifiersForComparison to cover "summarize" vs " summarize " and " mcp-discover ".

arielho-docker
arielho-docker previously approved these changes Jun 22, 2026
@kgprs kgprs merged commit 4833d8c into main Jun 22, 2026
8 checks passed
@kgprs kgprs deleted the codex/prevent-capability-shadowing branch June 22, 2026 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants