Skip to content

feat(gateway): migrate GatewayHost.ResolveSessionType off SessionId.IsSubAgent substring (Phase 5 / F-6 step 2)#557

Merged
sytone merged 2 commits into
mainfrom
feat/resolve-session-type-by-kind
May 26, 2026
Merged

feat(gateway): migrate GatewayHost.ResolveSessionType off SessionId.IsSubAgent substring (Phase 5 / F-6 step 2)#557
sytone merged 2 commits into
mainfrom
feat/resolve-session-type-by-kind

Conversation

@sytone

@sytone sytone commented May 26, 2026

Copy link
Copy Markdown
Owner

Summary

Phase 5 / F-6 step 2: migrate GatewayHost.ResolveSessionType off the legacy SessionId.IsSubAgent substring check. The sub-agent session-type discriminator is now driven by the typed AgentDescriptor.Kind signal sourced from IAgentRegistry. Closes #554.

This is the second migration step after PR #556 (which introduced AgentKind). The substring path stays alive only in the architecturally-allowlisted call sites: SessionStoreBase.InferSessionType (legacy read path), InProcessIsolationStrategy (defense-in-depth OR gate), and SessionsController (deferred to #555).

Changes

File Change
src/gateway/BotNexus.Gateway/GatewayHost.cs ResolveSessionType is now an instance method that reads _registry?.Get(session.AgentId)?.Kind instead of session.SessionId.IsSubAgent. New optional IAgentRegistry? registry = null ctor parameter (preserves back-compat with the 7 test composition roots). Cached CronChannelKey static field to remove per-dispatch allocation. New isNewSession parameter to enable the persisted-AgentSubAgent preservation rule (see Critique below).
tests/architecture/BotNexus.Architecture.Tests/AgentKindArchitectureTests.cs Removed GatewayHost.cs allowlist entry. Rewrote XML doc remarks to record migration history and call out SessionsController (#555) as the remaining deferred site.
tests/gateway/BotNexus.Gateway.Tests/GatewayHostResolveSessionTypeTests.cs NEW behavioural test file. 9 tests pin every row of the migration truth table including the persisted-AgentSubAgent preservation regression.

Critique sweep — 3 background agents

Following the standing multi-model critique policy. All three completed before commit-amendment.

Reviewer Model Verdict Adopted
security-review gpt-5.5 PASS — no vulnerabilities (7 focus areas: privilege escalation, registry poisoning, TOCTOU, info disclosure, optional DI failure mode, behaviour drift vs #556, test surface) n/a
code-review plan-vs-impl claude-opus-4.7 (after 4.6 returned 401) ✅ DoD #1 & #2 PASS · ⚠️ 3 SHOULD-CONSIDERs 2 of 3 adopted: added missing baseline truth-table row, renamed fragile literal. Skipped: dropping the "no registry injected" test (subtly different DI scenario from "null returned from registry")
rubber-duck bug-hunt gpt-5.3-codex 🔴 1 HIGH · 🟡 1 MEDIUM · 🟢 1 LOW All 3 adopted

HIGH adopted (production-bug-prevention)

Persisted AgentSubAgent sessions could be silently downgraded to UserAgent on next dispatch when the sub-agent descriptor had been Unregister()'d (sub-agent completion via DefaultSubAgentManager.cs:557) or was transiently absent after a gateway restart. Downstream effects include flipping Session.IsInteractive (Session.cs:35), SessionWarmupService.cs:183-190 visibility, and PreCompactionMemoryFlusher.cs:33-36 / SessionEndMemoryFlusher.cs:34-36 gating.

Fix: threaded isNewSession into ResolveSessionType and added a fail-closed branch that preserves AgentSubAgent on EXISTING sessions when no typed signal contradicts. The branch deliberately requires isNewSession=false so fresh dispatches do NOT re-open a substring back door — the #554 architecture-fence invariant stays intact.

Regression test: DispatchAsync_WhenSessionAlreadyPersistedAsAgentSubAgent_AndRegistryHasNoDescriptor_PreservesAgentSubAgent seeds an existing AgentSubAgent session via GetOrCreateAsync, then re-dispatches with a null registry and asserts the type is preserved.

LOW adopted

ChannelKey.From("cron") allocated per dispatch → cached as private static readonly CronChannelKey.

Plan-vs-impl adopted

  • Added baseline truth-table row: DispatchAsync_WhenRegistryReturnsNamedDescriptor_AndSessionIdIsPlain_StampsAsUserAgent.
  • Renamed fragile literal "session-with-subagent-descriptor""plain-session-id-1" so the SubAgent classification can only come from the typed descriptor, not from any accidental substring match.

Truth table (pinned by tests)

Descriptor.Kind SessionId shape Channel isNewSession Expected SessionType Test
SubAgent plain web true AgentSubAgent …ReturnsSubAgentDescriptor_StampsSessionTypeAsAgentSubAgent
Named ::subagent:: infix web true UserAgent (typed signal trumps substring) …ReturnsNamedDescriptor_AndSessionIdHasSubAgentInfix_StampsAsUserAgent ⭐ critical pin
null (registry returns null) ::subagent:: infix web true UserAgent (no substring fallback) …RegistryHasNoDescriptor_AndSessionIdHasSubAgentInfix_StampsAsUserAgent
null (no registry injected) ::subagent:: infix web true UserAgent (no substring fallback) …RegistryNotProvided_AndSessionIdHasSubAgentInfix_StampsAsUserAgent
null (deregister race) any any false + persisted=AgentSubAgent AgentSubAgent (preserved) …SessionAlreadyPersistedAsAgentSubAgent_…_PreservesAgentSubAgent ⭐ regression pin
Named plain web true UserAgent (baseline) …ReturnsNamedDescriptor_AndSessionIdIsPlain_StampsAsUserAgent
Named Soul shape web true Soul …WhenSessionIdIsSoul_StampsAsSoul
Named plain cron true Cron …WhenChannelTypeIsCron_StampsAsCron
SubAgent plain cron true AgentSubAgent (precedence) …WhenChannelTypeIsCron_AndDescriptorKindIsSubAgent_PrefersSubAgentBucketing

Validation

  • Build: 0 warnings, 0 errors under TreatWarningsAsErrors.
  • GatewayHostResolveSessionTypeTests: 9/9 passed (7 original + 2 added during critique fold-in).
  • BotNexus.Gateway.Tests: 1815/1816 passed (1 skipped, 0 failed).
  • Architecture fence (AgentKindArchitectureTests): 49/49 passed.
  • Full solution suite: ~4,600 passed / 41 skipped / 0 failed.

Follow-up

Closes #554

sytone and others added 2 commits May 26, 2026 12:12
…nd signal (Phase 5 / F-6 step 2 — closes #554)

Replaces the legacy `session.SessionId.IsSubAgent` substring parse in
`GatewayHost.ResolveSessionType` with a typed lookup via `IAgentRegistry`:
`descriptor.Kind == AgentKind.SubAgent`. When the registry has no descriptor
for the agent (transient deregister race or test composition) the host
defaults to `SessionType.UserAgent` rather than falling back to the substring
— the substring path is now fully cut from this method.

The `IAgentRegistry` ctor dependency is optional (nullable, defaults to
null) so existing test composition roots continue to compile unchanged.
Production DI already registers `DefaultAgentRegistry` (verified in
`GatewayServiceCollectionExtensions.cs:112`) so the runtime path always
gets the typed signal.

Behaviour pinned by 7 new tests in `GatewayHostResolveSessionTypeTests.cs`
covering each row of the truth table:
- SubAgent descriptor → AgentSubAgent
- Named descriptor + sub-agent-shaped SessionId → UserAgent (typed signal trumps substring)
- No descriptor + sub-agent-shaped SessionId → UserAgent (no fallback to substring)
- No registry at all + sub-agent-shaped SessionId → UserAgent
- Soul SessionId → Soul (regression pin)
- Cron channel type → Cron (regression pin)
- SubAgent descriptor + cron channel → AgentSubAgent (precedence pin)

Architecture fence updated: `GatewayHost.cs` allowlist entry removed from
`AgentKindArchitectureTests`; the remaining DEFERRED entry is
`SessionsController` (#555) which needs a persisted `SessionType` discriminator
on the session row before its read-side filter can be cut.

Closes #554.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ster; cache cron ChannelKey

Critique sweep amendments for the #554 ResolveSessionType migration.

Adopted from bug-hunt critique (gpt-5.3-codex):

- HIGH: persisted AgentSubAgent sessions could be silently downgraded to
  UserAgent on next dispatch when the sub-agent descriptor had been
  Unregister()'d (sub-agent completion) or was transiently absent after
  a gateway restart. This would flip Session.IsInteractive and warmup /
  memory-flush gating downstream.
  Fix: thread isNewSession into ResolveSessionType and preserve
  AgentSubAgent on EXISTING sessions when no typed signal contradicts.
  This deliberately requires isNewSession=false so fresh dispatches
  do NOT re-open a substring back door — the #554 architecture-fence
  invariant stays intact.

- MEDIUM: test suite missed the downgrade regression path.
  Fix: added DispatchAsync_WhenSessionAlreadyPersistedAsAgentSubAgent_
  AndRegistryHasNoDescriptor_PreservesAgentSubAgent — seeds an existing
  AgentSubAgent session via GetOrCreateAsync, then re-dispatches with a
  null registry and asserts the type is preserved.

- LOW: ChannelKey.From("cron") allocated on every dispatch.
  Fix: cached as private static readonly CronChannelKey.

Adopted from plan-vs-impl critique (claude-opus-4.7):

- Missing truth-table baseline row.
  Fix: added DispatchAsync_WhenRegistryReturnsNamedDescriptor_
  AndSessionIdIsPlain_StampsAsUserAgent.

- Fragile test literal containing the word 'subagent'.
  Fix: renamed 'session-with-subagent-descriptor' to 'plain-session-id-1'
  so the SubAgent classification can only come from the descriptor.

Skipped from plan-vs-impl critique:

- 'Drop redundant test at L104'. The two tests at L84 / L104 pin subtly
  different DI scenarios (null returned from registry vs. registry not
  injected at all). Both worth keeping.

Security review (gpt-5.5): PASS — no vulnerabilities, no changes needed.

Validation:
- Build: 0 warnings, 0 errors under TreatWarningsAsErrors.
- GatewayHostResolveSessionTypeTests: 9/9 passed (7 original + 2 new).
- BotNexus.Gateway.Tests: 1815/1816 passed (1 skipped).
- Architecture fence: 49/49 passed.
- Full solution suite: ~4,600 passed / 41 skipped / 0 failed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@sytone sytone merged commit 5b56783 into main May 26, 2026
10 checks passed
@sytone sytone deleted the feat/resolve-session-type-by-kind branch May 26, 2026 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Gateway] Migrate GatewayHost.ResolveSessionType off SessionId.IsSubAgent substring (Phase 5 follow-up)

1 participant