Skip to content

fix(routing): re-fire external-MCP nudge periodically, not once per session#593

Merged
mksglu merged 1 commit into
mksglu:nextfrom
ousamabenyounes:fix/external-mcp-periodic-nudge
May 16, 2026
Merged

fix(routing): re-fire external-MCP nudge periodically, not once per session#593
mksglu merged 1 commit into
mksglu:nextfrom
ousamabenyounes:fix/external-mcp-periodic-nudge

Conversation

@ousamabenyounes
Copy link
Copy Markdown
Collaborator

@ousamabenyounes ousamabenyounes commented May 16, 2026

What

Replace the one-shot external-MCP PreToolUse nudge (#529) with a periodic one. The guidance — "wrap large external-MCP payloads in ctx_execute / ctx_fetch_and_index" — now re-fires every N matching tool calls (default 10, tunable via CONTEXT_MODE_EXTERNAL_MCP_NUDGE_EVERY).

Why

Reported as a follow-up on #567 by @Rajan-p-simform: with context-mode enabled but using the Jira MCP tool, token usage did not decrease. The user's session_analysis.md shows 55 Jira searches and input tokens growing 24K → 75K per turn — each full Jira response stayed in conversation history. Context compaction kicked in repeatedly just to keep the session alive.

Tracing the cause: hooks/core/routing.mjs:816 registers the external-MCP nudge via guidanceOnce("external-mcp", …). After the first matching MCP call in a session the nudge never re-fires. By call ~10–15 the original advisory has been compacted out of the model's recent window, and the agent stops piping Jira/Slack/Notion responses through ctx_execute. The remaining 40+ calls return full payloads straight into context.

The original guidanceOnce design works fine for Bash/Read/Grep — those guide tool category selection (one reminder is enough). External-MCP is different: each call brings new payload pressure, and the agent needs the reminder near the call site, not just at the start of the session.

How

  • New guidancePeriodic(type, content, sessionId, period) helper in hooks/core/routing.mjs — fires on the 1st, (period+1)th, (2·period+1)th… matching call. Counter is in-memory (same-process) + file-backed <guidanceDir>/<type>.count (cross-process), so hook invocations from the same logical session share state. On any IO/parse failure the helper falls back to firing — losing a counter is preferable to silently dropping the advisory.
  • External-MCP branch now calls guidancePeriodic("external-mcp", …, getExternalMcpNudgeEvery()). Default cadence: every 10 calls. Env override: CONTEXT_MODE_EXTERNAL_MCP_NUDGE_EVERY, range [1, 100], invalid values fall back to 10.
  • resetGuidanceThrottle now clears the in-memory counter Map as well as the existing _guidanceShown Set + marker dirs.
  • Bash/Read/Grep nudges keep their one-shot semantics — explicitly out of scope.
  • README "Routing-guidance environment variables" table documents the new env var (doc-sync gate per ~/.claude/opensource-kb/mksglu-context-mode.md).

Coverage added

tests/hooks/core-routing.test.ts:

  • The previous respects the once-per-session guidance throttle test was rewritten — it locked in exactly the behavior this PR changes.
  • re-fires guidance every N calls (default cadence = 10) — drives 22 matching MCP calls, asserts guidance fires on indices 0, 10, 20 and nothing else.
  • honors CONTEXT_MODE_EXTERNAL_MCP_NUDGE_EVERY to tune cadence — sets env to 3, drives 7 calls, asserts fires on indices 0, 3, 6.
  • falls back to default cadence on invalid env values — covers "0", "-1", "9999", "not-a-number", "".
  • resetGuidanceThrottle resets the periodic counter — confirms a fresh session restarts from tick 1.

Test plan

  • npx vitest run tests/hooks/core-routing.test.ts80 passed
  • npm run typecheck → clean
  • npm test3342 passed / 47 skipped. Two [vitest-pool]: Timeout terminating forks worker warnings on kiro-hooks.test.ts and statusline-sqlite.test.ts reproduce on plain upstream/next without this patch — pre-existing worker-teardown flakes, not caused by this change. All tests within those files pass.
  • npm run build → clean (no bundle diffs committed; per repo etiquette, cli.bundle.mjs / server.bundle.mjs rebuild in CI).

Before / after (Jira-style session, default cadence)

Tool call # Before this PR After this PR
1 guidance fires guidance fires
2–10 silent silent
11 silent guidance re-fires
12–20 silent silent
21 silent guidance re-fires
22–55 silent re-fires at 31, 41, 51

Net guidance cost over a 55-call session: ~6 firings × ~250 tokens ≈ 1.5K tokens — small compared to the 50K+ saved by the model actually piping large Jira responses through ctx_execute for the rest of the session.

Out of scope (intentional)

  • PostToolUse-side response interception (replacing large MCP responses with FTS5-stored summaries). That's a much bigger architectural change and would need maintainer alignment before any code lands.
  • Per-MCP-server granularity (Jira/Slack/Notion currently share the same external-mcp marker). Easy to extend later if the cadence ever needs per-server tuning.

…ession

mksglu#529 added a one-shot PreToolUse guidance for external MCP tools
(slack/jira/notion/gdrive…) telling the agent to wrap large payloads in
ctx_execute. In MCP-heavy sessions (mksglu#567 follow-up reports 55 Jira calls,
context growing 24K → 75K tokens) the single nudge is dropped by context
compaction and later tool responses flood the conversation unchecked.

Replace `guidanceOnce("external-mcp", …)` with a new `guidancePeriodic`
helper that fires on calls 1, N+1, 2·N+1, … with N defaulting to 10.
Tunable via `CONTEXT_MODE_EXTERNAL_MCP_NUDGE_EVERY` (range [1, 100],
invalid values fall back to 10). Bash/Read/Grep nudges keep their
one-shot behavior — they advise on tool *category* usage, not per-call
flood pressure.

Counter is in-memory + file-backed (`<guidanceDir>/<type>.count`) so it
stays coherent across hook process invocations within the same session.
On any IO/parse failure we fire (lose a counter rather than silently drop
the advisory).

Test plan
- npx vitest run tests/hooks/core-routing.test.ts → 80 passed
- npm run typecheck → clean
- New tests cover: default cadence over 22 calls, env-tuned cadence,
  invalid env value fallback, resetGuidanceThrottle clears the counter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mksglu mksglu merged commit 752d523 into mksglu:next May 16, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants