feat(health): add XState connection health monitor with per-URL state#382
feat(health): add XState connection health monitor with per-URL state#382
Conversation
Add per-instance connection lifecycle management using XState v5: - State machine: connecting → healthy | degraded | disconnected | failed - Auto-reconnect with exponential backoff (5s → 60s) and ±10% jitter - Bounded startup via GITLAB_INIT_TIMEOUT_MS (default 5s) - Health probes: unauthenticated HEAD /api/v4/version with skipAuth + rateLimit bypass - Probe reachability: only 200/401/403/405 count as reachable (404/400 = misconfigured) - Error classification: transient → auto-reconnect, auth/permanent → failed state - InitializationTimeoutError class for robust timeout detection (replaces string matching) - Late terminal bootstrap errors (auth/permanent) forwarded through timedOut guard - Per-URL InstanceState in ConnectionManager with Map<string, InstanceState> - normalizeInstanceUrl: strip API suffixes, default ports, canonicalize via URL constructor - parseTimerMs: clamp all timer env vars to MAX_SAFE_TIMEOUT_MS (2^31-1) - skipAuth option on enhancedFetch: suppresses auth headers + cookies case-insensitively - Context tools bypass tier/version filtering via registry check - bootstrapComplete flag replaces isConnected() for CONNECTION_FAILED guard - Disconnected mode: only manage_context tools exposed, tools/list_changed broadcast - Accurate error messages for OAuth-deferred mode (getInstanceInfo/getSchemaInfo) - TokenScopeDetector helpers extracted, aggregateFilterCounters with reason→counter map - postProcessRelatedReferences collapsed to single loop Closes #364
- HealthMonitor: state transitions, reconnect backoff, health probes, callbacks - Handlers: CONNECTION_FAILED with connecting/failed states, OAuth URL routing, bootstrap timeout with clearInflight, timedOut guard suppressing late reports, OAuth-scoped bootstrap-timeout test, never-settling timeout mocks - ConnectionManager: per-URL state, typed helpers (internals/cmStatic/cmInternal), getRegisteredHandler, zero as-any in enhanced tests - RegistryManager: context tool bypass, per-URL filter stats, description overrides - URL: normalizeInstanceUrl with default port stripping, API suffix edge cases - ToolAvailability: conservative >= 15.0 fallback for unknown tools
- Upgrade yarn 4.12.0 → 4.13.0 (binary + Dockerfile corepack prepare) - Bump hono 4.11.7 → 4.12.10 (CVE fix) - Update all remaining dependencies to latest versions
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 5 minutes and 44 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughAdds per-instance connection resilience: a new XState-based HealthMonitor with init timeout and exponential backoff/jitter, error classification and structured connection-failure payloads; refactors ConnectionManager/RegistryManager for per-URL InstanceState and URL normalization; handler bootstrap gating and disconnected fast-paths; config, fetch, server timeout clamping, tests, docs, and packaging bumps. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Handler
participant HealthMonitor
participant ConnectionManager
participant GitLab
Client->>Handler: request (tools/list or callTool)
Handler->>HealthMonitor: ensure monitoring / initialize(instanceUrl?)
HealthMonitor->>HealthMonitor: spawn actor -> connecting
HealthMonitor->>ConnectionManager: initialize(instanceUrl) / ensureIntrospected
par Connect & introspect
ConnectionManager->>GitLab: auth / introspection probes
and Init timeout guard
HealthMonitor->>HealthMonitor: enforce INIT_TIMEOUT_MS (Promise.race)
end
alt Initialized & healthy
GitLab-->>ConnectionManager: introspection success
ConnectionManager-->>HealthMonitor: initialized
HealthMonitor->>Handler: onStateChange -> healthy
Handler-->>Client: full tool list / perform tool
else Timeout or transient failures
HealthMonitor->>HealthMonitor: transition -> disconnected/degraded
HealthMonitor->>HealthMonitor: schedule reconnect (backoff + jitter)
HealthMonitor-->>Handler: onStateChange -> disconnected
Handler-->>Client: manage_context only or CONNECTION_FAILED response
else Auth/config permanent error
HealthMonitor->>HealthMonitor: transition -> failed
HealthMonitor-->>Handler: onStateChange -> failed
Handler-->>Client: manage_context only / CONNECTION_FAILED (permanent)
end
rect rgba(100,150,200,0.5)
Note over HealthMonitor,GitLab: Periodic health checks and tool success/failure reporting influence state transitions
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Test Coverage ReportOverall Coverage: 96.48%
|
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Pull request overview
Introduces a per-instance connection health subsystem (XState v5) to prevent startup hangs when GitLab is unreachable, and to gate tool exposure/behavior based on per-URL reachability. This fits the MCP server’s reliability layer by making tools/list and tool calls resilient to connectivity and authentication failures.
Changes:
- Add
HealthMonitor(new service) with per-URL state machines, bounded initialization, periodic probes, and auto-reconnect backoff. - Refactor
ConnectionManagerto store isolated per-URL instance state (client/introspection/scopes) with normalized URL keys and inflight deduplication. - Update handlers/registry filtering to support disconnected mode (
manage_context-only exposure) and improve error classification + structuredCONNECTION_FAILEDresponses.
Reviewed changes
Copilot reviewed 23 out of 25 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| yarn.lock | Updates dependencies (notably adds xstate, bumps SDK, undici, yarn, etc.). |
| package.json | Aligns dependency versions and adds xstate; updates Yarn version pin. |
| Dockerfile | Updates Yarn version usage and ensures Yarn release artifact is copied for builds. |
| .yarnrc.yml | Pins Yarn via yarnPath to 4.13.0 release file. |
| README.md | Documents connection resilience behavior and new env vars. |
| src/config.ts | Adds timer parsing/clamping helper and new health-monitor env vars (init timeout, backoff, probes, thresholds). |
| src/utils/url.ts | Adds normalizeInstanceUrl() for canonical per-instance URL keys. |
| src/utils/fetch.ts | Adds skipAuth option for intentionally unauthenticated probes (health/version). |
| src/utils/error-handler.ts | Adds connection-health error classification and a structured CONNECTION_FAILED error payload helper. |
| src/services/HealthMonitor.ts | New XState-based per-URL health monitor (timeouts, probes, reconnect, callbacks). |
| src/services/ConnectionManager.ts | Moves to per-URL InstanceState Map, normalizes keys, adds init dedup + stale-init protections, improves OAuth probe behavior. |
| src/services/ToolAvailability.ts | Adds isToolAvailableForInstance() to avoid singleton/current-instance leakage. |
| src/services/TokenScopeDetector.ts | Accepts explicit base URL, normalizes URLs, and refactors expiry/status handling. |
| src/registry-manager.ts | Adds reachability-aware tool exposure (context-only mode) and per-URL filter stats; refactors cache build pipeline. |
| src/handlers.ts | Integrates health gating, structured connection failures, bounded bootstrap/timeout behavior, and per-request URL selection/normalization. |
| src/server.ts | Ensures health monitor is shut down during graceful shutdown. |
| tests/unit/utils/url.test.ts | Adds unit tests for URL normalization (slashes, API suffixes, default ports, subpaths). |
| tests/unit/services/HealthMonitor.test.ts | Adds extensive HealthMonitor state-machine and error classification tests. |
| tests/unit/services/ToolAvailability.test.ts | Adds per-instance availability tests for tier/version gating. |
| tests/unit/services/ToolDescriptionOverrides.test.ts | Updates mocks for new per-instance availability method. |
| tests/unit/services/ConnectionManagerEnhanced.test.ts | Updates tests for per-URL state, OAuth probe via enhancedFetch(skipAuth), cache behaviors, and concurrency protections. |
| tests/unit/services/ConnectionManager.test.ts | Updates unit tests for per-URL state, deduped introspection, and reinitialize rollback behavior. |
| tests/unit/RegistryManager.test.ts | Updates tests for disconnected mode behavior, per-URL stats, and new availability mocking. |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/services/ConnectionManager.test.ts (1)
120-140: 🧹 Nitpick | 🔵 TrivialAssert that
refreshTokenScopes()forwardsTEST_URLto scope detection.These cases now seed per-instance state, but the mock never checks its argument. If the implementation regresses to the global base URL, this block still passes.
Minimal assertion to lock in the per-URL contract
it('should return false when token detection fails (returns null)', async () => { injectInstanceState(manager, TEST_URL); mockDetectTokenScopes.mockResolvedValue(null); const result = await manager.refreshTokenScopes(); expect(result).toBe(false); + expect(mockDetectTokenScopes).toHaveBeenCalledWith(TEST_URL); });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/services/ConnectionManager.test.ts` around lines 120 - 140, Update the tests so they assert that refreshTokenScopes() forwards the per-instance base URL to the scope detection mock: after calling manager.refreshTokenScopes() in the relevant specs (the ones that call injectInstanceState(manager, TEST_URL)), add an assertion that mockDetectTokenScopes was called with TEST_URL (e.g., expect(mockDetectTokenScopes).toHaveBeenCalledWith(TEST_URL) or equivalent). This ensures ConnectionManager.refreshTokenScopes() uses the instance URL rather than a global base URL; reference the functions ConnectionManager.refreshTokenScopes, injectInstanceState, and the mock mockDetectTokenScopes to locate where to add the assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@package.json`:
- Line 476: The ProxyAgent constructor call is passing an unsupported connect
option which will crash under undici v8; locate the new undici.ProxyAgent({...})
invocation (the one that includes uri, requestTls, HEADERS_TIMEOUT_MS,
BODY_TIMEOUT_MS) and remove the connect: { timeout: CONNECT_TIMEOUT_MS } entry,
or instead move that timeout logic into the TLS/proxy config (e.g. apply
CONNECT_TIMEOUT_MS via requestTls or other supported agent/pool options) so only
supported options (uri, requestTls, headersTimeout, bodyTimeout) are passed to
ProxyAgent.
In `@src/config.ts`:
- Around line 345-352: The assignment that sets server.headersTimeout by adding
5000 to HTTP_KEEPALIVE_TIMEOUT_MS can overflow Node's 32-bit timer ceiling
(MAX_SAFE_TIMEOUT_MS); update the code that computes server.headersTimeout (the
place referencing HTTP_KEEPALIVE_TIMEOUT_MS and server.headersTimeout) to clamp
the result to MAX_SAFE_TIMEOUT_MS (reuse parseTimerMs or Math.min with
MAX_SAFE_TIMEOUT_MS) so the sum (HTTP_KEEPALIVE_TIMEOUT_MS + 5000) never exceeds
the MAX_SAFE_TIMEOUT_MS; apply the same clamping logic to the other similar
assignment mentioned (lines referencing headersTimeout at the other location) so
both places guard against timer overflow.
In `@src/handlers.ts`:
- Around line 512-516: The connection-info enrichment currently calls
connectionManager.getInstanceInfo(effectiveInstanceUrl) unguarded after setting
bootstrapComplete, which can throw and abort before executeTool() runs; make
this best-effort by moving the getInstanceInfo/logInfo logic behind the
LOG_FORMAT==='verbose' check (so it doesn't run in condensed mode) and wrap the
call in a try/catch that swallows/logs errors (do not rethrow or change
bootstrapComplete), calling logInfo only if getInstanceInfo returns
successfully; update references to bootstrapComplete,
connectionManager.getInstanceInfo, effectiveInstanceUrl, LOG_FORMAT, logInfo,
and executeTool accordingly.
In `@src/services/TokenScopeDetector.ts`:
- Around line 186-197: The computeDaysUntilExpiry function currently only checks
for NaN but allows out-of-range dates (e.g., month 13) because Date.UTC
normalizes them; update computeDaysUntilExpiry to validate parsed components:
ensure month is between 1 and 12 and day is at least 1 and no greater than the
maximum days for that month (accounting for leap years for February) before
calling Date.UTC; if any component is out of range return null so invalid inputs
like "2026-13-40" produce null rather than a normalized future date.
In `@src/services/ToolAvailability.ts`:
- Around line 487-509: The availability logic in isToolAvailableForInstance
duplicates the same action lookup, version gate, tier check, and unknown-tool
fallback already implemented in isToolAvailable; extract that shared predicate
into a single private helper (e.g., private static
toolAvailabilityPredicate/toolAvailabilityFor({toolName, instanceInfo, action}))
that encapsulates getActionRequirement, parseVersion checks, isTierSufficient
and the '>=15.0' fallback, then have both isToolAvailable and
isToolAvailableForInstance call that helper so the rules are centralized and
won’t drift; keep existing behavior and return types unchanged.
In `@tests/unit/utils/url.test.ts`:
- Around line 97-110: Add a unit test that exercises normalizeInstanceUrl's
try/catch fallback for malformed inputs by calling normalizeInstanceUrl with a
clearly malformed URL (e.g., 'not a url' or 'http://[::1') and assert the
function returns the graceful fallback value (e.g., the original input or the
documented sanitized fallback) to ensure the catch branch around new URL() is
covered; place this assertion alongside the existing canonical-port cases in the
same test suite so the malformed input path is directly covered.
---
Outside diff comments:
In `@tests/unit/services/ConnectionManager.test.ts`:
- Around line 120-140: Update the tests so they assert that refreshTokenScopes()
forwards the per-instance base URL to the scope detection mock: after calling
manager.refreshTokenScopes() in the relevant specs (the ones that call
injectInstanceState(manager, TEST_URL)), add an assertion that
mockDetectTokenScopes was called with TEST_URL (e.g.,
expect(mockDetectTokenScopes).toHaveBeenCalledWith(TEST_URL) or equivalent).
This ensures ConnectionManager.refreshTokenScopes() uses the instance URL rather
than a global base URL; reference the functions
ConnectionManager.refreshTokenScopes, injectInstanceState, and the mock
mockDetectTokenScopes to locate where to add the assertion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 572cc540-02fb-4415-9259-14d80ed8cc7d
⛔ Files ignored due to path filters (2)
.yarn/releases/yarn-4.13.0.cjsis excluded by!**/.yarn/**yarn.lockis excluded by!**/yarn.lock,!**/*.lock
📒 Files selected for processing (23)
.yarnrc.ymlDockerfileREADME.mdpackage.jsonsrc/config.tssrc/handlers.tssrc/registry-manager.tssrc/server.tssrc/services/ConnectionManager.tssrc/services/HealthMonitor.tssrc/services/TokenScopeDetector.tssrc/services/ToolAvailability.tssrc/utils/error-handler.tssrc/utils/fetch.tssrc/utils/url.tstests/unit/RegistryManager.test.tstests/unit/handlers.test.tstests/unit/services/ConnectionManager.test.tstests/unit/services/ConnectionManagerEnhanced.test.tstests/unit/services/HealthMonitor.test.tstests/unit/services/ToolAvailability.test.tstests/unit/services/ToolDescriptionOverrides.test.tstests/unit/utils/url.test.ts
…best-effort log - Remove unsupported connect option from ProxyAgent (Undici v8 incompatibility) - Clamp server.headersTimeout to MAX_SAFE_TIMEOUT_MS to prevent 32-bit overflow - Make getInstanceInfo log enrichment best-effort (try/catch in verbose mode) - Add month/day range validation in computeDaysUntilExpiry - Add malformed URL fallback test, assert refreshTokenScopes URL forwarding - Fix ProxyAgent test to not expect connect option
|
@coderabbitai Re: outside-diff ConnectionManager.test.ts:120-140 — Fixed: mock now forwards args via spread ( |
|
Tip For best results, initiate chat on the files or code changes.
🧠 Learnings used |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/server.test.ts (1)
95-108:⚠️ Potential issue | 🟡 MinorMirror this new config export in the later re-mocks too.
The top-level mock is updated, but the
jest.doMock('../../src/config', ...)factories later in this file replace the whole module and still omitMAX_SAFE_TIMEOUT_MS. Any dual-mode re-import from those blocks will seeundefinedhere and driveheadersTimeouttoNaN.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/server.test.ts` around lines 95 - 108, The later jest.doMock('../../src/config', ...) factories in tests/unit/server.test.ts omit the MAX_SAFE_TIMEOUT_MS export, causing dual-mode re-imports to get undefined and produce NaN for headersTimeout; update every jest.doMock('../../src/config', ...) factory to include MAX_SAFE_TIMEOUT_MS (value 2_147_483_647) alongside the other mocked keys (SSE, STREAMABLE_HTTP, HOST, PORT, SSE_HEARTBEAT_MS, HTTP_KEEPALIVE_TIMEOUT_MS, packageName, packageVersion, LOG_FORMAT, LOG_FILTER, shouldSkipAccessLogRequest) so the mocked module shape matches the top-level jest.mock and prevents NaN in headersTimeout.
♻️ Duplicate comments (1)
src/services/TokenScopeDetector.ts (1)
186-198:⚠️ Potential issue | 🟡 MinorReject calendar-invalid expiry dates too.
2026-02-31and non-leap2025-02-29still pass the1..31guard, andDate.UTC()normalizes them into March. This helper therefore still returns a day count instead ofnullfor invalid input.Suggested fix
function computeDaysUntilExpiry(expiresAt: string | null): number | null { if (!expiresAt) return null; const [yearStr, monthStr, dayStr] = expiresAt.split('-'); const year = Number(yearStr); const month = Number(monthStr); const day = Number(dayStr); if (Number.isNaN(year) || Number.isNaN(month) || Number.isNaN(day)) return null; if (month < 1 || month > 12 || day < 1 || day > 31) return null; - const expiryUtcMs = Date.UTC(year, month - 1, day); + const expiry = new Date(Date.UTC(year, month - 1, day)); + if ( + expiry.getUTCFullYear() !== year || + expiry.getUTCMonth() !== month - 1 || + expiry.getUTCDate() !== day + ) { + return null; + } + const expiryUtcMs = expiry.getTime(); const now = new Date(); const todayUtcMs = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()); return Math.ceil((expiryUtcMs - todayUtcMs) / (1000 * 60 * 60 * 24)); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/TokenScopeDetector.ts` around lines 186 - 198, computeDaysUntilExpiry currently accepts syntactically valid but calendar-invalid dates like "2026-02-31" because Date.UTC normalizes overflowed days; update computeDaysUntilExpiry to detect and reject calendar-invalid dates by constructing the UTC date (using Date.UTC(year, month-1, day) or new Date(...)) and then verifying the resulting UTC year, month and date match the parsed year, month-1 and day; if they don't match return null before computing expiryUtcMs and remaining days.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/handlers.ts`:
- Around line 707-713: Replace the plain Error sent to
HealthMonitor.getInstance().reportError with the typed
InitializationTimeoutError so the HealthMonitor classifies this as an
init-timeout; specifically, in the timeout branch that currently calls
reportError(requestInstanceUrl, new Error(`Initialization timed out after
${HANDLER_TIMEOUT_MS}ms`)), construct and pass new
InitializationTimeoutError(`Initialization timed out after
${HANDLER_TIMEOUT_MS}ms`) (and add the necessary import for
InitializationTimeoutError), leaving the subsequent
ConnectionManager.getInstance().clearInflight(requestInstanceUrl) call
unchanged.
In `@src/server.ts`:
- Around line 203-205: server.headersTimeout can become equal to
server.keepAliveTimeout when HTTP_KEEPALIVE_TIMEOUT_MS equals
MAX_SAFE_TIMEOUT_MS, which drops the 5s safety margin; change the assignment to
compute a desiredHeaders = Math.min(HTTP_KEEPALIVE_TIMEOUT_MS + 5000,
MAX_SAFE_TIMEOUT_MS) and then ensure it's strictly greater than
HTTP_KEEPALIVE_TIMEOUT_MS (e.g., if desiredHeaders <= HTTP_KEEPALIVE_TIMEOUT_MS
set desiredHeaders = Math.min(HTTP_KEEPALIVE_TIMEOUT_MS + 1,
MAX_SAFE_TIMEOUT_MS)), then assign server.headersTimeout = desiredHeaders;
reference server.keepAliveTimeout, server.headersTimeout,
HTTP_KEEPALIVE_TIMEOUT_MS and MAX_SAFE_TIMEOUT_MS when making the change.
---
Outside diff comments:
In `@tests/unit/server.test.ts`:
- Around line 95-108: The later jest.doMock('../../src/config', ...) factories
in tests/unit/server.test.ts omit the MAX_SAFE_TIMEOUT_MS export, causing
dual-mode re-imports to get undefined and produce NaN for headersTimeout; update
every jest.doMock('../../src/config', ...) factory to include
MAX_SAFE_TIMEOUT_MS (value 2_147_483_647) alongside the other mocked keys (SSE,
STREAMABLE_HTTP, HOST, PORT, SSE_HEARTBEAT_MS, HTTP_KEEPALIVE_TIMEOUT_MS,
packageName, packageVersion, LOG_FORMAT, LOG_FILTER, shouldSkipAccessLogRequest)
so the mocked module shape matches the top-level jest.mock and prevents NaN in
headersTimeout.
---
Duplicate comments:
In `@src/services/TokenScopeDetector.ts`:
- Around line 186-198: computeDaysUntilExpiry currently accepts syntactically
valid but calendar-invalid dates like "2026-02-31" because Date.UTC normalizes
overflowed days; update computeDaysUntilExpiry to detect and reject
calendar-invalid dates by constructing the UTC date (using Date.UTC(year,
month-1, day) or new Date(...)) and then verifying the resulting UTC year, month
and date match the parsed year, month-1 and day; if they don't match return null
before computing expiryUtcMs and remaining days.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 89a611ed-0e20-47bf-859e-f43a15de1159
📒 Files selected for processing (10)
src/config.tssrc/handlers.tssrc/server.tssrc/services/TokenScopeDetector.tssrc/utils/fetch.tstests/unit/handlers.test.tstests/unit/server.test.tstests/unit/services/ConnectionManager.test.tstests/unit/utils/fetch-config.test.tstests/unit/utils/url.test.ts
…te validation - Use InitializationTimeoutError in handler timeout path (was plain Error) - Clamp keepAliveTimeout to MAX_SAFE_TIMEOUT_MS - 5000 to preserve mandatory headersTimeout gap at the 32-bit timer ceiling - Validate calendar-correctness of expiry dates via Date roundtrip (rejects Feb 31, Feb 29 on non-leap years) - Add MAX_SAFE_TIMEOUT_MS to all 6 jest.doMock config re-mocks in server tests - Re-export InitializationTimeoutError through HealthMonitor mock in handler tests
|
@coderabbitai Review body items addressed: Duplicate — TokenScopeDetector.ts:186-198: Fixed — added Date roundtrip validation to reject calendar-invalid dates (Feb 31, Feb 29 on non-leap years). Outside-diff — server.test.ts:95-108: Fixed — added |
|
Tip For best results, initiate chat on the files or code changes.
🧠 Learnings used |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/server.test.ts (1)
95-108:⚠️ Potential issue | 🟡 MinorAdd assertions for the new clamp and shutdown paths.
Updating the mocks keeps this suite loading, but it still doesn't exercise the new
configureServerTimeouts()boundary behavior or the newHealthMonitorbranch ingracefulShutdown(). A regression in either code path would still pass here today.As per coding guidelines, "Use Jest for all tests and ensure all features have corresponding test coverage".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/server.test.ts` around lines 95 - 108, The test currently mocks config but doesn't assert the new timeout clamping or the alternate gracefulShutdown flow; update tests to exercise configureServerTimeouts() by setting config values to trigger clamping (e.g., set HTTP_KEEPALIVE_TIMEOUT_MS and SSE_HEARTBEAT_MS above/below MAX_SAFE_TIMEOUT_MS) and assert the resulting server timeout values are clamped as expected, and exercise gracefulShutdown() by mocking or instantiating HealthMonitor to return both healthy/unhealthy paths (or stub its methods) and assert the shutdown branch behaviors (calls made, timers cleared, or different exit logic). Use the existing jest.mock for config and add spies/mocks for HealthMonitor and the functions configureServerTimeouts() and gracefulShutdown() to verify the new control-flow branches are executed and their side effects asserted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/unit/handlers.test.ts`:
- Around line 1465-1468: The test currently resets the mock and calls
setupHandlers(mockServer) a second time but never asserts the retry actually
happened; update the test to assert that mockHealthMonitor.initialize was
invoked again (e.g.,
expect(mockHealthMonitor.initialize).toHaveBeenCalledTimes(2) or
expect(mockHealthMonitor.initialize).toHaveBeenCalledWith(...) after the second
call) so the retry path from setupHandlers is verified; locate the assertions
around mockHealthMonitor.initialize and setupHandlers in the failing test and
add the call-count assertion immediately after the second await.
---
Outside diff comments:
In `@tests/unit/server.test.ts`:
- Around line 95-108: The test currently mocks config but doesn't assert the new
timeout clamping or the alternate gracefulShutdown flow; update tests to
exercise configureServerTimeouts() by setting config values to trigger clamping
(e.g., set HTTP_KEEPALIVE_TIMEOUT_MS and SSE_HEARTBEAT_MS above/below
MAX_SAFE_TIMEOUT_MS) and assert the resulting server timeout values are clamped
as expected, and exercise gracefulShutdown() by mocking or instantiating
HealthMonitor to return both healthy/unhealthy paths (or stub its methods) and
assert the shutdown branch behaviors (calls made, timers cleared, or different
exit logic). Use the existing jest.mock for config and add spies/mocks for
HealthMonitor and the functions configureServerTimeouts() and gracefulShutdown()
to verify the new control-flow branches are executed and their side effects
asserted.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f4f8c0ea-27d4-4dba-b273-b27f4dd44a0f
📒 Files selected for processing (5)
src/handlers.tssrc/server.tssrc/services/TokenScopeDetector.tstests/unit/handlers.test.tstests/unit/server.test.ts
…izeInstanceUrl - Replace InstanceRegistry's private normalizeUrl() with shared normalizeInstanceUrl from utils/url.ts — eliminates default-port divergence between ConnectionManager and InstanceRegistry map keys - Update url.ts docstring to reflect consolidated usage - Sync README.md.in with README.md (Connection Resilience section was missing from template) - Clarify disconnected mode: tools remain listed during active reconnect (connecting state) - Add initialize call-count assertion in retry test - Add keepAlive clamp boundary test (MAX_SAFE_TIMEOUT_MS - 5000)
|
@coderabbitai Re: outside-diff server.test.ts:95-108 — Added keepAlive clamp boundary test exercising MAX_SAFE_TIMEOUT_MS ceiling. HealthMonitor.shutdown() coverage exists in HealthMonitor.test.ts. |
|
Tip For best results, initiate chat on the files or code changes.
🧠 Learnings used |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/services/ConnectionManagerEnhanced.test.ts (1)
795-810:⚠️ Potential issue | 🟡 MinorWrap jest.resetModules() and jest.doMock() setup inside try/finally blocks to ensure cleanup runs even if an assertion fails.
These tests mutate Jest's module registry and singleton state. The cleanup currently runs only after the final assertion, so if an expectation throws first, the temporary module graph persists and can cause cascade failures in subsequent tests.
🧪 Suggested pattern
it('should detect version via unauthenticated fetch in OAuth mode (enterprise instance)', async () => { + try { jest.resetModules(); jest.doMock('../../../src/config', () => ({ GITLAB_BASE_URL: 'https://oauth-gitlab.example.com', GITLAB_TOKEN: null, })); // ... remaining doMock calls ... const manager = OAuthConnectionManager.getInstance(); await manager.initialize(); expect(manager.getVersion()).toBe('17.3.0'); expect(manager.getTier()).toBe('premium'); expect(mockEnhancedFetchEnterprise).toHaveBeenCalledWith( expect.stringContaining('/api/v4/version'), expect.objectContaining({ skipAuth: true }), ); + } finally { + manager.reset(); + jest.resetModules(); + } - manager.reset(); - jest.resetModules(); });Apply the same pattern to lines 813–867 (free instance), 868–909 (network error), and 962–1081 (InstanceRegistry cache) tests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/services/ConnectionManagerEnhanced.test.ts` around lines 795 - 810, The test mutates Jest's module registry and singleton state without guaranteed cleanup; wrap the jest.resetModules() and corresponding jest.doMock()/module setup logic in a try/finally so cleanup always runs even if assertions fail — locate the blocks around OAuthConnectionManager.getInstance()/manager.initialize() and the calls to jest.doMock()/jest.resetModules() (same pattern also for the free instance block lines 813–867, network error 868–909, and InstanceRegistry cache 962–1081) and move the cleanup calls (manager.reset() and jest.resetModules()) into the finally section to ensure module registry and singleton state are always restored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/config.ts`:
- Around line 345-352: The parseTimerMs helper (and all count-based env parsers
that currently call Number.parseInt) must stop accepting partial numeric
strings; replace the parseInt-based parsing with a strict check that the env
value fully matches an unsigned integer (e.g. /^\d+$/) before converting with
Number and using it, otherwise return the provided fallback; keep the current
ceiling logic using MAX_SAFE_TIMEOUT_MS and ensure zero/negative or non-matching
values fall back to the fallback.
In `@src/services/ConnectionManager.ts`:
- Around line 376-385: The assignment uses a double-cast for features
(cachedIntrospection.features as unknown as GitLabInstanceInfo['features']);
change the CachedIntrospection definition so its features property is typed the
same as GitLabInstanceInfo['features'] (or a compatible union/type), then remove
the double-cast in the state.instanceInfo assignment (leave the single explicit
types for tier and schemaInfo assignments intact). Update
CachedIntrospection.features’ type and any upstream places that populate it so
the direct assignment in ConnectionManager (state.instanceInfo = { ... features:
cachedIntrospection.features ... }) no longer requires casting.
In `@src/services/TokenScopeDetector.ts`:
- Around line 173-184: The per-instance token logs need to include the instance
URL so failures/summaries are attributable; modify
logTokenSelfIntrospectionError to accept a url parameter (e.g.,
logTokenSelfIntrospectionError(status: number, url: string)) and add the url to
every log call’s metadata (e.g., include { url } with logDebug/logInfo), and
similarly thread the same url field into the expiry-warning metadata and the
per-instance scope summary paths (the other logging functions that emit expiry
warnings and scope summaries) so both success and failure logs include the
instance URL for multi-instance refreshes.
In `@src/utils/fetch.ts`:
- Around line 154-160: The ProxyAgent currently sets proxyTls.timeout but not
the upstream handshake timeout in requestTls, so update the ProxyAgent creation
(function/file uses ProxyAgent, requestTls, proxyTls, tlsOptions,
CONNECT_TIMEOUT_MS, undici) to apply CONNECT_TIMEOUT_MS to requestTls as
well—e.g., merge or add timeout: CONNECT_TIMEOUT_MS into the object passed as
requestTls (while preserving existing tlsOptions), so both the proxy CONNECT
phase and the post-CONNECT upstream TLS handshake are bounded by the same
timeout.
In `@tests/unit/services/HealthMonitor.test.ts`:
- Around line 806-813: After starting HealthMonitor.initialize(TEST_URL) create
a microtask flush and assert the actor was created before checking health: flush
XState microtasks (e.g., await a Promise-based microtask flush helper or call
your test suite's flushPromises), then assert the monitor has an actor (e.g.,
monitor.actors.size > 0 or equivalent public accessor) to ensure the actor
reached the 'connecting' state, and only then call
monitor.isAnyInstanceHealthy() to verify it returns true; reference
HealthMonitor.getInstance, HealthMonitor.initialize, and isAnyInstanceHealthy
when making the assertion change.
- Around line 468-485: Test has a race where XState schedules the actor invoke
as a microtask before mockInitialize is reached; after calling
monitor.initialize(TEST_URL) twice but before resolving the mocked init promise,
flush the microtask queue so the XState invoke runs (e.g., await
Promise.resolve() or await new Promise(r => queueMicrotask(r))) before calling
resolveInit(); update the test around HealthMonitor.getInstance().initialize /
mockInitialize to insert this microtask flush (or a shared flushPromises helper)
so mockInitialize is actually invoked before you resolve the pending init.
In `@tests/unit/utils/url.test.ts`:
- Around line 112-116: Add a regression test in tests/unit/utils/url.test.ts
that exercises the post-new-URL() suffix-stripping branch of
normalizeInstanceUrl by calling normalizeInstanceUrl with inputs containing
query/fragment parts (e.g., 'https://host/api/v4?x=1' and
'https://host/api/v4#frag') and assert their results match the normalized result
for the same base URL without query/fragment (e.g.,
expect(normalizeInstanceUrl('https://host/api/v4?x=1')).toBe(normalizeInstanceUrl('https://host/api/v4'))),
so the query/fragment normalization branch in normalizeInstanceUrl is covered.
---
Outside diff comments:
In `@tests/unit/services/ConnectionManagerEnhanced.test.ts`:
- Around line 795-810: The test mutates Jest's module registry and singleton
state without guaranteed cleanup; wrap the jest.resetModules() and corresponding
jest.doMock()/module setup logic in a try/finally so cleanup always runs even if
assertions fail — locate the blocks around
OAuthConnectionManager.getInstance()/manager.initialize() and the calls to
jest.doMock()/jest.resetModules() (same pattern also for the free instance block
lines 813–867, network error 868–909, and InstanceRegistry cache 962–1081) and
move the cleanup calls (manager.reset() and jest.resetModules()) into the
finally section to ensure module registry and singleton state are always
restored.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 58f53268-909c-4a88-ba83-901d01946432
⛔ Files ignored due to path filters (2)
.yarn/releases/yarn-4.13.0.cjsis excluded by!**/.yarn/**yarn.lockis excluded by!**/yarn.lock,!**/*.lock
📒 Files selected for processing (27)
.yarnrc.ymlDockerfileREADME.mdREADME.md.inpackage.jsonsrc/config.tssrc/handlers.tssrc/registry-manager.tssrc/server.tssrc/services/ConnectionManager.tssrc/services/HealthMonitor.tssrc/services/InstanceRegistry.tssrc/services/TokenScopeDetector.tssrc/services/ToolAvailability.tssrc/utils/error-handler.tssrc/utils/fetch.tssrc/utils/url.tstests/unit/RegistryManager.test.tstests/unit/handlers.test.tstests/unit/server.test.tstests/unit/services/ConnectionManager.test.tstests/unit/services/ConnectionManagerEnhanced.test.tstests/unit/services/HealthMonitor.test.tstests/unit/services/ToolAvailability.test.tstests/unit/services/ToolDescriptionOverrides.test.tstests/unit/utils/fetch-config.test.tstests/unit/utils/url.test.ts
…ssages - parseTimerMs rejects partial numeric strings via /^\d+$/ regex (parseInt accepted "120s" → 120, "1e3" → 1) - ProxyAgent requestTls gets timeout for upstream TLS handshake (was only on proxyTls for proxy tunnel) - Include instanceUrl in health check and token introspection error messages - Update ToolAvailability docstring: version=unknown allows all tools - Flush XState microtasks in dedup and connecting-state tests - Add query/fragment normalization regression test for normalizeInstanceUrl
|
@coderabbitai Re: outside-diff ConnectionManagerEnhanced.test.ts:795-810 — Won't add try/finally around jest.resetModules — this is a test infrastructure pattern used across 50+ tests in this file. Refactoring would be a separate cleanup PR. |
|
Tip For best results, initiate chat on the files or code changes.
( ˘•ω•˘ ) ✏️ Learnings added
🧠 Learnings used |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/handlers.test.ts (1)
189-210: 🧹 Nitpick | 🔵 TrivialCover concurrent
setupHandlers()deduplication.The retry path is covered, but the main contract of
healthMonitorStartup ??=is concurrent-session deduplication. Without aPromise.all(...)case, a regression back to per-sessioninitialize()/onStateChange()calls would still pass here.Suggested test
+ it('should share startup work across concurrent setupHandlers calls', async () => { + const otherServer = { + setRequestHandler: jest.fn(), + } as unknown as jest.Mocked<Server>; + + await Promise.all([setupHandlers(mockServer), setupHandlers(otherServer)]); + + expect(mockHealthMonitor.initialize).toHaveBeenCalledTimes(1); + expect(mockHealthMonitor.onStateChange).toHaveBeenCalledTimes(1); + expect(mockServer.setRequestHandler).toHaveBeenCalledTimes(2); + expect(otherServer.setRequestHandler).toHaveBeenCalledTimes(2); + });As per coding guidelines,
tests/**/*.test.{ts,tsx}: Use Jest for all tests and ensure all features have corresponding test coverage.Also applies to: 1476-1490
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/handlers.test.ts` around lines 189 - 210, Add a test that verifies concurrent calls to setupHandlers are deduplicated by running multiple simultaneous setupHandlers (e.g., via Promise.all) and asserting mockHealthMonitor.initialize and mockHealthMonitor.onStateChange are called exactly once; ensure the test still validates both handlers were registered (use getRegisteredHandler with ListToolsRequestSchema and CallToolRequestSchema) so the concurrent path exercises the same registration logic as the single-call test.
♻️ Duplicate comments (5)
src/services/ToolAvailability.ts (1)
488-510:⚠️ Potential issue | 🟠 MajorMirror the
'unknown'-version fast path inisToolAvailable()too.Line 495 now treats
version === 'unknown'as “defer filtering”, but the singleton-backed path still uses the old gates. That meansgetAvailableTools()/filterToolsByAvailability()can return a different tool set thanisToolAvailableForInstance()for the same instance state.🔧 Suggested follow-up
try { const instanceInfo = connectionManager.getInstanceInfo(); + if (instanceInfo.version === 'unknown') { + return true; + } const actionReq = this.getActionRequirement(toolName, action); if (actionReq) { const version = parseVersion(instanceInfo.version); if (version < parseVersion(actionReq.minVersion)) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/ToolAvailability.ts` around lines 488 - 510, The singleton-backed availability check in isToolAvailable should mirror the fast-path used in isToolAvailableForInstance: if instanceInfo.version === 'unknown' return true (i.e., defer filtering) so getAvailableTools/filterToolsByAvailability yield the same set as isToolAvailableForInstance for the same instance state; update isToolAvailable to check instanceInfo.version === 'unknown' before applying the conservative >=15.0 gate and keep references to isTierSufficient/getActionRequirement/action requirements behavior intact.src/services/TokenScopeDetector.ts (1)
224-274:⚠️ Potential issue | 🟡 MinorInclude the instance URL in the remaining per-instance token logs.
detectTokenScopes()now resolves a normalized per-instance URL, but the validation/network-error logs and the expiry/scope-summary branches inlogTokenScopeInfo()still emit no instance identifier. In multi-instance refreshes, those messages are hard to attribute to the failing GitLab host.🔍 Suggested change
export async function detectTokenScopes(baseUrl?: string): Promise<TokenScopeInfo | null> { const url = normalizeInstanceUrl(baseUrl ?? GITLAB_BASE_URL); @@ const parsed = TokenSelfResponseSchema.safeParse(raw); if (!parsed.success) { logDebug('Token self-introspection response validation failed', { + url, error: parsed.error.message, }); return null; } @@ } catch (error) { logDebug('Token scope detection failed (network error)', { + url, error: error instanceof Error ? error.message : String(error), }); return null; } } @@ export function logTokenScopeInfo( info: TokenScopeInfo, totalTools: number, baseUrl: string = GITLAB_BASE_URL, ): void { + const url = normalizeInstanceUrl(baseUrl); const availableTools = getToolsForScopes(info.scopes); const scopeList = info.scopes.join(', '); @@ logWarn(`Token "${info.name}" has expired! Please create a new token.`, { + url, tokenName: info.name, expiresAt: info.expiresAt, }); @@ logInfo(`Token "${info.name}" detected`, { + url, tokenName: info.name, scopes: scopeList, expiresAt: info.expiresAt ?? 'never', });Also applies to: 340-393
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/TokenScopeDetector.ts` around lines 224 - 274, The logging currently omits the normalized instance URL, making per-instance failures ambiguous; update detectTokenScopes() to include the resolved url in all logs (pass or embed url into calls such as logTokenSelfIntrospectionError(response.status, url) already present, and also include url in the catch logDebug call and in the calls to logTokenScopeInfo/any expiry/scope-summary logging) so that logDebug('Token scope detection failed (network error)') and any logTokenScopeInfo branches emit the instance identifier (use the url variable or pass it as an extra field/argument to those logging helpers).src/utils/fetch.ts (1)
154-163:⚠️ Potential issue | 🟠 MajorAlways set
requestTls.timeouton the proxy path.
ProxyAgentomitsAgentOptions.connect, and bothrequestTlsandproxyTlsare theConnectOptionshooks where Undici exposestimeout. BecauserequestTlsbecomesundefinedwhenhasTlsConfigis false, HTTPS-over-proxy requests lose the upstream TLS-handshake timeout and can exceedCONNECT_TIMEOUT_MS. BuildrequestTlsunconditionally and mergetlsOptionsinto it instead. (github.com)🔧 Suggested change
return new undici.ProxyAgent({ uri: proxyUrl, - requestTls: hasTlsConfig ? { ...tlsOptions, timeout: CONNECT_TIMEOUT_MS } : undefined, + requestTls: { ...tlsOptions, timeout: CONNECT_TIMEOUT_MS }, proxyTls: { timeout: CONNECT_TIMEOUT_MS }, headersTimeout: HEADERS_TIMEOUT_MS, bodyTimeout: BODY_TIMEOUT_MS, });For Undici `ProxyAgent`, do `requestTls` and `proxyTls` both extend `Client#ConnectOptions` (including `timeout`), and is `AgentOptions.connect` intentionally omitted from `ProxyAgentOptions`?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/fetch.ts` around lines 154 - 163, The ProxyAgent usage conditionally omits requestTls when hasTlsConfig is false, which drops the upstream TLS handshake timeout; always build requestTls and set its timeout to CONNECT_TIMEOUT_MS while merging tlsOptions when present so both requestTls and proxyTls include timeout. Update the undici.ProxyAgent constructor call (symbols: ProxyAgent, requestTls, proxyTls, tlsOptions, CONNECT_TIMEOUT_MS, hasTlsConfig) to create a requestTls object unconditionally that spreads tlsOptions if available and always includes timeout.src/config.ts (1)
411-463:⚠️ Potential issue | 🟠 MajorHarden the count-based env parsers too.
parseTimerMs()now rejects partial numerics, but these settings still useNumber.parseInt(). Inputs like1e3,100req, or0x10will be silently truncated and can collapse failure thresholds, pool sizes, retry counts, or rate limits to the wrong value. Route them through a shared strict integer parser instead, while preserving the current>= 0behavior forAPI_RETRY_MAX_ATTEMPTS.🔧 Suggested hardening
+function parseStrictInt(envValue: string | undefined): number | null { + const raw = envValue?.trim(); + if (!raw || !/^\d+$/.test(raw)) { + return null; + } + const parsed = Number(raw); + return Number.isSafeInteger(parsed) ? parsed : null; +} + +function parsePositiveInt(envValue: string | undefined, fallback: number): number { + const parsed = parseStrictInt(envValue); + return parsed !== null && parsed > 0 ? parsed : fallback; +} + +function parseNonNegativeInt(envValue: string | undefined, fallback: number): number { + const parsed = parseStrictInt(envValue); + return parsed !== null && parsed >= 0 ? parsed : fallback; +} + -const parsedFailureThreshold = Number.parseInt(process.env.GITLAB_FAILURE_THRESHOLD ?? '3', 10); -export const FAILURE_THRESHOLD = - Number.isFinite(parsedFailureThreshold) && parsedFailureThreshold > 0 - ? parsedFailureThreshold - : 3; +export const FAILURE_THRESHOLD = parsePositiveInt(process.env.GITLAB_FAILURE_THRESHOLD, 3); -const parsedPoolMaxConnections = Number.parseInt( - process.env.GITLAB_POOL_MAX_CONNECTIONS ?? '25', - 10, -); -export const POOL_MAX_CONNECTIONS = - Number.isFinite(parsedPoolMaxConnections) && parsedPoolMaxConnections > 0 - ? parsedPoolMaxConnections - : 25; +export const POOL_MAX_CONNECTIONS = parsePositiveInt( + process.env.GITLAB_POOL_MAX_CONNECTIONS, + 25, +); -const parsedMaxAttempts = Number.parseInt(process.env.GITLAB_API_RETRY_MAX_ATTEMPTS ?? '3', 10); -export const API_RETRY_MAX_ATTEMPTS = - Number.isFinite(parsedMaxAttempts) && parsedMaxAttempts >= 0 ? parsedMaxAttempts : 3; +export const API_RETRY_MAX_ATTEMPTS = parseNonNegativeInt( + process.env.GITLAB_API_RETRY_MAX_ATTEMPTS, + 3, +); -const parsedIpMaxRequests = Number.parseInt(process.env.RATE_LIMIT_IP_MAX_REQUESTS ?? '100', 10); -export const RATE_LIMIT_IP_MAX_REQUESTS = - Number.isFinite(parsedIpMaxRequests) && parsedIpMaxRequests > 0 ? parsedIpMaxRequests : 100; +export const RATE_LIMIT_IP_MAX_REQUESTS = parsePositiveInt( + process.env.RATE_LIMIT_IP_MAX_REQUESTS, + 100, +); -const parsedSessionMaxRequests = Number.parseInt( - process.env.RATE_LIMIT_SESSION_MAX_REQUESTS ?? '300', - 10, -); -export const RATE_LIMIT_SESSION_MAX_REQUESTS = - Number.isFinite(parsedSessionMaxRequests) && parsedSessionMaxRequests > 0 - ? parsedSessionMaxRequests - : 300; +export const RATE_LIMIT_SESSION_MAX_REQUESTS = parsePositiveInt( + process.env.RATE_LIMIT_SESSION_MAX_REQUESTS, + 300, +);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/config.ts` around lines 411 - 463, Replace all usages of Number.parseInt for count-based env vars with a shared strict integer parser (e.g., create a helper like parseStrictInt) that rejects partial numerics (so inputs like "1e3", "100req", "0x10" are invalid); then use that helper to parse GITLAB_FAILURE_THRESHOLD -> FAILURE_THRESHOLD, GITLAB_POOL_MAX_CONNECTIONS -> POOL_MAX_CONNECTIONS, GITLAB_API_RETRY_MAX_ATTEMPTS -> API_RETRY_MAX_ATTEMPTS (preserving the existing >= 0 acceptance), RATE_LIMIT_IP_MAX_REQUESTS -> RATE_LIMIT_IP_MAX_REQUESTS, and RATE_LIMIT_SESSION_MAX_REQUESTS -> RATE_LIMIT_SESSION_MAX_REQUESTS, keeping the same fallback defaults and >0 (or >=0 for API_RETRY_MAX_ATTEMPTS) validation logic. Ensure the helper returns NaN or throws on invalid input so the ternary checks remain correct and add unit tests for parseStrictInt and each constant to cover partial-numeric inputs.tests/unit/services/HealthMonitor.test.ts (1)
480-486:⚠️ Potential issue | 🟡 MinorUse the suite’s XState flush helper here instead of a single promise tick.
await Promise.resolve()is weaker thanflushXStateMicrotasks(). In the concurrent-init test it can raceresolveInit()before the actor reachesmockInitialize(), and in the connecting test the assertion can still pass via the empty-actors fast path. Use the shared helper in both places, and keep the second spec asserting that the actor exists before checkingisAnyInstanceHealthy().♻️ Suggested test hardening
- await Promise.resolve(); + await flushXStateMicrotasks(); ... - await Promise.resolve(); + await flushXStateMicrotasks(); + expect(monitor.getMonitoredInstances()).toContain(TEST_URL);Also applies to: 813-817
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/services/HealthMonitor.test.ts` around lines 480 - 486, Replace the weak microtask synchronization calls in the tests by using the suite helper flushXStateMicrotasks() instead of await Promise.resolve(): call flushXStateMicrotasks() after invoking monitor.initialize(TEST_URL) (both in the concurrent-init test and the connecting test locations) so the actor invoke reliably reaches mockInitialize before resolveInit() is called; keep the second spec’s assertion that the actor exists (after monitor.initialize and flushXStateMicrotasks()) before asserting monitor.isAnyInstanceHealthy(), and ensure references to resolveInit(), mockInitialize, monitor.initialize, flushXStateMicrotasks, and isAnyInstanceHealthy are updated accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/unit/handlers.test.ts`:
- Around line 704-731: Add a companion reachable-path assertion to the existing
"should allow manage_context through when disconnected" test: after the
unreachable assertions, set
mockHealthMonitor.isInstanceReachable.mockReturnValue(true) and
mockHealthMonitor.getState.mockReturnValue('healthy'), stub
mockRegistryManager.executeTool.mockResolvedValue({ context: 'info' }) for the
reachable case, call callToolHandler with the same manage_context/whoami params,
and assert the call succeeds and mockHealthMonitor.reportSuccess was called (and
reportError was not); this ensures manage_context goes through the normal
health-reporting path when the instance is reachable.
In `@tests/unit/services/HealthMonitor.test.ts`:
- Around line 587-595: The test currently only asserts that mockFetch ran;
update the assertion to verify the quick-health-check request shape by asserting
that mockFetch was called with an options object containing method: 'HEAD' and
skipAuth: true (e.g., expect(mockFetch).toHaveBeenCalledWith(expect.any(String),
expect.objectContaining({ method: 'HEAD', skipAuth: true }))), targeting the
call that probes TEST_URL after initMonitor()/quickHealthCheck(); apply the same
change to the similar spec around lines 822-836 so both tests validate the
unauthenticated HEAD probe shape.
---
Outside diff comments:
In `@tests/unit/handlers.test.ts`:
- Around line 189-210: Add a test that verifies concurrent calls to
setupHandlers are deduplicated by running multiple simultaneous setupHandlers
(e.g., via Promise.all) and asserting mockHealthMonitor.initialize and
mockHealthMonitor.onStateChange are called exactly once; ensure the test still
validates both handlers were registered (use getRegisteredHandler with
ListToolsRequestSchema and CallToolRequestSchema) so the concurrent path
exercises the same registration logic as the single-call test.
---
Duplicate comments:
In `@src/config.ts`:
- Around line 411-463: Replace all usages of Number.parseInt for count-based env
vars with a shared strict integer parser (e.g., create a helper like
parseStrictInt) that rejects partial numerics (so inputs like "1e3", "100req",
"0x10" are invalid); then use that helper to parse GITLAB_FAILURE_THRESHOLD ->
FAILURE_THRESHOLD, GITLAB_POOL_MAX_CONNECTIONS -> POOL_MAX_CONNECTIONS,
GITLAB_API_RETRY_MAX_ATTEMPTS -> API_RETRY_MAX_ATTEMPTS (preserving the existing
>= 0 acceptance), RATE_LIMIT_IP_MAX_REQUESTS -> RATE_LIMIT_IP_MAX_REQUESTS, and
RATE_LIMIT_SESSION_MAX_REQUESTS -> RATE_LIMIT_SESSION_MAX_REQUESTS, keeping the
same fallback defaults and >0 (or >=0 for API_RETRY_MAX_ATTEMPTS) validation
logic. Ensure the helper returns NaN or throws on invalid input so the ternary
checks remain correct and add unit tests for parseStrictInt and each constant to
cover partial-numeric inputs.
In `@src/services/TokenScopeDetector.ts`:
- Around line 224-274: The logging currently omits the normalized instance URL,
making per-instance failures ambiguous; update detectTokenScopes() to include
the resolved url in all logs (pass or embed url into calls such as
logTokenSelfIntrospectionError(response.status, url) already present, and also
include url in the catch logDebug call and in the calls to logTokenScopeInfo/any
expiry/scope-summary logging) so that logDebug('Token scope detection failed
(network error)') and any logTokenScopeInfo branches emit the instance
identifier (use the url variable or pass it as an extra field/argument to those
logging helpers).
In `@src/services/ToolAvailability.ts`:
- Around line 488-510: The singleton-backed availability check in
isToolAvailable should mirror the fast-path used in isToolAvailableForInstance:
if instanceInfo.version === 'unknown' return true (i.e., defer filtering) so
getAvailableTools/filterToolsByAvailability yield the same set as
isToolAvailableForInstance for the same instance state; update isToolAvailable
to check instanceInfo.version === 'unknown' before applying the conservative
>=15.0 gate and keep references to isTierSufficient/getActionRequirement/action
requirements behavior intact.
In `@src/utils/fetch.ts`:
- Around line 154-163: The ProxyAgent usage conditionally omits requestTls when
hasTlsConfig is false, which drops the upstream TLS handshake timeout; always
build requestTls and set its timeout to CONNECT_TIMEOUT_MS while merging
tlsOptions when present so both requestTls and proxyTls include timeout. Update
the undici.ProxyAgent constructor call (symbols: ProxyAgent, requestTls,
proxyTls, tlsOptions, CONNECT_TIMEOUT_MS, hasTlsConfig) to create a requestTls
object unconditionally that spreads tlsOptions if available and always includes
timeout.
In `@tests/unit/services/HealthMonitor.test.ts`:
- Around line 480-486: Replace the weak microtask synchronization calls in the
tests by using the suite helper flushXStateMicrotasks() instead of await
Promise.resolve(): call flushXStateMicrotasks() after invoking
monitor.initialize(TEST_URL) (both in the concurrent-init test and the
connecting test locations) so the actor invoke reliably reaches mockInitialize
before resolveInit() is called; keep the second spec’s assertion that the actor
exists (after monitor.initialize and flushXStateMicrotasks()) before asserting
monitor.isAnyInstanceHealthy(), and ensure references to resolveInit(),
mockInitialize, monitor.initialize, flushXStateMicrotasks, and
isAnyInstanceHealthy are updated accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ce71b882-845d-4fd4-a530-39ff19a2df3e
📒 Files selected for processing (8)
src/config.tssrc/services/HealthMonitor.tssrc/services/TokenScopeDetector.tssrc/services/ToolAvailability.tssrc/utils/fetch.tstests/unit/handlers.test.tstests/unit/services/HealthMonitor.test.tstests/unit/utils/url.test.ts
…or messages - Extract parseStrictInt helper with /^\d+$/ regex, replace all 5 remaining parseInt usages (FAILURE_THRESHOLD, POOL_MAX_CONNECTIONS, retry/rate counts) - Include resolvedUrl in getInstanceInfo/getSchemaInfo error messages - Add unknown-version fast path to isToolAvailable (mirrors isToolAvailableForInstance) - Include url in token scope detection validation/network error logs - Assert health check request shape (URL + method HEAD)
|
@coderabbitai All duplicate and outside-diff items from this review addressed:
|
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 25278
Two quick housekeeping items:
✏️ Learnings added
🧠 Learnings used |
…back intent - Add companion test: reachable manage_context goes through bootstrap, reports health - Expand OAuth fallback code explanation: intentional for startup health checks and non-OAuth transports, safe because GITLAB_BASE_URL always configured, #379 tracking
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (3)
src/utils/fetch.ts (1)
157-163:⚠️ Potential issue | 🟠 MajorAlways apply
CONNECT_TIMEOUT_MStorequestTlsin proxy mode.On Line 162,
requestTlsis only set whenhasTlsConfigis true. That leaves proxied upstream TLS handshake unbounded byCONNECT_TIMEOUT_MSwhenever no custom TLS options are configured.🔧 Proposed fix
- requestTls: hasTlsConfig ? { ...tlsOptions, timeout: CONNECT_TIMEOUT_MS } : undefined, + requestTls: { ...(hasTlsConfig ? tlsOptions : {}), timeout: CONNECT_TIMEOUT_MS }, proxyTls: { timeout: CONNECT_TIMEOUT_MS },In Undici ProxyAgent (v8+), when `requestTls` is omitted, what timeout applies to the upstream TLS handshake after CONNECT, and does `proxyTls.timeout` cover that upstream handshake phase?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/fetch.ts` around lines 157 - 163, The ProxyAgent configuration currently omits requestTls when hasTlsConfig is false, leaving the upstream TLS handshake after CONNECT unbounded; ensure requestTls is always set in the new undici.ProxyAgent call so CONNECT_TIMEOUT_MS is applied regardless of custom TLS options — i.e., set requestTls to an object that always includes timeout: CONNECT_TIMEOUT_MS and merge tlsOptions when hasTlsConfig is true (keep proxyTls: { timeout: CONNECT_TIMEOUT_MS } as is); update the configuration where ProxyAgent is constructed (look for ProxyAgent, requestTls, hasTlsConfig, tlsOptions, CONNECT_TIMEOUT_MS) so upstream TLS connections after CONNECT are bounded.tests/unit/services/HealthMonitor.test.ts (1)
587-599:⚠️ Potential issue | 🟡 MinorPin the unauthenticated probe options in these assertions.
These specs still pass if
quickHealthCheck()regresses to an authenticated or rate-limited request, because they only verify the URL/method or thatmockFetchran. Please assert theskipAuthandrateLimitflags here as well.💡 Suggested assertion updates
expect(mockFetch).toHaveBeenCalledWith( `${TEST_URL}/api/v4/version`, - expect.objectContaining({ method: 'HEAD' }), + expect.objectContaining({ + method: 'HEAD', + skipAuth: true, + rateLimit: false, + retry: false, + }), ); @@ - expect(mockFetch).toHaveBeenCalled(); + expect(mockFetch).toHaveBeenCalledWith( + `${TEST_URL}/api/v4/version`, + expect.objectContaining({ + method: 'HEAD', + skipAuth: true, + rateLimit: false, + retry: false, + }), + );As per coding guidelines,
tests/**/*.test.{ts,tsx}: Use Jest for all tests and ensure all features have corresponding test coverage.Also applies to: 824-840
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/services/HealthMonitor.test.ts` around lines 587 - 599, The test for health-checked connected state doesn't assert that the unauthenticated probe options are used, so regressions in quickHealthCheck() that add authentication or rate-limiting could still pass; update the assertions in the 'should verify with health check when already connected' test (which calls initMonitor() and checks monitor.getState(TEST_URL)) to expect mockFetch was called with `${TEST_URL}/api/v4/version` and an options object that contains method: 'HEAD' plus skipAuth: true and rateLimit: false (or the exact flag names used by quickHealthCheck()), and add the same stronger assertion pattern to the other affected spec range (lines ~824-840) to ensure the probe is unauthenticated and not rate-limited.src/services/ConnectionManager.ts (1)
452-458:⚠️ Potential issue | 🟠 MajorRemove the
currentInstanceUrlfallback from these request-scoped getters before merge.This is still a live cross-request leak: when callers omit
instanceUrl, these APIs silently read whichever URL last woncurrentInstanceUrl. Under concurrent OAuth traffic that can return another request’s client/version/tier and drive availability or routing decisions against the wrong instance. Please thread an explicit URL from request context through the remaining zero-arg callers, or fail fast wheninstanceUrlis omitted in multi-instance/OAuth paths.Based on learnings, remaining no-arg callers in
ToolAvailability,WidgetAvailability,workItemTypes, anderror-handlerstill rely on this fallback, which can leak cross-request state under concurrent OAuth traffic.Also applies to: 474-476, 528-613
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/ConnectionManager.ts` around lines 452 - 458, The resolveState method currently falls back to the mutable currentInstanceUrl which leaks cross-request state; change resolveState(instanceUrl?: string) to require an explicit instanceUrl (remove the currentInstanceUrl fallback) and throw a clear error when instanceUrl is omitted (or when multi-instance/OAuth mode is enabled) so callers must pass the correct URL; then update all zero-arg callers (ToolAvailability, WidgetAvailability, workItemTypes and the error-handler usages) to thread the explicit request-scoped URL into their calls (or propagate the error to fail fast) and ensure normalizeInstanceUrl is applied by the caller before calling resolveState.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/services/ToolAvailability.ts`:
- Around line 450-452: getRestrictedParameters currently applies tier/version
checks even when instanceInfo.version === 'unknown', causing gated fields to be
stripped during REST-only/OAuth-deferred flows; mirror the behavior in
isToolAvailable/isToolAvailableForInstance by adding an early-return that skips
parameter restriction and returns the original/unrestricted parameters when
instanceInfo.version === 'unknown'. Locate getRestrictedParameters in
ToolAvailability.ts and add the same guard (check instanceInfo.version ===
'unknown') before any tier/version filtering logic so tools remain visible with
full schemas until version detection completes; ensure this change matches the
behavior added to isToolAvailable and isToolAvailableForInstance.
In `@tests/unit/handlers.test.ts`:
- Around line 704-731: Add assertions to the test to ensure the disconnected
manage_context fast-path never touches connection/bootstrap APIs: after parsing
the result and before restoring mocks, assert that mockRegistryManager.getClient
and mockRegistryManager.ensureIntrospected were not called (e.g.
expect(mockRegistryManager.getClient).not.toHaveBeenCalled() and
expect(mockRegistryManager.ensureIntrospected).not.toHaveBeenCalled()). This
ensures the handler used the local fast-path and did not invoke GitLab client or
introspection logic.
---
Duplicate comments:
In `@src/services/ConnectionManager.ts`:
- Around line 452-458: The resolveState method currently falls back to the
mutable currentInstanceUrl which leaks cross-request state; change
resolveState(instanceUrl?: string) to require an explicit instanceUrl (remove
the currentInstanceUrl fallback) and throw a clear error when instanceUrl is
omitted (or when multi-instance/OAuth mode is enabled) so callers must pass the
correct URL; then update all zero-arg callers (ToolAvailability,
WidgetAvailability, workItemTypes and the error-handler usages) to thread the
explicit request-scoped URL into their calls (or propagate the error to fail
fast) and ensure normalizeInstanceUrl is applied by the caller before calling
resolveState.
In `@src/utils/fetch.ts`:
- Around line 157-163: The ProxyAgent configuration currently omits requestTls
when hasTlsConfig is false, leaving the upstream TLS handshake after CONNECT
unbounded; ensure requestTls is always set in the new undici.ProxyAgent call so
CONNECT_TIMEOUT_MS is applied regardless of custom TLS options — i.e., set
requestTls to an object that always includes timeout: CONNECT_TIMEOUT_MS and
merge tlsOptions when hasTlsConfig is true (keep proxyTls: { timeout:
CONNECT_TIMEOUT_MS } as is); update the configuration where ProxyAgent is
constructed (look for ProxyAgent, requestTls, hasTlsConfig, tlsOptions,
CONNECT_TIMEOUT_MS) so upstream TLS connections after CONNECT are bounded.
In `@tests/unit/services/HealthMonitor.test.ts`:
- Around line 587-599: The test for health-checked connected state doesn't
assert that the unauthenticated probe options are used, so regressions in
quickHealthCheck() that add authentication or rate-limiting could still pass;
update the assertions in the 'should verify with health check when already
connected' test (which calls initMonitor() and checks
monitor.getState(TEST_URL)) to expect mockFetch was called with
`${TEST_URL}/api/v4/version` and an options object that contains method: 'HEAD'
plus skipAuth: true and rateLimit: false (or the exact flag names used by
quickHealthCheck()), and add the same stronger assertion pattern to the other
affected spec range (lines ~824-840) to ensure the probe is unauthenticated and
not rate-limited.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 696d6b34-f65e-471f-af74-ba346dd6c6f6
📒 Files selected for processing (9)
src/config.tssrc/services/ConnectionManager.tssrc/services/HealthMonitor.tssrc/services/TokenScopeDetector.tssrc/services/ToolAvailability.tssrc/utils/fetch.tstests/unit/handlers.test.tstests/unit/services/HealthMonitor.test.tstests/unit/utils/url.test.ts
|
|
⛔ Learnings removed🧠 Learnings used✅ Actions performedReview triggered.
|
|
@coderabbitai Duplicate items from latest review:
|
…lways set requestTls timeout - getRestrictedParameters returns [] for version=unknown (consistent with isToolAvailable/isToolAvailableForInstance) — both cached and singleton paths - requestTls always includes CONNECT_TIMEOUT_MS even without custom TLS config to bound upstream TLS handshake in proxy mode - Assert disconnected manage_context skips getClient/ensureIntrospected
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain
No other dupes outstanding from this round on my side. Nice cleanup. 🏁 Script executed: Length of output: 2536
Proposed minimal fix (keeps existing merges, but always supplies a timeout object): --- a/src/utils/fetch.ts
+++ b/src/utils/fetch.ts
@@
return new undici.ProxyAgent({
uri: proxyUrl,
- // requestTls: upstream TLS handshake after CONNECT tunnel established
- // proxyTls: TLS handshake to the proxy itself (if HTTPS proxy)
- // Both need connect timeout to fully bound HTTPS-over-proxy connections.
- requestTls: hasTlsConfig ? { ...tlsOptions, timeout: CONNECT_TIMEOUT_MS } : undefined,
+ // requestTls: upstream TLS handshake after CONNECT tunnel established
+ // proxyTls: TLS handshake to the proxy itself (if HTTPS proxy)
+ // Both need connect timeout to fully bound HTTPS-over-proxy connections.
+ requestTls: tlsOptions
+ ? { ...tlsOptions, timeout: CONNECT_TIMEOUT_MS }
+ : { timeout: CONNECT_TIMEOUT_MS },
proxyTls: { timeout: CONNECT_TIMEOUT_MS },
headersTimeout: HEADERS_TIMEOUT_MS,
bodyTimeout: BODY_TIMEOUT_MS,
});Optional test hardening: in tests/unit/utils/fetch-config.test.ts, extend the ProxyAgent Timeout Configuration assertion to include requestTls: expect.objectContaining({ timeout: CONNECT_TIMEOUT_MS }).
One more housekeeping item: I previously saved a learning stating “requestTls is now always constructed with timeout”. Given the current code, that learning is premature. If you confirm, I’ll remove that learning entry to avoid confusion in future reviews. ✏️ Learnings added
🧠 Learnings used |
|
|
|
Tip For best results, initiate chat on the files or code changes.
⛔ Learnings removed🧠 Learnings used |
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
tests/unit/handlers.test.ts (1)
704-731:⚠️ Potential issue | 🟡 MinorLock in that disconnected
manage_contextstays off the bootstrap path.This still passes if the handler regresses to the normal bootstrap flow while returning the same local payload. Clear the bootstrap mocks before the call and assert
getClient()/ensureIntrospected()remain untouched.Suggested diff
// manage_context should NOT be blocked by health gate mockRegistryManager.executeTool.mockResolvedValue({ context: 'info' }); mockHealthMonitor.reportSuccess.mockClear(); mockHealthMonitor.reportError.mockClear(); + mockConnectionManager.getClient.mockClear(); + mockConnectionManager.ensureIntrospected.mockClear(); const result = await callToolHandler({ params: { name: 'manage_context', @@ expect(parsed.context).toBe('info'); // Fast-path does no GitLab I/O — must not mutate connection state expect(mockHealthMonitor.reportSuccess).not.toHaveBeenCalled(); expect(mockHealthMonitor.reportError).not.toHaveBeenCalled(); + expect(mockConnectionManager.getClient).not.toHaveBeenCalled(); + expect(mockConnectionManager.ensureIntrospected).not.toHaveBeenCalled();Based on learnings, the unreachable
manage_contextbranch is intended to operate on local state only and never call the GitLab API.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/handlers.test.ts` around lines 704 - 731, Before invoking callToolHandler for the disconnected manage_context case, clear the bootstrap-related mocks (e.g., mockGitlabBootstrap.getClient.mockClear() and mockGitlabBootstrap.ensureIntrospected.mockClear()) and after the call assert those methods were not called (expect(mockGitlabBootstrap.getClient).not.toHaveBeenCalled(); expect(mockGitlabBootstrap.ensureIntrospected).not.toHaveBeenCalled()). This ensures the manage_context branch uses only local state and never touches the bootstrap/GitLab API (referencing callToolHandler, getClient(), and ensureIntrospected()).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/unit/handlers.test.ts`:
- Around line 774-792: Tighten the test's assertion to ensure the state-change
callback only calls the URL-scoped refresh path: after invoking the captured
stateChangeCallback (from mockHealthMonitor.onStateChange) clear prior
invocations, invoke the callback, then assert mockRegistryManager.refreshCache
was called with the exact URL and that mockRegistryManager.refreshCache was not
called with zero arguments (i.e., ensure no unscoped refreshCache() call), so
the test verifies per-instance refresh behavior added by
setupHandlers/stateChangeCallback.
- Around line 1604-1621: The test currently asserts only the returned
CONNECTION_FAILED payload but must also verify that non-OAuth introspection
failures are reported to HealthMonitor; update the test that mocks
mockConnectionManager.ensureIntrospected to reject so that after calling
callToolHandler you also assert healthMonitor.reportError was invoked (e.g.,
toHaveBeenCalled or toHaveBeenCalledWith an error/metadata); reference the
existing mocks (mockConnectionManager.ensureIntrospected and callToolHandler)
and the test's healthMonitor mock (healthMonitor.reportError) and add a short
expectation that it was called once with an error-like argument.
---
Duplicate comments:
In `@tests/unit/handlers.test.ts`:
- Around line 704-731: Before invoking callToolHandler for the disconnected
manage_context case, clear the bootstrap-related mocks (e.g.,
mockGitlabBootstrap.getClient.mockClear() and
mockGitlabBootstrap.ensureIntrospected.mockClear()) and after the call assert
those methods were not called
(expect(mockGitlabBootstrap.getClient).not.toHaveBeenCalled();
expect(mockGitlabBootstrap.ensureIntrospected).not.toHaveBeenCalled()). This
ensures the manage_context branch uses only local state and never touches the
bootstrap/GitLab API (referencing callToolHandler, getClient(), and
ensureIntrospected()).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3ed69cb9-b610-4301-baed-750659f4b31a
📒 Files selected for processing (2)
src/handlers.tstests/unit/handlers.test.ts
| it('should invoke state change callback for tool list updates', async () => { | ||
| // Capture the callback registered via onStateChange | ||
| const stateChangeCallback = mockHealthMonitor.onStateChange.mock.calls[0]?.[0]; | ||
| expect(stateChangeCallback).toBeDefined(); | ||
|
|
||
| // Clear refreshCache call count from setup | ||
| mockRegistryManager.refreshCache.mockClear(); | ||
|
|
||
| // Simulate disconnected → healthy transition (should trigger broadcast) | ||
| if (stateChangeCallback) { | ||
| stateChangeCallback('https://gitlab.example.com', 'disconnected', 'healthy'); | ||
| // Flush microtasks + dynamic import resolution for fire-and-forget callback | ||
| await new Promise((resolve) => process.nextTick(resolve)); | ||
| await new Promise((resolve) => process.nextTick(resolve)); | ||
| } | ||
|
|
||
| // refreshCache should have been called with the instance URL by the state change callback | ||
| expect(mockRegistryManager.refreshCache).toHaveBeenCalledWith('https://gitlab.example.com'); | ||
| }); |
There was a problem hiding this comment.
Make the state-change callback assertion strict about URL-scoped cache refresh.
toHaveBeenCalledWith(url) still passes if the callback also issues an unscoped refreshCache(). That would blur the new per-instance behavior this PR is adding.
Suggested diff
// refreshCache should have been called with the instance URL by the state change callback
+ expect(mockRegistryManager.refreshCache).toHaveBeenCalledTimes(1);
expect(mockRegistryManager.refreshCache).toHaveBeenCalledWith('https://gitlab.example.com');Based on learnings, setupHandlers() uses zero-arg refreshCache() only during initial startup, while the state-change callback is the URL-scoped refresh path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/handlers.test.ts` around lines 774 - 792, Tighten the test's
assertion to ensure the state-change callback only calls the URL-scoped refresh
path: after invoking the captured stateChangeCallback (from
mockHealthMonitor.onStateChange) clear prior invocations, invoke the callback,
then assert mockRegistryManager.refreshCache was called with the exact URL and
that mockRegistryManager.refreshCache was not called with zero arguments (i.e.,
ensure no unscoped refreshCache() call), so the test verifies per-instance
refresh behavior added by setupHandlers/stateChangeCallback.
| it('should return CONNECTION_FAILED when introspection fails before bootstrap completes', async () => { | ||
| // isConnected returns true (already connected), but ensureIntrospected throws | ||
| // Since bootstrapComplete is still false, this goes to CONNECTION_FAILED path | ||
| mockConnectionManager.ensureIntrospected.mockRejectedValueOnce( | ||
| new Error('Introspection failed'), | ||
| ); | ||
|
|
||
| const result = await callToolHandler({ | ||
| params: { | ||
| name: 'test_tool', | ||
| arguments: {}, | ||
| }, | ||
| }); | ||
|
|
||
| expect(result.isError).toBe(true); | ||
| const parsed = JSON.parse(result.content![0].text); | ||
| expect(parsed.error_code).toBe('CONNECTION_FAILED'); | ||
| }); |
There was a problem hiding this comment.
Also assert default-URL introspection failures are reported to HealthMonitor.
This only checks the returned CONNECTION_FAILED payload. It still passes if the non-OAuth ensureIntrospected() failure stops updating health state.
Suggested diff
expect(result.isError).toBe(true);
const parsed = JSON.parse(result.content![0].text);
expect(parsed.error_code).toBe('CONNECTION_FAILED');
+ expect(mockHealthMonitor.reportError).toHaveBeenCalledWith(
+ 'https://gitlab.example.com',
+ expect.any(Error),
+ );Based on learnings, the bootstrap catch path is expected to both return CONNECTION_FAILED and call healthMonitor.reportError(...) for real bootstrap failures before timeout.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/handlers.test.ts` around lines 1604 - 1621, The test currently
asserts only the returned CONNECTION_FAILED payload but must also verify that
non-OAuth introspection failures are reported to HealthMonitor; update the test
that mocks mockConnectionManager.ensureIntrospected to reject so that after
calling callToolHandler you also assert healthMonitor.reportError was invoked
(e.g., toHaveBeenCalled or toHaveBeenCalledWith an error/metadata); reference
the existing mocks (mockConnectionManager.ensureIntrospected and
callToolHandler) and the test's healthMonitor mock (healthMonitor.reportError)
and add a short expectation that it was called once with an error-like argument.
| const parsedBaseDelay = parseInt(process.env.GITLAB_API_RETRY_BASE_DELAY_MS ?? '1000', 10); | ||
| export const API_RETRY_BASE_DELAY_MS = | ||
| Number.isFinite(parsedBaseDelay) && parsedBaseDelay > 0 ? parsedBaseDelay : 1000; | ||
| export const API_RETRY_MAX_ATTEMPTS = parseStrictInt(process.env.GITLAB_API_RETRY_MAX_ATTEMPTS, 3); |
There was a problem hiding this comment.
API_RETRY_MAX_ATTEMPTS is now parsed via parseStrictInt(), which rejects 0 because it enforces > 0. enhancedFetch() treats maxRetries=0 as a valid “no retries” configuration (single attempt), and the previous parsing logic allowed 0 (>= 0). Consider allowing 0 here (e.g., a dedicated non-negative int parser or a min parameter) so users can disable retries without toggling GITLAB_API_RETRY_ENABLED.


Summary
manage_context-only tool exposureKey Changes
Core:
src/services/HealthMonitor.ts(new)quickHealthCheck()— unauthenticated HEAD probe with skipAuth + rateLimit bypass, status < 500 = reachableInitializationTimeoutErrorclass for startup init timeout detection (used in performConnect)Connection:
src/services/ConnectionManager.tsInstanceStateviaMap<string, InstanceState>normalizeInstanceUrl()applied to all map keysHandlers:
src/handlers.tsisInstanceReachable()for request-scoped URLsbootstrapCompleteset after init + introspection succeed, before cache rebuildtimedOutguardmanage_contextfast-path — some actions (e.g. whoami) may attempt API calls that fail gracefullyUtils
normalizeInstanceUrl()— strip API suffixes, default ports, canonicalize origins via URL constructorparseTimerMs()— clamp all timer env vars to MAX_SAFE_TIMEOUT_MSskipAuthoption onenhancedFetch— suppress auth headers + cookies case-insensitivelyproxyTls.timeoutfor per-connection connect budget (not one-shot AbortSignal)classifyError()— cause.code support, HTTP 408, narrowed timeout heuristiccomputeDaysUntilExpiry()— calendar-valid date roundtrip checkRegistry:
src/registry-manager.ts+src/services/InstanceRegistry.tsnormalizeInstanceUrl(eliminates port divergence)aggregateFilterCounters()with reason→counter mappostProcessRelatedReferences()collapsed to single loopgetFilterStats(instanceUrl?)— treats connecting as reachable (consistent with isUnreachableMode)Config & Docs
GITLAB_TOOL_TIMEOUT_MS(default 120s) documented in READMETesting
bootstrap timeout with clearInflight, timedOut guard (resolve + reject paths), manage_context health skip,
connection tracker requestId forwarding
as anyRelated
Design Notes
OAuth fallback to GITLAB_BASE_URL: When OAuth is enabled but no per-request context is available (startup health checks, non-OAuth transports), the handler falls back to
GITLAB_BASE_URLinstead of returning an auth error. This is intentional — blocking health monitoring and tool-list initialization on missing context would degrade the startup experience. The fallback is safe becauseGITLAB_BASE_URLis always configured. Full URL threading is tracked in #379.Dependencies
Closes #364