fix(auth): don't cache transient org-membership check failures#604
fix(auth): don't cache transient org-membership check failures#604norrietaylor wants to merge 1 commit into
Conversation
A transient GitHub outage (network error or unexpected 5xx) made the org-membership check fail closed for the request — correct — but the resulting False was cached for the full TTL (default 1 hour). A legitimate org member would then be locked out for up to an hour even after GitHub recovered, since every subsequent request returned the cached denial. Distinguish a definitive non-member (GitHub 404) from an indeterminate result via a private _IndeterminateMembershipError raised on transient errors. _check_org catches it, fails closed for the current request, and skips caching so the next request re-hits the API. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Too many files changed? Review this PR in Change Stack to see how the pieces fit before you dive in. No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThe PR modifies membership-checking logic to prevent transient/indeterminate failures from being cached. A new ChangesTransient membership failure caching prevention
Sequence DiagramsequenceDiagram
participant Client
participant _check_org
participant _fetch_membership
participant MembershipCache
Client->>_check_org: check_membership(org, user)
_check_org->>_fetch_membership: fetch membership
_fetch_membership-->>_check_org: _IndeterminateMembershipError (503/ConnectError)
_check_org-->>Client: False (fail-closed)
Note over MembershipCache: No cache write
Client->>_check_org: check_membership(org, user)
_check_org->>_fetch_membership: fetch membership (cache miss)
_fetch_membership-->>_check_org: True/False (successful result)
_check_org->>MembershipCache: cache successful result
_check_org-->>Client: True/False
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
Comment |
Root cause
OrgMembershipChecker._check_org(src/distillery/mcp/org_membership.py) caches the result of every membership lookup keyed by(username, org)for the configured TTL (default 1 hour). The underlying_fetch_membership/_fetch_public_memberhelpers fail closed — they returnFalsenot only for a definitive GitHub404(genuinely not a member) but also for transient conditions: a network error (httpx.HTTPError) or an unexpected status such as5xx/ rate-limit.Because
_check_orgcached thatFalseunconditionally, a momentary GitHub outage during a single request would lock a legitimate org member out for the full TTL — every subsequent request returned the cached denial even after GitHub recovered. Failing closed for the current request is correct; persisting that denial is not.Fix
Introduce a private
_IndeterminateMembershipErrorraised on the transient/unknown paths in_fetch_membershipand_fetch_public_member(network error and unexpected status), distinct from the definitive404 -> False._check_orgcatches it, returnsFalsefor the current request (still fail-closed), and skips the cache write so the next request re-hits the GitHub API. Definitive results (204 -> True,404 -> False) are cached exactly as before.Surgical change confined to
org_membership.py; no public API or signature changes.Acceptance
tests/test_mcp_org_membership.py:test_transient_error_is_not_cached— network error then recovery: first request denies (fail-closed), second request admits the member and re-hits the API.test_unexpected_status_is_not_cached— same for a503from the members endpoint.main(cachedFalse→ second request returnsFalse, API hit once) and pass with the fix.test_negative_result_is_cached(real404still cached),test_network_error_fails_closed,test_unexpected_status_returns_false, and the full 45-testtest_mcp_org_membership.pysuite pass.test_middleware.py,test_mcp_auth.py,test_mcp_http_transport.pyall pass (130 total).ruff checkandmypy --strictclean on touched files.🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests