Problem
`_dup_count_cache` in `src/api/admin/orgs.py` is a module-level Python dict. Under a multi-worker gunicorn deployment, each worker process holds its own copy. A merge or dismiss in worker A calls `_invalidate_dup_count_cache()` on A's cache only; workers B–N continue serving the stale count for up to 5 minutes (the TTL).
This is not currently a bug in production, which runs a single uvicorn worker, but the count will silently go stale on some workers if the worker count is ever increased.
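The failure mode can be reproduced without starting real worker processes. The sketch below is illustrative only: `WorkerCache` is a hypothetical stand-in for one process's private copy of `_dup_count_cache`, and the `(value, timestamp)` entry layout is an assumption, not the actual structure in `src/api/admin/orgs.py`.

```python
import time

TTL_SECONDS = 300  # the 5-minute TTL described above


class WorkerCache:
    """Hypothetical stand-in for one worker's private module-level cache."""

    def __init__(self):
        self._cache = {}  # plays the role of _dup_count_cache in this worker

    def get_count(self, compute, now=None):
        # `now` is injectable so the TTL can be exercised without sleeping.
        now = time.monotonic() if now is None else now
        entry = self._cache.get("count")
        if entry is not None and now - entry[1] < TTL_SECONDS:
            return entry[0]  # still fresh from this worker's point of view
        value = compute()
        self._cache["count"] = (value, now)
        return value

    def invalidate(self):
        # Only clears this worker's copy; other workers are unaffected.
        self._cache.clear()


def demo():
    duplicates = {"n": 7}  # stands in for the real duplicate count
    compute = lambda: duplicates["n"]
    worker_a, worker_b = WorkerCache(), WorkerCache()

    # Both workers warm their caches at t=0.
    worker_a.get_count(compute, now=0.0)
    worker_b.get_count(compute, now=0.0)

    # A merge is handled by worker A, which invalidates its own cache.
    duplicates["n"] = 6
    worker_a.invalidate()

    fresh_a = worker_a.get_count(compute, now=10.0)       # recomputed
    stale_b = worker_b.get_count(compute, now=10.0)       # served from B's cache
    recovered_b = worker_b.get_count(compute, now=301.0)  # TTL has expired
    return fresh_a, stale_b, recovered_b
```

In this sketch, worker B keeps returning the pre-merge count until its own entry ages out, which is the lag described above.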
Options
| Option | Effort | Notes |
|---|---|---|
| Redis | Medium | Add `redis-py` dep + `REDIS_URL` env var; replace dict with `GET`/`SETEX`/`DEL` |
| Single-worker ops constraint | Zero | Document `--workers 1` in deployment runbook; simplest |
| Accept TTL lag | Zero | Count is at most 5 min stale per worker; acceptable for low-traffic admin |
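
For reference, the Redis option might look roughly like the sketch below. It is not a proposed patch: the key name, the function names `get_dup_count` and `invalidate_dup_count`, and the `compute_count` callback are all hypothetical. The `client` argument is assumed to expose `get`, `setex`, and `delete` the way a redis-py client does; `InMemoryClient` is a stand-in so the sketch runs without a Redis server.

```python
DUP_COUNT_KEY = "admin:orgs:dup_count"  # hypothetical key name
TTL_SECONDS = 300  # keep the existing 5-minute TTL


def get_dup_count(client, compute_count):
    """Return the cached count, recomputing and storing it on a miss."""
    cached = client.get(DUP_COUNT_KEY)  # GET
    if cached is not None:
        return int(cached)  # int() accepts the bytes that redis-py returns
    count = compute_count()
    client.setex(DUP_COUNT_KEY, TTL_SECONDS, count)  # SETEX key ttl value
    return count


def invalidate_dup_count(client):
    """Drop the shared entry so every worker recomputes on its next read."""
    client.delete(DUP_COUNT_KEY)  # DEL


class InMemoryClient:
    """Stand-in for local runs only; it ignores the TTL and is per-process."""

    def __init__(self):
        self._data = {}

    def get(self, name):
        return self._data.get(name)

    def setex(self, name, time, value):
        self._data[name] = str(value).encode()

    def delete(self, *names):
        for name in names:
            self._data.pop(name, None)


# With redis-py installed, a real client could be built from the env var:
#   import os, redis
#   client = redis.Redis.from_url(os.environ["REDIS_URL"])
```

Because the entry lives in one shared store, a `DEL` issued by whichever worker handles a merge or dismiss is seen by every other worker on its next read.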
Recommended approach
At current scale: document the `--workers 1` constraint explicitly in the deployment runbook. Revisit with Redis if worker count is ever increased.
References
- `src/api/admin/orgs.py` — `_dup_count_cache`, `_invalidate_dup_count_cache()`, `count_org_duplicates()`
- AGENTS.md — cache caveat documented
- CR finding 3, conversation 2026-03-21