fix(CubeProxy): implement singleflight pattern for routing and health checks #135
novahe wants to merge 1 commit into
…end checks
- Abstracted singleflight logic into a reusable `utils:singleflight_do` method (Go-like interface).
- Optimized faulty backend checks by switching to O(1) SISMEMBER and preventing cache stampedes.
- Hardened sandbox metadata fetching in `get_backend_address` to prevent Redis thundering herd.
- Introduced dedicated shared memory lock dictionaries for better cache isolation.
Signed-off-by: novahe <heqianfly@gmail.com>
d5ecb25 to f52eda9
Thanks for the PR.
Singleflight for routing: I agree this is valuable, but I have concerns about introducing locks in the data path. CubeProxy sits on every request's critical path. In practice, the existing refresh-on-hit already prevents stampedes for active sandboxes (the vast majority). The real gaps are: 1) cold start, where all workers start with an empty cache simultaneously; 2) synchronized TTL, where workers share the same random seed.
math.randomseed: As analyzed above, this is a good catch👍
is_faulty_backend rework: Not caching the faulty state is intentional. We want to avoid a stale "faulty" flag in cache/Redis persistently rejecting requests after a backend recovers. The
Removing refresh-on-hit: This overlaps with #133. Without live migration support I'd prefer to keep it for now.
Suggestion: Extract the math.randomseed fix and Replace
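The synchronized-TTL concern raised in the review (workers sharing one random seed) can be modelled with a short Python sketch. It is only an illustration of the effect, not CubeProxy's Lua code: `BASE_TTL`, `JITTER`, the seed values and the "seed plus worker pid" scheme are all assumptions made for the example.

```python
import random

BASE_TTL = 60.0   # seconds; hypothetical base cache TTL
JITTER = 0.2      # +/-20% jitter; hypothetical


def jittered_ttl(rng: random.Random) -> float:
    """Pick a TTL in [BASE_TTL * (1 - JITTER), BASE_TTL * (1 + JITTER)]."""
    return BASE_TTL * (1.0 + rng.uniform(-JITTER, JITTER))


def worker_ttls(seeds, n=3):
    """Return the first n TTLs that each simulated worker would pick."""
    return [[jittered_ttl(rng) for _ in range(n)]
            for rng in map(random.Random, seeds)]


# Every worker seeded identically: the "jitter" is the same everywhere,
# so entries cached at the same moment also expire at the same moment.
shared_seed = worker_ttls([1234, 1234, 1234, 1234])

# Seed mixed with a per-worker value (hypothetical pids): TTLs diverge.
per_worker_seed = worker_ttls([1234 + pid for pid in (101, 102, 103, 104)])
```

With a shared seed all four simulated workers produce the same TTL sequence, so jitter does nothing to spread expirations; mixing a per-worker value into the seed makes the sequences differ.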
@staryxchen Thanks for your thorough review! I've taken your advice and started splitting this PR into separate, more manageable ones.
I'll close this PR once all parts are moved. Thanks!
Problem
In high-concurrency environments, CubeProxy suffered from multiple race conditions and logic flaws that degraded performance and introduced "dog-pile" effects (cache stampedes):
- … `is_faulty_backend` check was effectively disabled as the cache was never populated. This resulted in either redundant `SMEMBERS` scans or missed detection.
- … `HGETALL` or `SISMEMBER`).
Proposed Changes
- … `utils:singleflight_do`, a cross-worker synchronization mechanism inspired by Go's `sync/singleflight`. It uses `resty.lock` with a Double-Checked Locking pattern and `xpcall` with tracebacks for robust error handling.
- … `get_backend_address`. Redis `HGETALL` queries are now limited to at most one per key globally during cache misses.
- … `SISMEMBER` checks and fixed the missing cache population.
- … `local_cache_locks` and `faulty_backend_locks` in `nginx.conf` to isolate lock states from data entries, preventing LRU eviction conflicts.
- … `init_worker_phase.lua` to ensure effective TTL jittering across all Nginx workers.
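The double-checked locking idea behind the singleflight change can be sketched in Python. This is a model of the pattern, not the PR's implementation: threads stand in for Nginx workers, a plain dict stands in for the shared memory cache, and `SingleFlightCache`, `demo`, `slow_loader` and the backend address are hypothetical names and values. The PR's `xpcall`-based error handling is only approximated here by Python's `with` statement.

```python
import threading
import time
from collections import defaultdict


class SingleFlightCache:
    """Cache whose misses are loaded by at most one caller per key at a time."""

    def __init__(self):
        self._cache = {}                           # stands in for the shared data dict
        self._locks = defaultdict(threading.Lock)  # stands in for a separate lock dict
        self._locks_guard = threading.Lock()       # protects creation of per-key locks

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks[key]

    def do(self, key, loader):
        # First check: lock-free fast path for cache hits.
        if key in self._cache:
            return self._cache[key]
        # Slow path: one caller per key at a time.
        # `with` releases the lock even if loader raises.
        with self._lock_for(key):
            # Second check: another caller may have filled the cache while we waited.
            if key in self._cache:
                return self._cache[key]
            value = loader(key)
            self._cache[key] = value
            return value


def demo(n_callers=8):
    """Fire n_callers concurrent lookups for one key; return (backend_calls, results)."""
    backend_calls = []

    def slow_loader(key):
        backend_calls.append(key)  # record each simulated backend round trip
        time.sleep(0.05)           # simulated network latency
        return "10.0.0.1:8080"     # hypothetical backend address

    cache = SingleFlightCache()
    results = []

    def caller():
        results.append(cache.do("sandbox-1", slow_loader))

    threads = [threading.Thread(target=caller) for _ in range(n_callers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(backend_calls), results
```

Calling `demo(8)` issues eight concurrent lookups for the same key; only the first caller to take the per-key lock runs the loader, and the others find the value on the second check. Keeping the locks in a structure separate from the cached data mirrors the PR's stated reason for dedicated lock dictionaries.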