fix(scx_lavd): Fix combinatorial state explosion in energy model generation causing absurd memory usage #3548
BHLuotianyi wants to merge 2 commits into
In the prior change that moved the CPU preference generation to a DP algorithm, `EnergyModelOptimizer::insert_best_pdsi()` and `insert_pds_combinations()` unconditionally preserved all identical `(performance, power)` states across identical or symmetric performance domains. This led to a massive combinatorial explosion of tracked `HashSet` states during startup, ballooning RSS memory to multiple gigabytes and hanging the startup process before the BPF scheduler could initialize.

This fix aggressively prunes equivalent states. For any given `(performance, power)` pair, if the new combination yields the same power profile but uses fewer performance domains (`pd_id_set.len()`), it replaces the old state. If it uses the same number of domains or more, it is discarded. This strictly bounds the DP state tree per performance bucket to a single optimal representative that favors leakage power and locality, solving the memory explosion and cutting the startup time to under 0.1s.

Resolves sched-ext#3340
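The replace-or-discard rule described in the commit message can be sketched as follows. This is a minimal illustration, not the code from the PR: `insert_pruned`, `PdIdSet`, and the `u64` key types are hypothetical names and types chosen for the example, and the real `insert_best_pdsi()` may differ in structure.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for one DP state: the set of performance-domain IDs it uses.
type PdIdSet = BTreeSet<usize>;

/// Keep at most one representative per (performance, power) key: the one that
/// spans the fewest performance domains. Returns true if the candidate was stored.
fn insert_pruned(
    states: &mut HashMap<(u64, u64), PdIdSet>,
    key: (u64, u64),
    candidate: PdIdSet,
) -> bool {
    // Store the candidate if the key is new, or if it uses strictly fewer
    // domains than the current representative. Ties are discarded.
    let keep = states
        .get(&key)
        .map_or(true, |existing| candidate.len() < existing.len());
    if keep {
        states.insert(key, candidate);
    }
    keep
}

fn main() {
    let mut states = HashMap::new();
    let key = (100, 40); // made-up (performance, power) values

    // First state for this key is always stored.
    assert!(insert_pruned(&mut states, key, BTreeSet::from([0, 1, 2])));
    // Same number of domains: equivalent, so it is discarded.
    assert!(!insert_pruned(&mut states, key, BTreeSet::from([3, 4, 5])));
    // Fewer domains: replaces the old representative.
    assert!(insert_pruned(&mut states, key, BTreeSet::from([0, 1])));

    assert_eq!(states.len(), 1);
    println!("{:?}", states[&key]);
}
```

Because every key holds exactly one set, the number of stored states is bounded by the number of distinct `(performance, power)` pairs, regardless of how many symmetric domains produce the same pair.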
@BHLuotianyi -- Thanks for trying LAVD. Could you share a tarball under
Symptom: scx_lavd causes absurd memory usage and pins one core at 100% before RAM usage hits its ceiling. The system stutters a lot. The symptom is not observed with any other scx scheduler. According to #3340, the more cores/threads the CPU has, the higher the memory usage (~30GB observed; ~5GB on my setup). According to AI, RAM usage grows exponentially because of energy model generation, so the more cores the system has, the more RAM scx_lavd takes.
Thanks @BHLuotianyi for sharing the data. Could you share the processor model? If there is a machine that I can access, I will try it on my side too.
@multics69 If needed, I can provide my machine for testing over a VNC connection.
Thanks for the extra info. I will try to take a deeper look and come up with another solution (if necessary) this weekend.
Any updates on this?
Sorry, @xirreal! I haven't had time to work on this yet. I will find some time this week.


DISCLAIMER: This PR needs a thorough review, as I know nothing about the code the AI wrote! But it tests fine on my laptop.
I know AI may write trash, but this can at least provide some insight into the problem.
Regarding issue #3340
Description:
Problem
The original Energy Model (EM) initialization in scx_lavd used an exhaustive subset enumeration approach. On high-core-count systems, this triggered a $2^n$ complexity explosion, causing the scheduler to hang and consume excessive memory (RSS ballooning to several GiBs) during startup.
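To make the $2^n$ growth concrete, the short sketch below prints how many subsets an exhaustive enumeration would have to visit for a few values of `n`. Here `n` stands for whatever unit the enumeration ranges over; `subset_count` is a hypothetical helper for illustration and is not part of scx_lavd.

```rust
/// Number of subsets of an `n`-element set, i.e. 2^n.
/// Valid for n < 128 because the result is held in a u128.
fn subset_count(n: u32) -> u128 {
    1u128 << n
}

fn main() {
    for n in [8u32, 16, 32, 64] {
        println!("n = {n:>2} -> {} subsets", subset_count(n));
    }
}
```

Each additional element doubles the count, which is why a machine with only a few more cores can see a much larger memory footprint.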
Solution
I have implemented a two-stage optimization to resolve this:
1. The initialization logic was refactored from subset enumeration to a Dynamic Programming (DP) approach. Instead of expanding all possible combinations, it now directly considers CPU counts per performance domain and accumulates the lowest-power states for each performance level.
2. On symmetric systems (e.g., AMD Zen), many different CPU distributions yield exactly the same `(performance, power)` metrics. Storing all of these equivalent permutations would still lead to a state explosion. This fix implements strict pruning: for any given performance/power pair, the optimizer now retains only a single optimal representative. When multiple combinations are equivalent, it prioritizes the one using the fewest performance domains (`pd_id_set.len()`). This keeps the state table small while favoring configurations with better cache locality and reduced leakage power.
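The two stages above can be sketched together as a small DP over performance domains. This is a simplified illustration under stated assumptions, not the PR's implementation: the `Domain` cost model (identical CPUs plus a fixed cost once a domain is active), the `State` struct, and `build_table` are all hypothetical, and the table is keyed by performance alone rather than by the `(performance, power)` pair.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified performance domain: every CPU adds the same
/// performance and power, plus a fixed cost once the domain is active.
struct Domain {
    ncpus: u64,
    perf_per_cpu: u64,
    power_per_cpu: u64,
    static_power: u64,
}

/// Best known way to reach one performance level.
#[derive(Clone, Debug)]
struct State {
    power: u64,
    cpus_per_domain: Vec<u64>,
}

impl State {
    fn domains_used(&self) -> usize {
        self.cpus_per_domain.iter().filter(|&&n| n > 0).count()
    }
}

/// Process one domain at a time, choosing how many of its CPUs to use, and
/// keep a single state per performance level: lowest power first, then
/// fewest active domains as the tie-breaker.
fn build_table(domains: &[Domain]) -> HashMap<u64, State> {
    let mut table: HashMap<u64, State> = HashMap::new();
    table.insert(0, State { power: 0, cpus_per_domain: Vec::new() });

    for d in domains {
        let mut next: HashMap<u64, State> = HashMap::new();
        for (&perf, state) in &table {
            for k in 0..=d.ncpus {
                let extra = if k > 0 { d.static_power + k * d.power_per_cpu } else { 0 };
                let mut cand = state.clone();
                cand.power += extra;
                cand.cpus_per_domain.push(k);
                let new_perf = perf + k * d.perf_per_cpu;
                let better = next.get(&new_perf).map_or(true, |old| {
                    (cand.power, cand.domains_used()) < (old.power, old.domains_used())
                });
                if better {
                    next.insert(new_perf, cand);
                }
            }
        }
        table = next;
    }
    table
}

fn main() {
    // Four identical 8-CPU domains with made-up numbers.
    let domains: Vec<Domain> = (0..4)
        .map(|_| Domain { ncpus: 8, perf_per_cpu: 10, power_per_cpu: 3, static_power: 5 })
        .collect();
    let table = build_table(&domains);
    println!("{} states; perf 80 -> {:?}", table.len(), table[&80]);
}
```

In this toy model the table never holds more than one entry per reachable performance level, so its size grows with the total CPU count instead of with the number of CPU distributions.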
Results