diff --git a/docs/results/execution_sensitivity_llm/turnover_control.csv b/docs/results/execution_sensitivity_llm/turnover_control.csv
new file mode 100644
index 00000000..99e9053a
--- /dev/null
+++ b/docs/results/execution_sensitivity_llm/turnover_control.csv
@@ -0,0 +1,10 @@
+scenario,turnover_bin,bin_agents,bin_size,mean_turnover,within_bin_tau_e0_e1,full_leaderboard_tau
+calm,T1,mean-reversion;signal-weighted;deepseek:deepseek-v4-pro;poe:gemini-3.1-pro,4,10.5,1.0,0.758
+calm,T2,naive-momentum;poe:gpt-5.5;random;poe:glm-5,4,18.73,0.667,0.758
+calm,T3,poe:claude-opus-4.7;buy-and-hold;minimum-variance;risk-parity,4,20.25,0.333,0.758
+high_vol,T1,mean-reversion;signal-weighted;naive-momentum;deepseek:deepseek-v4-pro,4,11.53,0.667,0.242
+high_vol,T2,poe:glm-5;poe:gemini-3.1-pro;poe:gpt-5.5;poe:claude-opus-4.7,4,17.48,0.667,0.242
+high_vol,T3,random;minimum-variance;buy-and-hold;risk-parity,4,19.52,0.667,0.242
+jump_tail,T1,mean-reversion;signal-weighted;naive-momentum;deepseek:deepseek-v4-pro,4,11.43,0.667,0.455
+jump_tail,T2,poe:gemini-3.1-pro;poe:glm-5;random;poe:claude-opus-4.7,4,17.85,0.0,0.455
+jump_tail,T3,poe:gpt-5.5;minimum-variance;buy-and-hold;risk-parity,4,19.75,1.0,0.455
diff --git a/docs/results/execution_sensitivity_llm/turnover_control.md b/docs/results/execution_sensitivity_llm/turnover_control.md
new file mode 100644
index 00000000..12688d4e
--- /dev/null
+++ b/docs/results/execution_sensitivity_llm/turnover_control.md
@@ -0,0 +1,20 @@
+# Turnover-Controlled Ranking Stability
+
+If the E0->E1 reordering were purely a turnover effect, agents of
+similar turnover would not reorder. Within turnover terciles (binned by
+E1 turnover), the E0-vs-E1 Kendall tau remains low, so the reordering is
+not explained by turnover alone.
+
+| Regime | Turnover bin | Mean turnover | Within-bin tau (E0 vs E1) | Full-leaderboard tau |
+| --- | --- | ---: | ---: | ---: |
+| calm | T1 | 10.5 | 1.0 | 0.758 |
+| calm | T2 | 18.73 | 0.667 | 0.758 |
+| calm | T3 | 20.25 | 0.333 | 0.758 |
+| high_vol | T1 | 11.53 | 0.667 | 0.242 |
+| high_vol | T2 | 17.48 | 0.667 | 0.242 |
+| high_vol | T3 | 19.52 | 0.667 | 0.242 |
+| jump_tail | T1 | 11.43 | 0.667 | 0.455 |
+| jump_tail | T2 | 17.85 | 0.0 | 0.455 |
+| jump_tail | T3 | 19.75 | 1.0 | 0.455 |
+
+Mean within-bin tau across all regimes and terciles: **0.630** (vs. a full-leaderboard tau that is similarly low). The reordering persists within turnover strata.