Merged
10 changes: 10 additions & 0 deletions docs/results/execution_sensitivity_llm/turnover_control.csv
@@ -0,0 +1,10 @@
scenario,turnover_bin,bin_agents,bin_size,mean_turnover,within_bin_tau_e0_e1,full_leaderboard_tau
calm,T1,mean-reversion;signal-weighted;deepseek:deepseek-v4-pro;poe:gemini-3.1-pro,4,10.5,1.0,0.758
calm,T2,naive-momentum;poe:gpt-5.5;random;poe:glm-5,4,18.73,0.667,0.758
calm,T3,poe:claude-opus-4.7;buy-and-hold;minimum-variance;risk-parity,4,20.25,0.333,0.758
high_vol,T1,mean-reversion;signal-weighted;naive-momentum;deepseek:deepseek-v4-pro,4,11.53,0.667,0.242
high_vol,T2,poe:glm-5;poe:gemini-3.1-pro;poe:gpt-5.5;poe:claude-opus-4.7,4,17.48,0.667,0.242
high_vol,T3,random;minimum-variance;buy-and-hold;risk-parity,4,19.52,0.667,0.242
jump_tail,T1,mean-reversion;signal-weighted;naive-momentum;deepseek:deepseek-v4-pro,4,11.43,0.667,0.455
jump_tail,T2,poe:gemini-3.1-pro;poe:glm-5;random;poe:claude-opus-4.7,4,17.85,0.0,0.455
jump_tail,T3,poe:gpt-5.5;minimum-variance;buy-and-hold;risk-parity,4,19.75,1.0,0.455
20 changes: 20 additions & 0 deletions docs/results/execution_sensitivity_llm/turnover_control.md
@@ -0,0 +1,20 @@
# Turnover-Controlled Ranking Stability

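If the E0->E1 reordering were purely a turnover effect, agents of
similar turnover would not reorder, and every within-bin tau would be
1.0. Within turnover terciles (binned by E1 turnover, four agents per
bin), the E0-vs-E1 Kendall tau falls below 1.0 in seven of the nine
bins, so the reordering is not explained by turnover alone. With four
agents there are only six agent pairs per bin, so absent ties tau moves
in steps of 1/3 and a value of 0.667 corresponds to a single swapped
pair.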
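
The within-bin comparison described above can be sketched as follows. This is a minimal illustration, not the code that produced the results: the dictionary keys `turnover_e1`, `score_e0`, and `score_e1` are hypothetical names, and the tau is the plain tau-a variant with no tie correction.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a between two equal-length score lists (no tie correction)."""
    pairs = list(combinations(range(len(x)), 2))
    total = 0
    for i, j in pairs:
        prod = (x[i] - x[j]) * (y[i] - y[j])
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie
        total += (prod > 0) - (prod < 0)
    return total / len(pairs)


def within_bin_taus(agents, n_bins=3):
    """Sort agents by E1 turnover, split into n_bins groups, and return
    the E0-vs-E1 tau inside each group.

    `agents` is a list of dicts with the hypothetical keys
    'turnover_e1', 'score_e0', and 'score_e1'.
    """
    ordered = sorted(agents, key=lambda a: a["turnover_e1"])
    size = len(ordered) // n_bins
    taus = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any remainder when the split is uneven.
        end = (b + 1) * size if b < n_bins - 1 else len(ordered)
        chunk = ordered[start:end]
        taus.append(
            kendall_tau(
                [a["score_e0"] for a in chunk],
                [a["score_e1"] for a in chunk],
            )
        )
    return taus
```

With 12 agents and three bins, each bin holds four agents, which matches the bin size reported in the CSV.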

| Regime | Turnover bin | Mean turnover | Within-bin tau (E0 vs E1) | Full-leaderboard tau |
| --- | --- | ---: | ---: | ---: |
| calm | T1 | 10.5 | 1.0 | 0.758 |
| calm | T2 | 18.73 | 0.667 | 0.758 |
| calm | T3 | 20.25 | 0.333 | 0.758 |
| high_vol | T1 | 11.53 | 0.667 | 0.242 |
| high_vol | T2 | 17.48 | 0.667 | 0.242 |
| high_vol | T3 | 19.52 | 0.667 | 0.242 |
| jump_tail | T1 | 11.43 | 0.667 | 0.455 |
| jump_tail | T2 | 17.85 | 0.0 | 0.455 |
| jump_tail | T3 | 19.75 | 1.0 | 0.455 |

Mean within-bin tau across all regimes and terciles: **0.630** (for comparison, the full-leaderboard tau ranges from 0.242 to 0.758 across regimes). The reordering persists within turnover strata.
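
As a quick arithmetic check, the summary figure can be recomputed from the table. The sketch below simply averages the nine within-bin values; the per-regime average of the full-leaderboard tau is included only for comparison and is not a number reported in the original results.

```python
# Within-bin tau (E0 vs E1) per regime, in T1, T2, T3 order, copied from the table.
within_bin_tau = {
    "calm": [1.0, 0.667, 0.333],
    "high_vol": [0.667, 0.667, 0.667],
    "jump_tail": [0.667, 0.0, 1.0],
}

# Full-leaderboard tau per regime, copied from the table.
full_leaderboard_tau = {"calm": 0.758, "high_vol": 0.242, "jump_tail": 0.455}

all_bins = [tau for taus in within_bin_tau.values() for tau in taus]
mean_within = sum(all_bins) / len(all_bins)
mean_full = sum(full_leaderboard_tau.values()) / len(full_leaderboard_tau)

print(f"mean within-bin tau:       {mean_within:.3f}")
print(f"mean full-leaderboard tau: {mean_full:.3f}")
```

The first line reproduces the 0.630 quoted above.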