L4472: split Saturday Backtester SF state into 3 (simulate / predictor / optimizer)#351
Merged
Merged
Conversation
…r / optimizer) The single `Backtester` state ran simulate+param_sweep+predictor-backtest+Phase4 +optimizer/cov/gamma in ONE SSM command whose SUMMED runtime exceeded the SSM execution timeout on a fresh date (L4470) -> no backtester/parity/evaluator completion -> config auto-apply froze (L4473). Each phase re-runs the predictor pipeline (GBM inference + 10y ArcticDB read), so it was death-by-sum-of-runtime with no single per-phase cap tripped. Decompose the backtest stage by --mode into THREE sequential SF states, each with its own SSM execution timeout + independent redrive (mirrors the 2026-05-07/05-16 simulate/parity/evaluator split precedent): Backtester --mode=param-sweep --no-pit-parity (simulate+sweep) PredictorBacktest --mode=predictor-backtest (predictor+Phase4; pit_parity HERE) PortfolioOptimizerBacktest --mode=portfolio-optimizer-backtest --no-pit-parity Happy path: CheckSkipBacktester -> Backtester -> PredictorBacktest -> PortfolioOptimizerBacktest -> CheckSkipParity -> Parity -> CheckSkipEvaluator -> Evaluator. skip_backtester still skips the whole backtest-family (routes past CheckSkipParity to CheckSkipEvaluator). pit_parity fires exactly once (default-ON in PredictorBacktest; the other two pass --no-pit-parity). Each new state mirrors Backtester's Retry/Catch(HandleFailure)/Timeout posture with distinct ResultPaths; on failure the next state is NOT entered (anti-auto-promote-garbage). Tests: updated test_sf_backtest_parity_split_wiring (now pins the 3-way split + new TestL4472PhaseSplit class), test_sf_eval_judge_wiring (walk the new chain), test_sf_payload_uniqueness (spot-state count 8->10), test_sf_friday_shell_run_wiring (_SPOT_STATES registry + main-thread trace); regenerated the sf_prekeystone_spot_commands.json baseline for the deliberate spot-command change. Full SF suite + full data suite green (1731 passed). Pairs with alpha-engine-backtester executor-apply mode-gate fix (the correctness piece) + alpha-engine-config assembler cutover. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Context — ROADMAP L4472 (⏰ soft deadline: 2026-06-06 09:00 UTC Saturday SF)
The single Saturday
Backtesterstate ran simulate+param_sweep+predictor-backtest+Phase4+optimizer/cov/gamma in one SSM command whose summed runtime exceeded the SSM execution timeout on a fresh date (L4470) → no backtester/parity/evaluator completion → config auto-apply froze (L4473). Each post-sweep phase re-runs the predictor pipeline (GBM inference + 10y ArcticDB read), so it was death-by-sum-of-runtime with no single per-phase cap tripped. The 06-06 run repeats this failure unless this lands first.The split
Decompose the backtest stage by
--modeinto three sequential SF states, each with its own SSM execution timeout + independent redrive (mirrors the 2026-05-07/05-16 simulate/parity/evaluator split precedent):Backtester--mode=param-sweep --no-pit-parityPredictorBacktest(new)--mode=predictor-backtestPortfolioOptimizerBacktest(new)--mode=portfolio-optimizer-backtest --no-pit-parityHappy path:
CheckSkipBacktester → Backtester → PredictorBacktest → PortfolioOptimizerBacktest → CheckSkipParity → Parity → CheckSkipEvaluator → Evaluator.skip_backtesterstill skips the whole backtest-family (routes pastCheckSkipParitytoCheckSkipEvaluator) — unchanged.pit_parityfires exactly once (default-ON inPredictorBacktest; the other two pass--no-pit-parity).Backtester's Retry / Catch(HandleFailure) / Timeout posture, distinct ResultPaths; on failure the next state is not entered (anti-auto-promote-garbage). Decouples Backtester→Parity sopit_parity.jsonfinally lands.spot_backtest.shalready threads--mode/--no-pit-parity/--skip-stages.Safety verified before wiring
--modeis self-contained (re-reads from S3/ArcticDB; no in-memory handoff between phases) → 3 sequential processes == 1--mode=allprocess for artifacts.mode=="all").Testing
test_sf_backtest_parity_split_wiring(3-way split + newTestL4472PhaseSplit),test_sf_eval_judge_wiring,test_sf_payload_uniqueness(spot-state count 8→10),test_sf_friday_shell_run_wiring(_SPOT_STATES+ main-thread trace).tests/fixtures/sf_prekeystone_spot_commands.json(documented workflow for a deliberate spot-command change).Deploy
SF deploy is a manual operator step (
infrastructure/deploy_step_function.sh→aws stepfunctions update-state-machine). Must be deployed before the 06-06 09:00 UTC Saturday cron.Companion PRs
alpha-engine-backtester#269: executor-apply mode-gate + simulation cap (correctness).alpha-engine-config: enableassembler.cutover_enabled+ ROADMAP follow-ups.🤖 Generated with Claude Code