feat(backtester): spot_backtest.sh --preflight-only (Friday shell-run dry path, smoke-only, no auto-apply) #224
Merged
… dry path, smoke-only, no auto-apply)

Owed-item #3 of ROADMAP "Friday shell-run — per-module dry-path activation" (P1).

Under the Friday shell_run, the Backtester / Parity / Evaluator spot state now boots the spot for real, installs deps, runs the EXISTING bootstrap-class smoke harness, then exits 0, with NO param sweep, NO portfolio sim, NO parity, NO evaluator, NO config/*.json auto-apply, ZERO external API calls, ZERO S3/config writes. This catches bootstrap-class breakage (lib-pin drift, sys.path collision, stale ArcticDB universe, missing predictor weights, SSM timeout, image gap) ~12h before the real Saturday Backtester.

Flag name (verbatim, for the SF keystone follow-on): --preflight-only

Identical to the data (spot_data_weekly.sh #259) and predictor (spot_train.sh #175) siblings: the Friday shell_run SF keystone dispatches the same flag name to every module. PREFLIGHT_ONLY is a MODIFIER (default 0, orthogonal to RUN_MODE), mirroring the data sibling, so an unflagged Saturday SF run is completely unaffected.

Insertion point

New `if [ "$PREFLIGHT_ONLY" = "1" ]` branch placed AFTER the real boot/clone/deps/config-upload + ENV_SOURCE/REMOTE_PYTHON resolution (so the bootstrap path is genuinely exercised) and STRICTLY BEFORE both the `if [ "$RUN_MODE" = "smoke-only" ]` block and the `# ── Full backtest` heredoc. It runs one run_remote heredoc (`backtest.py --mode=smoke`) then `exit 0`.

Reuses the existing --mode=smoke?

Yes. backtest.py:4180-4184: `--mode=smoke` runs BacktesterPreflight (lib-version / imports / predictor-weights presence / executor-config validation, PRs #43-#48, 2026-04-22) + _runtime_smoke (universe symbols + per-ticker ArcticDB read + recent signals.json load + Layer-1A GBM load/predict; all S3 *reads*, ~30-60s) then `return`s BEFORE _init_pipeline / the simulation / the optimizer. So --mode=smoke itself performs zero config writes and makes no external API (yfinance / Anthropic) data fetch. No parallel preflight was built.

No-sweep / no-parity / no-auto-apply proof (statically unreachable under the flag):

* The per-phase smoke loop (smoke-simulate / smoke-param-sweep / smoke-predictor-backtest / smoke-phase4 / smoke-predictor-param-sweep) AND the `evaluate.py --mode diagnostics` S3-probe block both live inside the `if [ "$RUN_MODE" = "smoke-only" ]` body, past the `exit 0`.
* The full-backtest heredoc (backtest stage → pit_parity → parity → evaluator) and its config/{executor,scoring,predictor,research,scanner}_params*.json optimizer auto-apply (`evaluate.py --upload`, non-frozen) live further below, also past the `exit 0`.
* The CloudWatch heartbeats, parity_report.json / parity_metrics.csv upload, and reporter S3 upload are all past the `exit 0`.

The preflight heredoc body references ONLY `backtest.py --mode=smoke` (no --upload, no full mode, no evaluate.py, no aws s3 cp/sync, no put-metric-data). The `trap cleanup EXIT` still fires and terminates the spot.

Tests: new tests/test_spot_backtest_preflight_only.py (7 tests, static-analysis style mirroring test_spot_backtest_aws_region.py): flag parses, MODIFIER default 0, dedicated branch exits 0, branch precedes the smoke-only block + full-backtest heredoc, body invokes ONLY --mode=smoke with zero sweep/sim/parity/evaluator/--upload/auto-apply tokens, ENV_SOURCE preserved (#247 AWS_REGION guard).

Full suite: 1665 passed, 5 skipped, 1 deselected. bash -n clean. No new deps, no secrets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
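The control flow described under "Insertion point" can be sketched roughly as follows. This is a minimal illustration, not the real `spot_backtest.sh`: the `run_remote` and `cleanup` bodies are stubs, and the `spot_backtest` wrapper function, the `--smoke-only` flag spelling, the `full` default for `RUN_MODE`, and the bare `python` interpreter name are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Illustrative stand-in for the described control flow; NOT the real script.

run_remote() {
  # Stub: the real helper presumably runs the heredoc on the spot.
  # Here it only echoes what would be executed remotely.
  sed 's/^/[remote] /'
}

cleanup() {
  # Stub for the real cleanup, which terminates the spot instance.
  echo "[cleanup] terminate spot"
}

# Parenthesised body: runs in a subshell, so `exit` and the EXIT trap
# stay local to a single call of this (hypothetical) wrapper.
spot_backtest() (
  RUN_MODE="full"      # assumed default; only "smoke-only" is named in the PR
  PREFLIGHT_ONLY=0     # modifier, default 0, independent of RUN_MODE
  for arg in "$@"; do
    case "$arg" in
      --preflight-only) PREFLIGHT_ONLY=1 ;;
      --smoke-only)     RUN_MODE="smoke-only" ;;   # assumed flag spelling
    esac
  done

  trap cleanup EXIT
  echo "[boot] clone, deps, config upload, ENV_SOURCE/REMOTE_PYTHON"

  # Preflight branch: after the real boot path, before everything else.
  if [ "$PREFLIGHT_ONLY" = "1" ]; then
    run_remote <<'EOF'
python backtest.py --mode=smoke
EOF
    exit 0
  fi

  if [ "$RUN_MODE" = "smoke-only" ]; then
    echo "[smoke-only] per-phase smoke loop + diagnostics probe"
    exit 0
  fi

  echo "[full] backtest, pit_parity, parity, evaluator"
)
```

Calling `spot_backtest --preflight-only` in this sketch prints the boot line, the echoed remote command, and the cleanup line, and never reaches the smoke-only or full-backtest blocks. Calling it with no flags falls through to the full-backtest line, which mirrors the claim that an unflagged run is unaffected.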
Owed-item #3 of ROADMAP "Friday shell-run — per-module dry-path activation" (P1).

Under the Friday `shell_run`, the Backtester / Parity / Evaluator spot state now boots the spot for real, installs deps, runs the existing bootstrap-class smoke harness, then `exit 0`, with NO param sweep, NO portfolio sim, NO parity, NO evaluator, NO `config/*.json` auto-apply, ZERO external API calls, ZERO S3/config writes. This catches bootstrap-class breakage (lib-pin drift, sys.path collision, stale ArcticDB universe, missing predictor weights, SSM timeout, image gap) ~12h before the real Saturday Backtester.

Flag name (verbatim, SF keystone follow-on)

`--preflight-only`, identical to the data (`spot_data_weekly.sh` #259) and predictor (`spot_train.sh` #175) siblings. `PREFLIGHT_ONLY` is a MODIFIER (default `0`, orthogonal to `RUN_MODE`), mirroring the data sibling, so an unflagged Saturday SF run is completely unaffected.

Insertion point

New `if [ "$PREFLIGHT_ONLY" = "1" ]` branch placed AFTER the real boot/clone/deps/config-upload + `ENV_SOURCE`/`REMOTE_PYTHON` resolution (so the bootstrap path is genuinely exercised) and STRICTLY BEFORE both the `if [ "$RUN_MODE" = "smoke-only" ]` block and the `# ── Full backtest` heredoc. It runs one `run_remote` heredoc (`backtest.py --mode=smoke`) then `exit 0`.

Reuses the existing `--mode=smoke`?

Yes. Per `backtest.py:4180-4184`, `--mode=smoke` runs `BacktesterPreflight` (lib-version / imports / predictor-weights presence / executor-config validation; PRs #43-#48, 2026-04-22) + `_runtime_smoke` (universe symbols + per-ticker ArcticDB read + recent signals.json load + Layer-1A GBM load/predict; all S3 reads, ~30-60s), then `return`s BEFORE `_init_pipeline` / the simulation / the optimizer. So `--mode=smoke` itself performs zero config writes and makes no external API (`yfinance`/Anthropic) data fetch. No parallel preflight was built.

No-sweep / no-parity / no-auto-apply proof (statically unreachable under the flag)

* The per-phase smoke loop (`smoke-simulate` / `smoke-param-sweep` / `smoke-predictor-backtest` / `smoke-phase4` / `smoke-predictor-param-sweep`) AND the `evaluate.py --mode diagnostics` S3-probe block both live inside the `if [ "$RUN_MODE" = "smoke-only" ]` body, past the `exit 0`.
* The full-backtest heredoc (backtest stage → pit_parity → parity → evaluator) and its `config/{executor,scoring,predictor,research,scanner}_params*.json` optimizer auto-apply (`evaluate.py --upload`, non-frozen) live further below, also past the `exit 0`.
* The CloudWatch heartbeats, `parity_report.json` / `parity_metrics.csv` upload, and reporter S3 upload are all past the `exit 0`.
* The preflight heredoc body references ONLY `backtest.py --mode=smoke` (no `--upload`, no full mode, no `evaluate.py`, no `aws s3 cp/sync`, no `put-metric-data`).

The `trap cleanup EXIT` still fires and terminates the spot.

Tests

New `tests/test_spot_backtest_preflight_only.py` (7 tests, static-analysis style mirroring `test_spot_backtest_aws_region.py`): flag parses, MODIFIER default `0`, dedicated branch exits 0, branch precedes the smoke-only block + full-backtest heredoc, body invokes ONLY `--mode=smoke` with zero sweep/sim/parity/evaluator/`--upload`/auto-apply tokens, `ENV_SOURCE` preserved (#247 AWS_REGION guard). `bash -n` clean.

🤖 Generated with Claude Code
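The static-analysis style of check described under Tests can be approximated in shell as below. This is a hedged sketch only: the PR's actual tests live in a Python file whose contents are not shown here, so the miniature script layout, the `line_of` and `check_layout` helpers, and the exact forbidden-token list are assumptions chosen to mirror the claims in the description.

```shell
#!/usr/bin/env bash
# Hypothetical miniature of the script layout; the real file is much longer.
script="$(mktemp)"
cat >"$script" <<'SCRIPT'
PREFLIGHT_ONLY=0
if [ "$PREFLIGHT_ONLY" = "1" ]; then
  run_remote <<'EOF'
python backtest.py --mode=smoke
EOF
  exit 0
fi
if [ "$RUN_MODE" = "smoke-only" ]; then
  echo "per-phase smoke loop"
fi
# ── Full backtest
SCRIPT

# Line number of the first line containing a fixed string.
line_of() { grep -n -F -- "$1" "$script" | head -n 1 | cut -d: -f1; }

preflight_line="$(line_of 'if [ "$PREFLIGHT_ONLY" = "1" ]')"
smoke_line="$(line_of 'if [ "$RUN_MODE" = "smoke-only" ]')"
full_line="$(line_of '# ── Full backtest')"

# Branch body: from the preflight `if` through the first `exit 0` after it.
branch_body="$(sed -n "${preflight_line},\$p" "$script" | sed '/exit 0/q')"
rm -f "$script"

check_layout() {
  # The preflight branch must precede both later blocks.
  [ "$preflight_line" -lt "$smoke_line" ] || return 1
  [ "$preflight_line" -lt "$full_line" ] || return 1
  # The body must invoke the smoke mode...
  printf '%s\n' "$branch_body" | grep -q -F -- '--mode=smoke' || return 1
  # ...and must not mention any write/upload tokens (assumed list).
  if printf '%s\n' "$branch_body" |
     grep -q -E -- '--upload|evaluate\.py|aws s3|put-metric-data'; then
    return 1
  fi
  return 0
}
```

`check_layout` returns 0 for this layout. Moving the preflight branch below the smoke-only block, or adding an `aws s3 cp` line inside its heredoc, would make it return 1, which is the kind of regression the described tests are meant to catch without booting a spot.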