feat(tech-ablation): two-flag auto-apply path with 4-week reproduction gate (ROADMAP L2553) by cipher813 · Pull Request #229 · cipher813/alpha-engine-backtester

cipher813 · 2026-05-19T23:22:35Z

ROADMAP: L2553 — Tech weight ablation auto-apply — parallel-observation cutover (P1)

Cutover follow-on to alpha-engine-backtester#174 (tech weight ablation recommendation-only). The compute step is unchanged; this PR wires the gated write path that L2553 specifies.

Activation contract

Both flags default false → bit-identical behavior to today.

| Flag (under `tech_weight_ablation:` in backtester config.yaml) | What fires | Live config |
| --- | --- | --- |
| Both false | nothing | unchanged |
| `use_tech_ablation_target=True` (Stage 1) | shadow archive written every "ok" week to `config/scoring_weights_per_sector_shadow_history/{run_id}.json` + `latest.json` sidecar | unchanged |
| `use_tech_ablation_target=True` AND `enforce_tech_ablation=True` (Stage 2) | shadow archive + live write to `config/scoring_weights_per_sector.json` iff reproduction gate passes | written when same per_sector payload reproduces across last 4 shadow archives |
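
As a rough illustration of the activation contract, the two flags can be read as a three-state switch. The sketch below is not the PR's code: `Stage` and `resolve_stage` are hypothetical names, and treating `enforce_tech_ablation=True` without `use_tech_ablation_target=True` as "off" is an assumption, since the contract above does not spell out that combination.

```python
from enum import Enum


class Stage(Enum):
    OFF = "off"          # no writes at all
    SHADOW = "shadow"    # Stage 1: shadow archive only
    ENFORCE = "enforce"  # Stage 2: shadow archive + gated live write


def resolve_stage(config: dict) -> Stage:
    """Map the two flags to a stage; missing keys default to False."""
    section = config.get("tech_weight_ablation") or {}
    use_target = bool(section.get("use_tech_ablation_target", False))
    enforce = bool(section.get("enforce_tech_ablation", False))
    if not use_target:
        # Assumption: enforce without use_target stays inert.
        return Stage.OFF
    return Stage.ENFORCE if enforce else Stage.SHADOW
```

With an empty config this resolves to `Stage.OFF`, which matches the "both flags default false" row.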

Reproduction gate

_MIN_CONSECUTIVE_WEEKS = 4. The gate requires a byte-equal per_sector match across the last 4 shadow archives (the L2553 "4+ consecutive Saturdays" acceptance). One drift week breaks the streak: the gate explicitly does NOT tolerate intermittent shadow drift, matching the "recommendation must reproduce" framing.
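
A minimal sketch of how such a gate could be checked, assuming archives arrive newest first as dicts with a `per_sector` key. The function and helper names are hypothetical, and "byte-equal" is approximated here by comparing canonical JSON encodings; the real `_check_reproduction_gate()` may compare differently.

```python
import json

MIN_CONSECUTIVE_WEEKS = 4  # mirrors _MIN_CONSECUTIVE_WEEKS from the PR text


def canonical(per_sector: dict) -> bytes:
    """Deterministic encoding so equal payloads compare byte-for-byte."""
    return json.dumps(per_sector, sort_keys=True, separators=(",", ":")).encode("utf-8")


def check_reproduction_gate(archives: list, window: int = MIN_CONSECUTIVE_WEEKS) -> dict:
    """archives: newest first; each item is a dict with a "per_sector" key."""
    if not archives:
        return {"passed": False, "reason": "no shadow archives", "n_consecutive": 0}
    reference = canonical(archives[0]["per_sector"])
    n_consecutive = 0
    for archive in archives[:window]:
        if canonical(archive["per_sector"]) != reference:
            break  # one drift week ends the streak
        n_consecutive += 1
    if n_consecutive >= window:
        return {"passed": True, "reason": "reproduced", "n_consecutive": n_consecutive}
    if n_consecutive == len(archives):
        reason = "insufficient history"
    else:
        reason = "drift in window"
    return {"passed": False, "reason": reason, "n_consecutive": n_consecutive}
```

The `reason` strings are placeholders; only the three keys `passed`, `reason`, and `n_consecutive` come from the PR description.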

Implementation

optimizer/tech_weight_ablation.py:
- init_config() + module-level _cfg (mirrors executor_optimizer).
- _build_per_sector_payload() — recommendations: {team_id -> config_name} → {team_id -> {weight_name -> value}} via DEFAULT_GRID lookup. Unknown config_name drops cleanly (forward-compat guard).
- _read_recent_shadow_archives() — lex-sorts YYMMDDHHMM keys descending; missing/corrupt archives skip with a WARNING (treated as "reproduction not yet reached", not a hard fail).
- _check_reproduction_gate() — returns {passed, reason, n_consecutive}.
- apply() — orchestrates the two stages.
evaluate.py:
- tech_weight_ablation.init_config(config) added alongside the other optimizers.
- New _run_tech_weight_ablation() helper chains compute + apply (with a --freeze short-circuit), replacing the prior inline lambda; mirrors _run_executor_opt / _run_weight_opt.
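
The payload translation and the archive-key ordering described above might look roughly like the following. This is a sketch under assumptions: `EXAMPLE_GRID` and its weight names are invented stand-ins for the real DEFAULT_GRID (which the PR does not show), and `newest_run_ids` is a hypothetical helper that only illustrates why lexical sorting works for fixed-width YYMMDDHHMM keys.

```python
# Hypothetical grid: config_name -> {weight_name -> value}.
# The real DEFAULT_GRID contents are not part of the PR description.
EXAMPLE_GRID = {
    "baseline": {"momentum": 0.5, "trend": 0.5},
    "momentum_heavy": {"momentum": 0.7, "trend": 0.3},
}


def build_per_sector_payload(recommendations: dict, grid: dict = EXAMPLE_GRID) -> dict:
    """Translate {team_id -> config_name} into {team_id -> {weight_name -> value}}."""
    payload = {}
    for team_id, config_name in recommendations.items():
        weights = grid.get(config_name)
        if weights is None:
            continue  # unknown config_name: drop the team instead of failing
        payload[team_id] = dict(weights)  # copy so callers cannot mutate the grid
    return payload


def newest_run_ids(keys: list, limit: int = 4) -> list:
    """Return the newest run ids; fixed-width YYMMDDHHMM sorts chronologically as text."""
    run_ids = [key.rsplit("/", 1)[-1].removesuffix(".json") for key in keys]
    run_ids = [r for r in run_ids if len(r) == 10 and r.isdigit()]  # skips latest.json
    return sorted(run_ids, reverse=True)[:limit]
```

Dropping unknown names rather than raising keeps an older reader working if a newer grid entry shows up in a recommendation, which is the forward-compat behavior the list above describes.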

Tests (+12 new, suite 1694 → 1706 green)

TestBuildPerSectorPayload × 3 — name→weights mapping including the unknown-name drop + empty-recommendations.
TestApplyShadowGating × 4 — flag-off skips, status-not-ok skips, empty recommendations skips, shadow-only mode writes archive but NOT live.
TestReproductionGate × 3 — insufficient history fails, 4-in-a-row matches pass, 1 drift breaks the streak.
TestApplyLiveGating × 3 — enforce+insufficient history → shadow only; enforce+full reproduction → live written; enforce+drift in history → blocked + live untouched.
One pre-existing test (test_recommendation_only_no_apply) reframed: the old apply_note string was the artifact this PR closes; new note points readers at apply() + the two flags.
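
The scenarios these tests cover can be traced with a small in-memory sketch of the two-stage flow. It is illustrative only: `apply_sketch`, its signature, the outcome dict, and the plain `dict` standing in for S3 are all hypothetical, and it assumes the current run's shadow archive counts toward the 4-archive window.

```python
import json

SHADOW_PREFIX = "config/scoring_weights_per_sector_shadow_history/"
LIVE_KEY = "config/scoring_weights_per_sector.json"
MIN_CONSECUTIVE_WEEKS = 4


def apply_sketch(status, per_sector, flags, store, run_id):
    """store is a dict standing in for S3 (key -> JSON string)."""
    outcome = {"shadow_written": False, "live_written": False}
    if not flags.get("use_tech_ablation_target"):
        return outcome  # flag off: nothing fires
    if status != "ok" or not per_sector:
        return outcome  # e.g. insufficient_data or empty recommendations

    # Stage 1: archive the payload plus a latest.json sidecar.
    body = json.dumps({"run_id": run_id, "per_sector": per_sector}, sort_keys=True)
    store[SHADOW_PREFIX + run_id + ".json"] = body
    store[SHADOW_PREFIX + "latest.json"] = body
    outcome["shadow_written"] = True

    if not flags.get("enforce_tech_ablation"):
        return outcome  # shadow-only mode never touches the live key

    # Stage 2: live write only if the newest archives all carry the same payload.
    history_keys = sorted(
        (k for k in store if k.startswith(SHADOW_PREFIX) and not k.endswith("/latest.json")),
        reverse=True,
    )[:MIN_CONSECUTIVE_WEEKS]
    payloads = {
        json.dumps(json.loads(store[k])["per_sector"], sort_keys=True) for k in history_keys
    }
    if len(history_keys) == MIN_CONSECUTIVE_WEEKS and len(payloads) == 1:
        store[LIVE_KEY] = json.dumps(per_sector, sort_keys=True)
        outcome["live_written"] = True
    return outcome
```

Calling this four times with the same payload and both flags set writes the live key only on the fourth call; a fifth call with a different payload archives the drift and leaves the live key as it was.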

S3 contract

Both keys are new additions — config/scoring_weights_per_sector.json and config/scoring_weights_per_sector_shadow_history/. No consumer reads them today, so this PR is forward-compatible on its own.

The research-side consumer (per-sector composite_weights override layer on top of scoring.yaml) is operator-owned: alpha-engine-research/config.py is gitignored per repo policy, and the per-sector schema is already in scoring.yaml (see _load_live_composite_weights_per_sector in this same module). The override-read can land when Brian flips use_tech_ablation_target=True and decides to wire the consumer.

Until then, the writes are pure observation — exactly the L2553 design intent for the recommendation-stability soak.
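
For orientation, a per-sector override layer on the consumer side could be as small as the sketch below. Nothing here exists in this PR or is confirmed for alpha-engine-research: `resolve_composite_weights`, the weight names, and the merge rule (override keys win, other base keys pass through, no renormalization) are assumptions for illustration.

```python
def resolve_composite_weights(base_weights: dict, per_sector: dict, team_id: str) -> dict:
    """Overlay a team's per-sector weights on the base weights.

    base_weights: weights as they would come from scoring.yaml (assumed shape).
    per_sector:   {team_id -> {weight_name -> value}}, the payload shape this PR writes.
    """
    resolved = dict(base_weights)                  # never mutate the base config
    resolved.update(per_sector.get(team_id, {}))   # teams without an override keep the base
    return resolved
```

A team absent from the per-sector payload simply gets the base weights back, so an empty or missing live file would leave scoring unchanged under this design.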

Data gate (not a code gate)

The compute step still emits insufficient_data until ~30 sub-score-populated rows/team accumulate; apply() correctly no-ops on that path. First non-empty recommendation is expected ~2026-06-27 per the research v15-migration accumulation timeline.

Composes with

alpha-engine-backtester#174 (recommendation-only, this PR's prerequisite).
alpha-engine-research v15 sub-score persistence (data gate).
executor_optimizer.apply() (the pattern this PR mirrors).

🤖 Generated with Claude Code

…n gate (ROADMAP L2553)

Wires the auto-apply cutover follow-on to alpha-engine-backtester#174 (tech weight ablation recommendation-only). The compute step is unchanged; this PR ships the gated write path.

Activation
- Both flags default false → bit-identical behavior to today.
- `tech_weight_ablation.use_tech_ablation_target=True` (Stage 1): every "ok" compute result writes a shadow payload to `config/scoring_weights_per_sector_shadow_history/{run_id}.json` + `latest.json` sidecar. Live config untouched. Pure observability.
- `tech_weight_ablation.enforce_tech_ablation=True` (Stage 2): live write to `config/scoring_weights_per_sector.json` fires ONLY when the reproduction gate passes: the same per-sector payload must reproduce across the last `_MIN_CONSECUTIVE_WEEKS = 4` shadow archives (the L2553 "4+ consecutive Saturdays" acceptance). One drift week breaks the streak; the gate is explicitly NOT tolerant of intermittent shadow drift.

Implementation
- `optimizer/tech_weight_ablation.py`:
  - `init_config()` + module-level `_cfg` (mirrors executor_optimizer).
  - `_build_per_sector_payload()` translates `recommendations: {team_id -> config_name}` to `{team_id -> {weight_name -> value}}` via DEFAULT_GRID lookup; unknown config_name drops cleanly.
  - `_read_recent_shadow_archives()` lex-sorts YYMMDDHHMM keys descending; missing/corrupt archives skip with a warning (treated as "reproduction not yet reached", not a hard fail).
  - `_check_reproduction_gate()` returns {passed, reason, n_consecutive} with byte-equal per_sector match across the window.
  - `apply()` orchestrates the two stages; mirrors `executor_optimizer.apply()` so the evaluator wiring stays uniform.
- `evaluate.py`:
  - `tech_weight_ablation.init_config(config)` added alongside the other optimizers.
  - New `_run_tech_weight_ablation()` helper chains compute + apply (with a `--freeze` short-circuit), replacing the prior inline lambda.

Tests (+12 new, suite 1694 → 1706 green)
- `TestBuildPerSectorPayload` × 3: name→weights mapping incl. unknown-name fallthrough + empty-recommendations.
- `TestApplyShadowGating` × 4: flag-off skips, status-not-ok skips, empty recommendations skips, shadow-only mode writes archive but NOT live key.
- `TestReproductionGate` × 3: insufficient history (0 archives), exact 4-in-a-row match passes, 1 drift breaks the streak.
- `TestApplyLiveGating` × 3: enforce+insufficient history → shadow only, enforce+full reproduction → live written, enforce+drift in history → blocked + live untouched.
- One pre-existing test (`test_recommendation_only_no_apply`) updated: the old `apply_note` string "recommendation-only — auto-apply gated on parallel observation cutover (follow-up PR)" was the artifact this PR closes; new note points readers at `apply()` + the two flags.

S3 contract
- Both keys are NEW additions: no consumer reads them today, so no backward-compat concern. The research-side consumer (per-sector composite_weights override read) is operator-owned (`alpha-engine-research/config.py` is gitignored) and will land when Brian flips `use_tech_ablation_target` and decides to wire the override. Without that, the writes are observation-only.

Data gate (not a code gate)
- The compute step still emits `insufficient_data` until ~30 sub-score-populated rows/team accumulate; `apply()` correctly no-ops on that. First non-empty recommendation is expected ~2026-06-27 per the v15-migration accumulation timeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cipher813 merged commit 5238119 into main May 19, 2026
1 check passed

cipher813 deleted the feat/tech-weight-ablation-apply branch May 19, 2026 23:51

cipher813 merged 1 commit into main from feat/tech-weight-ablation-apply
