W2 (L4469): residual/idiosyncratic-momentum L1 (OBSERVE-mode, not auto-promoted) #216
Merged
… - L4469

W2 of the predictor-improvement arc. Revives the dead raw-momentum L1 (WF median IC ~= -0.001) with residual/idiosyncratic momentum + vol scaling (Blitz/Hanauer) plus a transparent price-trend decomposition. This commit ships the self-contained pieces, no trainer wiring:

- model/residual_momentum_scorer.py: deterministic transparent scorer (mirrors momentum_scorer.py; not a GBM, since the repo already paid for that lesson). predict_array (train) + predict_dict (inference parity, dormant).
- data/residual_momentum_features.py: residual-momentum feature construction from raw close + benchmark series the trainer already loads in-process. Net-new single-source RESEARCH features (no feature-store column recomputed), so the "data module owns features" drift invariant is preserved; feature-store home deferred until the signal validates (mirrors _FUNDAMENTAL_EXCLUDE). Strict backward-only / point-in-time construction (beta .shift(1), 12-1 skip-month). W2.3 factor momentum deferred.
- config.py + predictor.sample.yaml + predictor.yaml: residual_momentum section; RESIDUAL_MOMENTUM_ENABLED defaults False (the observe gate).
- tests: 18 passing. Scorer contract/NaN/determinism/dict-parity + feature known-beta recovery, no-look-ahead, skip-month, front-of-history NaN.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e gate (W2) - L4469

Integrates the W2 residual-momentum L1 into meta_trainer.py, OBSERVE-mode:

- Per-ticker residual-momentum features computed in the streaming loop from the raw close + benchmark series already loaded in-process; built into X_resid_mom through the same concat → sort → rank-norm pipeline as X_mom/X_vol.
- Deterministic scorer evaluated per fold (resid_mom_fold_ics); standalone leak-free read = cross-sectional rank IC (Fama-MacBeth) + Sortino/DSR over the canonical-finite rows, which is the isolated promotion-gate number. Runs regardless of the flag; observe-only, wrapped in try/except, never fails training.
- OBSERVE GATE: build_train_meta_features(enabled) appends residual_momentum_score to the L2 feature list ONLY when cfg.RESIDUAL_MOMENTUM_ENABLED. Flag False ⇒ TRAIN_META_FEATURES == META_FEATURES ⇒ meta_X, persisted meta_model._feature_names, and the inference path are byte-identical. The META_FEATURES module constant is never mutated.
- passes_wf is UNCHANGED (vol_median_ic > 0): residual momentum enters NO promotion gate. Manifest gains residual_momentum_median_ic + residual_momentum_leakfree_oos_ic (additive per S3 contract).

Tests: gate-invariant suite (byte-identical when false, +1 col when true, constant never mutated) + feature-list source-contract update. Full suite 1299 → 1326 passing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
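The "cross-sectional rank IC (Fama-MacBeth)" read named in this commit can be illustrated with a small stdlib-only sketch. It is not the trainer's code: the function names and the `panel` layout (`{date: [(score, forward_return), ...]}`) are assumptions made for illustration. The idea is to compute one Spearman rank correlation per date across tickers, then average the per-date values.

```python
import math
from statistics import mean, stdev


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation, or None when either side has zero variance."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def _finite(v):
    return v is not None and math.isfinite(v)


def cross_sectional_rank_ic(panel, min_names=3):
    """Hypothetical helper: per-date Spearman IC, then the mean across dates."""
    ics = []
    for date in sorted(panel):
        rows = [(s, r) for s, r in panel[date] if _finite(s) and _finite(r)]
        if len(rows) < min_names:
            continue  # too few tickers for a meaningful cross-section
        ic = pearson(average_ranks([s for s, _ in rows]),
                     average_ranks([r for _, r in rows]))
        if ic is not None:
            ics.append(ic)
    if not ics:
        return None
    spread = stdev(ics) if len(ics) > 1 else 0.0
    t_stat = mean(ics) / (spread / math.sqrt(len(ics))) if spread > 0 else None
    return {"mean_ic": mean(ics), "t_stat": t_stat, "n_dates": len(ics)}
```

Averaging per-date correlations, rather than pooling all rows into one correlation, keeps a date with many tickers from dominating the estimate.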
…VIEW - L4469

Adds the residual/idiosyncratic-momentum L1 (gated off behind RESIDUAL_MOMENTUM_ENABLED) to the component map. The W2.3 deferral, deferred feature-store home, and promotion criteria are documented in the module docstrings + the arc plan doc (alpha-engine-docs/private/, local).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request Jun 1, 2026
…to main)

The private/ / .claude/ / *.pkl / meta_* / tmp* / *.parquet / design-doc ignore rules added in #216 did not survive into main (last .gitignore change on main is unrelated), so `git add -A` was dangerous again and machine-generated artifacts showed untracked. Restoring so the public repo can't be polluted by a stray `git add -A`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request Jun 1, 2026
* feat(predictor): W3.2 leak-free per-horizon IC curve (observe) - L4469

Answers the operator's 5/21/60/90d question with the HONEST curve. The existing `horizon_diagnostic.curve` is pooled overlapping Spearman + a non-overlapping subsample (autocorrelation-inflated). W3.2 runs the SAME W1 purged+embargoed expanding walk-forward per horizon (refit the L2 per fold on each horizon's realized label, purge = h days) and reports the leak-free cross-sectional rank IC + downside/DSR legs at each of _DIAGNOSTIC_HORIZONS (5/10/15/21/40/60/90, already present since 2026-04-15).

- training/leakfree_meta_ic.py: new `leakfree_horizon_ic_curve()` helper (reuses leakfree_meta_oos_ic; per-horizon FINITE-MASK, because the W1 function passes y straight into the L2 fit, which rejects NaN, and longer-horizon labels carry tail-of-history NaN).
- training/meta_trainer.py: build per-horizon labels over the canonical-finite rows, call the helper, add downside/DSR legs + the argmax-peak log line; emit to manifest `horizon_diagnostic.curve_leakfree` (additive per S3 contract).

OBSERVE ONLY: gates nothing; the canonical 21d target is unchanged. Gross IC does not settle the horizon; net-of-cost judgment is W3.4 (turnover-adjusted alpha, deferred, needs a backtester cost model). The honest curve reads out on the next PredictorTraining run.

Tests: signal-horizon-beats-noise + contract (one entry/horizon, purge=h, NaN-label handling). Suite 1317 → 1329.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(gitignore): restore the W2 hardening (lost in the #216 merge into main)

The private/ / .claude/ / *.pkl / meta_* / tmp* / *.parquet / design-doc ignore rules added in #216 did not survive into main (last .gitignore change on main is unrelated), so `git add -A` was dangerous again and machine-generated artifacts showed untracked. Restoring so the public repo can't be polluted by a stray `git add -A`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
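The per-horizon mechanics described in the W3.2 commit (forward labels with tail-of-history NaN, a finite mask, and a purge of h days before each test block) can be sketched as follows. This is a simplified single-series illustration, not `leakfree_horizon_ic_curve()` itself: every function name here is hypothetical, the fold layout (equal contiguous blocks) is an assumption, and the embargo is modeled as an extra gap in front of the test block.

```python
import math


def forward_labels(prices, h):
    """h-step forward return; the last h rows have no realized label (NaN)."""
    n = len(prices)
    return [prices[t + h] / prices[t] - 1.0 if t + h < n else math.nan
            for t in range(n)]


def finite_rows(labels):
    """Per-horizon finite mask: indices whose label is usable in a fit."""
    return [i for i, y in enumerate(labels) if math.isfinite(y)]


def purged_expanding_folds(n_rows, n_folds, purge, embargo=0):
    """Expanding walk-forward folds with a gap of purge + embargo rows.

    Train rows end `purge + embargo` rows before the test block starts, so
    no training label (which looks `purge` rows ahead) overlaps the test block.
    """
    block = n_rows // (n_folds + 1)
    folds = []
    for k in range(1, n_folds + 1):
        test_start = k * block
        test_end = n_rows if k == n_folds else (k + 1) * block
        train_end = test_start - purge - embargo
        if train_end <= 0:
            continue  # not enough history left after purging
        folds.append((list(range(train_end)),
                      list(range(test_start, test_end))))
    return folds


def horizon_fold_plan(prices, horizons, n_folds=3):
    """One entry per horizon, each with purge = h."""
    plan = {}
    for h in horizons:
        rows = finite_rows(forward_labels(prices, h))
        plan[h] = {
            "n_rows": len(rows),
            "purge": h,
            "folds": purged_expanding_folds(len(rows), n_folds, purge=h),
        }
    return plan
```

Longer horizons lose more rows twice over: once to the NaN tail and once to the wider purge gap, so they tend to yield fewer usable folds on the same history.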
What
W2 of the predictor-improvement arc (L4469): revives the dead raw-momentum L1 (WF median IC ≈ −0.001 in the live 5/30 manifest) with residual/idiosyncratic momentum + vol scaling (Blitz/Hanauer) plus a price-trend decomposition. Built OBSERVE-mode, gated OUT of the live ensemble — measured but not promoted.
Pairs with alpha-engine-data #356 — the features are computed in the data module (feature store), not the predictor (see "Feature sourcing" below). Merge #356 first.
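For readers unfamiliar with the signal, here is a minimal stdlib-only sketch of vol-scaled residual momentum with a 12-1 skip-month window. It is not the #356 implementation: it assumes plain lists of monthly returns, a single-factor (benchmark) regression, and hypothetical function names and window lengths. The beta used for period t is estimated only from periods before t, which plays the role of the `.shift(1)` mentioned in the commits.

```python
from statistics import mean, pstdev


def rolling_beta(asset, bench, window):
    """Beta for period t, estimated from periods [t - window, t - 1] only."""
    betas = [None] * len(asset)
    for t in range(window, len(asset)):
        a, b = asset[t - window:t], bench[t - window:t]
        ma, mb = mean(a), mean(b)
        var_b = sum((x - mb) ** 2 for x in b)
        if var_b > 0:
            cov = sum((x - mb) * (y - ma) for x, y in zip(b, a))
            betas[t] = cov / var_b
    return betas


def residual_momentum(asset, bench, beta_window=36, lookback=12, skip=1):
    """Sum of residual returns over the formation window, scaled by their vol.

    With lookback=12 and skip=1 the score at t uses residuals t-11 .. t-1,
    so the most recent period is skipped. Returns None where history is short.
    """
    betas = rolling_beta(asset, bench, beta_window)
    resid = [None if beta is None else r - beta * m
             for r, m, beta in zip(asset, bench, betas)]
    scores = [None] * len(asset)
    for t in range(len(asset)):
        lo, hi = t - lookback + 1, t - skip + 1
        if lo < 0:
            continue
        window = resid[lo:hi]
        if not window or any(x is None for x in window):
            continue  # front-of-history: residuals not yet defined
        vol = pstdev(window)
        if vol > 0:
            scores[t] = sum(window) / vol
    return scores
```

Because the score at t never reads the return at t, changing the latest observation leaves the latest score untouched, which is the property a no-look-ahead test would check.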
How it stays "not auto-promoted" (the load-bearing guarantee)
- … `model/residual_momentum_scorer.py`). Trained + scored every run.
- `build_train_meta_features(enabled)` appends `residual_momentum_score` to the L2 feature list only when `cfg.RESIDUAL_MOMENTUM_ENABLED` (default false). The `META_FEATURES` module constant is never mutated.
- With the flag false, `meta_X` / persisted `meta_model._feature_names` / inference path are byte-identical to pre-W2 (proven by `tests/test_meta_trainer_residual_momentum_gate.py`).
- `passes_wf` is unchanged (`vol_median_ic > 0`): residual momentum enters no promotion gate.
- … `manifest.json.residual_momentum_leakfree_oos_ic`.
Feature sourcing (corrected: the data module owns features)
The residual-momentum features are computed in `alpha-engine-data/features/feature_engineer.py` (#356), reusing the existing beta-residualized return series (no beta recompute), and read from ArcticDB by the predictor like every other feature; the predictor computes nothing. The L1 consumes `residual_momentum_ratio` / `mom_12_1_pct` / `sector_mom_pct` via `labeled.reindex(columns=...)` (tolerant of missing columns until rematerialization, like `risk_chunks`).
Tests
Full suite green (gate-invariant, scorer contract/NaN/determinism/parity). Feature-correctness + leak-safety tests live in #356 (where the computation lives).
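The gate invariants that the suite checks can be illustrated with a short sketch. This is not the repo's code: the contents of `META_FEATURES` are not shown in this PR, so the two entries below are placeholders, and the function body is an assumed shape that merely satisfies the stated contract (identical list when false, one extra column when true, constant never mutated).

```python
# Placeholder contents; the real META_FEATURES list is not shown in this PR.
META_FEATURES = ["momentum_score", "volatility_score"]

RESIDUAL_MOMENTUM_FEATURE = "residual_momentum_score"


def build_train_meta_features(enabled):
    """Return the L2 training feature list for the given flag value.

    Always works on a copy, so the module-level constant is never mutated.
    """
    features = list(META_FEATURES)
    if enabled:
        features.append(RESIDUAL_MOMENTUM_FEATURE)
    return features


def check_gate_invariants():
    """The three invariants named in the PR, written as plain assertions."""
    before = list(META_FEATURES)
    off = build_train_meta_features(False)
    on = build_train_meta_features(True)
    assert off == before                                 # identical when false
    assert on == before + [RESIDUAL_MOMENTUM_FEATURE]    # +1 column when true
    assert META_FEATURES == before                       # constant untouched
    assert off is not META_FEATURES                      # callers get a copy
    return True
```

Returning a fresh list each call is the design choice that makes the "never mutated" guarantee cheap to hold: a caller appending to the result cannot leak the extra column into the inference path.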
Commits
… `.gitignore` hardening.
NOT in this PR (deliberate)
Flipping the flag / promotion — that's a later action after 2–3 observe firings (composes with W1.4). Please review; do not merge-and-flip. Merge order: #356 → this.
🤖 Generated with Claude Code