fix(backfill): filter universe writes by current constituents + escalate preflight #138
Merged
…ate preflight

Closes the prune+backfill loop that recreated 7 S&P churn-out stragglers on every Saturday SF run.

2026-05-02 redrive #6 surfaced the loop: pre-MorningEnrich prune (PR #134, absent_days=5) drops stragglers ✓; Phase 1 step 8 (builders.backfill) loads ALL predictor/price_cache/*.parquet files and writes EVERY ticker back to ArcticDB universe, including the ones we just pruned, because their parquet files still exist (kept for historical lookup). Loop closes; Backtester preflight (~2 hours later) trips on the 8-day-stale rows.

## Fix 1: backfill respects current constituents

In ``builders.backfill``, load current constituents via the ``market_data/latest_weekly.json`` pointer and filter ``universe_tickers`` against it. Tickers absent from constituents (churn-outs) keep their price_cache parquet (history preserved) but get NO arctic row written. If a ticker comes back to S&P later, it appears in constituents and backfill picks it up automatically.

Hard-fails on constituents-load failure (vs silently writing everything) per feedback_no_silent_fails. The filter is skipped in dry_run so local smoke tests don't need S3 access.

## Fix 2: sf_preflight escalates straggler detection

``check_universe_drift`` now returns FAIL (not OK) when any straggler is "old enough to prune" (>5 days stale). This forces operators to drop stragglers BEFORE launching recovery SFs that skip MorningEnrich (which would otherwise burn a 120-min Backtester spot to re-discover them). The result includes a remediation hint pointing at the prune CLI.

Validation against current state (post manual prune of 7):

  [OK] universe_drift 1 arctic stragglers; 0 would be pruned

3 new tests in test_backfill_no_regression.py:
- backfill_skips_tickers_absent_from_constituents (the loop closure)
- backfill_hard_fails_when_constituents_load_fails (no silent recreate-everything fallback)
- backfill_dry_run_does_not_filter_by_constituents (CI / smoke doesn't need S3)

Existing test scaffolding updated to mock _load_current_constituents across both backfill test files. 406 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
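For reviewers who want the shape of Fix 1 without opening the diff, here is a minimal sketch of the constituents filter. It is not the code in ``builders.backfill``: the function name ``select_tickers_to_write``, the ``ConstituentsLoadError`` class, the injected loader, and the empty-list check are all assumptions made for illustration. Only the three behaviours (filter, hard-fail, dry_run bypass) come from the description above.

```python
from __future__ import annotations

from typing import Callable, Iterable


class ConstituentsLoadError(RuntimeError):
    """Hypothetical error type: current constituents could not be loaded."""


def select_tickers_to_write(
    universe_tickers: Iterable[str],
    load_constituents: Callable[[], Iterable[str]],
    dry_run: bool = False,
) -> list[str]:
    """Return the tickers that should get an arctic row written.

    Parquet history is untouched here; this only decides which tickers
    are written back to the universe.
    """
    tickers = sorted(set(universe_tickers))
    if dry_run:
        # dry_run never calls the loader, so smoke tests need no S3 access.
        return tickers
    try:
        constituents = set(load_constituents())
    except Exception as exc:
        # Hard-fail instead of silently falling back to "write everything".
        raise ConstituentsLoadError("could not load current constituents") from exc
    if not constituents:
        # Assumption: an empty list is treated as a load failure as well.
        raise ConstituentsLoadError("current constituents list is empty")
    return [t for t in tickers if t in constituents]


# "ZZZ" stands in for a churn-out whose parquet still exists.
print(select_tickers_to_write(["AAA", "BBB", "ZZZ"], lambda: ["AAA", "BBB"]))
```

Passing the loader in as a callable is only a convenience for the sketch; the PR's tests mock ``_load_current_constituents`` instead.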
cipher813 added a commit that referenced this pull request on May 6, 2026
Companion to alpha-engine-backtester PR #138 (concordance Lambda implementation). Inserts the Saturday SF state that fires the weekly cheap-model concordance pipeline.

SF chain after this PR:

  ... → EvalJudge{Weekly,FirstSaturday} → EvalRollingMean → CheckSkipRationaleClustering → RationaleClustering → CheckSkipReplayConcordance → ReplayConcordance → SaturdayHealthCheck → ...

Independent skip-gates per observability path:
- {"skip_eval_judge": true}: bypass judge only
- {"skip_rationale_clustering":...}: bypass clustering only (rerouted from SaturdayHealthCheck to CheckSkipReplayConcordance in this PR; the two are independent agent-justification signals)
- {"skip_replay_concordance": true}: bypass concordance only

Each Catch lands at SaturdayHealthCheck (eval observability: failures must NOT halt the pipeline).

Default ReplayConcordance payload pins production cadence:
- target_models: ["claude-haiku-4-5"] (Sonnet→Haiku concordance)
- window_days: 56 (8 weeks trailing)
- max_artifacts: 150 (fits 900s Lambda timeout at ~3-5 sec/call)

Operators can override these via SF input parameters for ad-hoc runs against different target models (e.g. claude-sonnet-4-6 self-test).

IAM updates:
- github-actions-lambda-deploy.json: alpha-engine-replay-concordance added to LambdaUpdate + LambdaInvokeCanary (asymmetric-IAM-grant antipattern compliance; 4th occurrence of this shape, and the durable fix is the CreateFunction grant from PR #165, which already covers create+update).
- deploy_step_function.sh: SF role inline LambdaInvoke list updated with the new function ARN so SF can invoke it.

Tests:
- TestStatesPresent: CheckSkipReplayConcordance + ReplayConcordance required.
- TestSkipRationaleClustering: rerouted assertion (now lands at CheckSkipReplayConcordance, not SaturdayHealthCheck).
- TestRationaleClustering: success + Catch reroutes (now CheckSkipReplayConcordance).
- TestSkipReplayConcordance: skip_replay_concordance flag → SaturdayHealthCheck.
- TestReplayConcordance: live alias, payload required fields (end_time_iso, target_models, window_days, max_artifacts), 900s timeout matches Lambda cap, success + Catch routes, retry posture.

Suite 451 → 459.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
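The skip-gate plus Catch routing described in that commit can be sketched as two Amazon States Language states, built here as Python dicts. This is an illustration, not the repository's state machine definition: the helper names, the ``:live`` qualifier on the function name, and sourcing ``end_time_iso`` from ``$$.Execution.StartTime`` are assumptions, and the retry policy is omitted.

```python
import json

HEALTH_CHECK = "SaturdayHealthCheck"


def skip_gate(flag: str, task_state: str, skip_to: str) -> dict:
    """Choice state: jump to skip_to only when the flag is present and true."""
    return {
        "Type": "Choice",
        "Choices": [
            {
                "And": [
                    # IsPresent guards against a runtime error when the
                    # flag is absent from the SF input.
                    {"Variable": f"$.{flag}", "IsPresent": True},
                    {"Variable": f"$.{flag}", "BooleanEquals": True},
                ],
                "Next": skip_to,
            }
        ],
        "Default": task_state,
    }


def observability_task(function_name: str, payload: dict, next_state: str) -> dict:
    """Lambda task whose failures are caught and routed to the health check."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "Parameters": {"FunctionName": function_name, "Payload": payload},
        "TimeoutSeconds": 900,
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": HEALTH_CHECK}],
        "Next": next_state,
    }


states = {
    "CheckSkipReplayConcordance": skip_gate(
        "skip_replay_concordance", "ReplayConcordance", HEALTH_CHECK
    ),
    "ReplayConcordance": observability_task(
        "alpha-engine-replay-concordance:live",  # assumed alias qualifier
        {
            "end_time_iso.$": "$$.Execution.StartTime",  # assumed source
            "target_models": ["claude-haiku-4-5"],
            "window_days": 56,
            "max_artifacts": 150,
        },
        HEALTH_CHECK,
    ),
}


def route(gate: dict, sf_input: dict) -> str:
    """Evaluate the single-rule gate above against an SF input (sketch only)."""
    rule = gate["Choices"][0]
    flag = rule["And"][0]["Variable"][2:]  # strip the leading "$."
    return rule["Next"] if sf_input.get(flag) is True else gate["Default"]


print(json.dumps(states["CheckSkipReplayConcordance"], indent=2))
```

On the payload defaults: 150 artifacts at 3 to 5 seconds per call is 450 to 750 seconds, which leaves headroom under the 900 s timeout.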
Summary
Closes the prune+backfill loop that recreated 7 S&P churn-out stragglers on every Saturday SF run. 2026-05-02 redrive #6 surfaced the loop:

1. Pre-MorningEnrich prune (PR #134, absent_days=5) drops stragglers ✓
2. Phase 1 step 8 (builders.backfill) loads ALL predictor/price_cache/*.parquet and writes EVERY ticker to ArcticDB universe, including the ones we just pruned, because their parquet files still exist (kept for historical lookup)
3. Loop closes; Backtester preflight (~2 hours later) trips on the 8-day-stale rows

Two fixes
builders/backfill.py: constituents-filter on writes

Load current constituents via the latest_weekly.json pointer and filter universe_tickers against it. Churn-out parquet preserved (history kept) but no arctic row written. If a ticker returns to S&P, backfill picks it back up automatically. Hard-fails on constituents-load failure (no silent recreate-everything fallback).

sf_preflight.check_universe_drift: WARN → FAIL

Now returns FAIL when any straggler is >5d stale. Forces operators to drop stragglers BEFORE launching recovery SFs that skip MorningEnrich (otherwise we burn a 120-min Backtester spot to re-discover them). Result includes remediation hint.

Validation against current state (post manual prune)

  [OK] universe_drift 1 arctic stragglers; 0 would be pruned
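To make the preflight escalation concrete, here is a rough sketch of how a drift check could classify stragglers. It is not the ``sf_preflight`` implementation: ``classify_universe_drift``, ``DriftResult``, the input shape (ticker mapped to the date of its last arctic row), and the hint wording are hypothetical. Only the >5-day threshold, the FAIL status, and the message format are taken from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DriftResult:
    status: str  # "OK" or "FAIL"
    message: str
    hint: str = ""


def classify_universe_drift(
    last_row_date: dict[str, date],
    constituents: set[str],
    today: date,
    absent_days: int = 5,
) -> DriftResult:
    """FAIL when any non-constituent ticker is stale enough to be pruned."""
    # Stragglers: tickers with arctic rows that are no longer constituents.
    staleness = {
        ticker: (today - last_seen).days
        for ticker, last_seen in last_row_date.items()
        if ticker not in constituents
    }
    prunable = sorted(t for t, days in staleness.items() if days > absent_days)
    message = f"{len(staleness)} arctic stragglers; {len(prunable)} would be pruned"
    if prunable:
        # Hypothetical hint text; the PR's hint points at the prune CLI.
        hint = "prune before launching: " + ", ".join(prunable)
        return DriftResult("FAIL", message, hint)
    return DriftResult("OK", message)


today = date(2026, 5, 2)
# One recent straggler ("NEW", 2 days stale) is tolerated.
fresh = classify_universe_drift(
    {"AAA": today, "NEW": today - timedelta(days=2)}, {"AAA"}, today
)
print(fresh.status, fresh.message)  # OK 1 arctic stragglers; 0 would be pruned
```

A straggler that is not yet past the threshold stays at OK, which matches the validation output for the single remaining straggler.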
Test plan
- 3 new tests in test_backfill_no_regression.py: loop closure, hard-fail on constituents-load failure, dry-run doesn't require S3
- check_universe_drift tests updated to expect FAIL on stragglers + remediation hint

Why not also add prune to Backtester?
Considered but rejected — the issue isn't Backtester missing a prune step, it's that the EXISTING DataPhase1 prune was getting undone by DataPhase1's own backfill. Fixing the loop in one place is cleaner than duplicating prune across modules.
🤖 Generated with Claude Code