fix(constituents): include tickers in collect() return dict #135
Merged
PR #134's pre-MorningEnrich preflight calls ``constituents.collect()`` and reads ``cons_result.get("tickers", [])`` to feed prune_delisted_tickers' ``constituents_override`` and the daily_closes request list. But collect() was returning only ``{"status": "ok", "count": N}``, with no tickers, so the preflight got [] and MorningEnrich aborted with "No tickers available".

2026-05-02 SF redrive #5 was the live failure: prune correctly dropped all 8 stragglers (architectural fix worked!), but then no tickers got fed to daily_closes. The whole MorningEnrich step exited 1.

Add ``tickers`` to both happy-path returns (ok + ok_dry_run). Additive, no breakage:

- ``_run_phase1`` (the only other caller) previously round-tripped to S3 to re-read what it just wrote; it now uses ``const_result["tickers"]`` directly.
- The dry-run fork in _run_phase1 (which separately called the private ``_fetch_constituents``) is also collapsed.

Contract test in tests/test_constituents_sector_map.py locks both return shapes, which guards against this exact regression class sneaking back in.

376 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
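The failure and the fix can be sketched roughly as follows. This is a minimal illustration, not the repository's code: ``collect_before``, ``collect_after``, ``preflight_tickers``, ``assert_contract``, and the hard-coded ticker list are hypothetical stand-ins. Only the dict keys (``status``, ``count``, ``tickers``), the ``ok`` / ``ok_dry_run`` status values, and the ``.get("tickers", [])`` read are taken from the description above.

```python
from typing import Any

# Hypothetical stand-in for whatever the real collector fetches.
_FAKE_TICKERS = ["AAPL", "MSFT", "NVDA"]


def collect_before() -> dict[str, Any]:
    """Old return shape described in the PR: status and count, no tickers."""
    return {"status": "ok", "count": len(_FAKE_TICKERS)}


def collect_after(dry_run: bool = False) -> dict[str, Any]:
    """New return shape: both happy paths also carry the ticker list."""
    status = "ok_dry_run" if dry_run else "ok"
    return {
        "status": status,
        "count": len(_FAKE_TICKERS),
        "tickers": list(_FAKE_TICKERS),  # additive key; existing keys unchanged
    }


def preflight_tickers(cons_result: dict[str, Any]) -> list[str]:
    """Mimics the preflight read: a missing key silently becomes []."""
    tickers = cons_result.get("tickers", [])
    if not tickers:
        raise RuntimeError("No tickers available")
    return tickers


def assert_contract(result: dict[str, Any]) -> None:
    """Assumed shape check, in the spirit of the contract test."""
    assert result["status"] in {"ok", "ok_dry_run"}
    assert isinstance(result["tickers"], list) and result["tickers"]
```

With the old shape, ``preflight_tickers(collect_before())`` raises because ``.get`` falls back to ``[]``. With the new shape, both ``collect_after()`` and ``collect_after(dry_run=True)`` pass ``assert_contract`` and feed a non-empty list downstream.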
cipher813 added a commit that referenced this pull request on May 2, 2026
* feat(preflight): add sf_preflight.py - Saturday SF dry-rehearsal

Predicts whether the Saturday SF would succeed BEFORE launching a spot. Today's recovery cycle (5 SF redrives, ~5 polygon API calls each) burned free-tier quota and operator hours discovering bugs sequentially. This module simulates the critical pre-Phase-1 path against real S3 + ArcticDB state and reports per-step pass/fail in ~30s with 1 polygon call total.

Eight independent checks, mapped to today's incident stack:

PR #130 (backfill regression)           → check_backfill_source_freshness
PR #131 (polygon coverage flake)        → check_polygon_grouped_coverage
PR #132 (missing-from-closes scoping)   → check_predicted_missing_from_closes
PR #133 (freshness scan scoping)        → check_universe_sample_freshness
PR #134 (workflow ordering)             → check_universe_drift
PR #135 (return shape)                  → check_constituents_fetch
Postflight contracts                    → check_postflight_contracts
ArcticDB reachability                   → check_arctic_connectivity

Each check is a pure function taking a PreflightContext, returning a CheckResult. The orchestrator runs them all (catching per-check exceptions so one fail doesn't abort the suite) and emits human or JSON output. Exit code 1 on any failure.

Two macOS-specific design notes:

1. ArcticDB libs are initialized once in check_arctic_connectivity and reused across downstream checks via the context; re-initializing adb.Arctic() crashes Aws::S3::S3Client::S3Client on macOS.
2. Checks are ordered with arctic_connectivity FIRST so its bundled AWS SDK loads before boto3 (which gets pulled in by collectors imports).

Polygon check skips gracefully (WARN, not FAIL) when POLYGON_API_KEY is unset, which supports laptop-side preflight where the .env isn't loaded. On the spot the key is present and the check fires.

18 tests in tests/test_sf_preflight.py: happy path + each failure mode each check is designed to catch + orchestrator isolation. 394 tests total.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(sf_preflight): set POLYGON_API_KEY in polygon-coverage tests

CI runs without POLYGON_API_KEY in env, so the no-key skip-to-WARN guard short-circuited the 3 polygon-coverage tests before they reached the mocked client. Set the env var via monkeypatch so the guard passes through to the polygon mock. Also add explicit test for the no-key path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
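The check/orchestrator pattern the commit describes might look something like the sketch below. Only the names ``PreflightContext``, ``CheckResult``, ``check_arctic_connectivity``, ``check_polygon_grouped_coverage``, and ``POLYGON_API_KEY`` come from the commit message. The field names, the ``run_checks`` helper, and the check bodies are assumptions made for illustration, and no real ArcticDB or polygon call is made.

```python
import os
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PreflightContext:
    """Shared state handed to every check (fields are hypothetical)."""
    arctic_libs: dict = field(default_factory=dict)


@dataclass
class CheckResult:
    name: str
    status: str  # "PASS", "WARN", or "FAIL"
    detail: str = ""


Check = Callable[[PreflightContext], CheckResult]


def check_arctic_connectivity(ctx: PreflightContext) -> CheckResult:
    # Placeholder for the one-time library init that later checks reuse.
    ctx.arctic_libs["universe"] = object()
    return CheckResult("check_arctic_connectivity", "PASS")


def check_polygon_grouped_coverage(ctx: PreflightContext) -> CheckResult:
    # Skip gracefully (WARN, not FAIL) when the key is not in the environment.
    if not os.environ.get("POLYGON_API_KEY"):
        return CheckResult("check_polygon_grouped_coverage", "WARN",
                           "POLYGON_API_KEY unset; skipped")
    return CheckResult("check_polygon_grouped_coverage", "PASS")


def run_checks(checks: list[Check],
               ctx: PreflightContext) -> tuple[list[CheckResult], int]:
    """Run every check in order; a crashing check becomes a FAIL result."""
    results: list[CheckResult] = []
    for check in checks:
        try:
            results.append(check(ctx))
        except Exception as exc:  # isolate so one failure cannot abort the suite
            results.append(CheckResult(check.__name__, "FAIL",
                                       f"{type(exc).__name__}: {exc}"))
    exit_code = 1 if any(r.status == "FAIL" for r in results) else 0
    return results, exit_code
```

Listing ``check_arctic_connectivity`` first in the ``checks`` argument mirrors the ordering note above. In this sketch a WARN does not affect the exit code; only a FAIL does.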
Merged
cipher813 added a commit that referenced this pull request on May 28, 2026
…across all artifacts (#339)

Closes the gap surfaced 2026-05-28: current-state probe answers 'is the artifact present now?' but operators also need 'did it land last weekend? are there gaps in the producer's history?' Filed per the same feedback memory observe_mode_unconditional_gates: absence-of-artifact is the failure mode, and a single-cycle absence could be a false-positive where a multi-cycle gap is a real producer regression.

Adds:

- event['mode']='historical' dispatch in handler(). Routes to a new _handle_historical(s3, now, started_at, lookback_overrides) path that walks the registry, probes the last N cycles per artifact, and writes _freshness_monitor/history.json (page 26 will surface per-row history expanders + gap counts).
- New EB cron alpha-engine-freshness-monitor-historical-cron (daily 04:00 UTC, off-peak) wired in deploy.sh --bootstrap.
- Default lookback: 12 saturday_sf + 30 weekday_sf/eod_sf cycles (~3 months each). continuous skipped (current-state covers). Tunable via event['lookback'] override.

403/404/NoSuchKey normalization: S3 returns 403 (not 404) for missing keys when the Lambda lacks s3:ListBucket. Treat both as cleanly-absent (no error_code in output) so page 26 doesn't show spurious '403 errors' on legitimately-absent historical cycles.

9 new unit tests cover: saturday/weekday/eod cycle-date resolution, continuous skip, zero-count short-circuit, date/trading_day/no-placeholder template rendering, and handler mode-dispatch.

Live smoke (post-deploy + manual invoke): n_artifacts=51, n_cycles_probed=474, duration=10.08s

Surfaced 1 real finding for follow-up: several artifacts use calendar-vs-trading-day-anchored templates that don't match producer behavior. research_signals registered as signals/{date}/signals.json with cadence=saturday_sf, but producer writes to mostly Friday trading-day keys (2026-05-22, 2026-05-15, etc.). The historical probe correctly reports the Saturday keys as absent, which IS the right answer given the registry template. ROADMAP follow-up filed separately to audit all registry templates for calendar-vs-trading-day mismatch.

Calendar-naive by design: NYSE holidays surface as false-positive absent days but operators can interpret in context. Calendar-aware backfill is a P3 follow-up if the noise becomes worth the dependency lift.

Composes with the OBSERVATION_REGISTRY arc (#349/#351/#352/#355 + #135/#136/#137).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
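The cycle-date resolution and the 403/404 normalization described in this commit could be sketched along these lines. The function names (``saturday_cycles``, ``weekday_cycles``, ``render_key``, ``classify_probe``) and the result-dict shape are hypothetical, and the real handler probes S3 where this sketch only takes an error-code string. The ``signals/{date}/signals.json`` template and the 403/404/NoSuchKey codes are from the message above.

```python
from datetime import date, timedelta
from typing import Optional

# Codes treated as "cleanly absent" rather than as probe errors.
_ABSENT_CODES = {"403", "404", "NoSuchKey"}


def saturday_cycles(today: date, n: int) -> list[date]:
    """The n most recent Saturdays on or before today, newest first."""
    offset = (today.weekday() - 5) % 7  # weekday(): Monday=0 ... Saturday=5
    last_saturday = today - timedelta(days=offset)
    return [last_saturday - timedelta(weeks=i) for i in range(n)]


def weekday_cycles(today: date, n: int) -> list[date]:
    """The n most recent Mon-Fri dates, newest first (calendar-naive:
    exchange holidays are NOT excluded)."""
    out: list[date] = []
    day = today
    while len(out) < n:  # n == 0 short-circuits to an empty list
        if day.weekday() < 5:
            out.append(day)
        day -= timedelta(days=1)
    return out


def render_key(template: str, cycle: date) -> str:
    """Fill a {date} placeholder; templates without one pass through."""
    return template.format(date=cycle.isoformat())


def classify_probe(error_code: Optional[str]) -> dict:
    """Map a probe outcome to a result row (shape is an assumption)."""
    if error_code is None:
        return {"present": True}
    if error_code in _ABSENT_CODES:
        return {"present": False}  # no error_code: absence is not an error
    return {"present": False, "error_code": error_code}
```

For a probe run on 2026-05-28, ``saturday_cycles`` starts at 2026-05-23, so the rendered key is ``signals/2026-05-23/signals.json``. A producer writing under the Friday date 2026-05-22 would show up as absent, which is the template mismatch the commit reports.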
Summary

- PR #134's pre-MorningEnrich preflight reads ``cons_result.get("tickers", [])`` to feed ``prune_delisted_tickers`` and the daily_closes request list. But ``collect()`` was returning only ``{"status": "ok", "count": N}``, with no tickers. Preflight got ``[]`` → "No tickers available for morning enrichment" → SF halt.
- Add ``tickers`` to both happy-path returns (ok + ok_dry_run). Additive, so no caller breaks.
- ``_run_phase1``'s previous workaround was an S3 round-trip (re-read what was just written). Cleaned up to use ``const_result["tickers"]`` directly.

Test plan

- ``test_collect_returns_tickers_in_dict`` locks both return shapes, which guards against this exact regression class sneaking back in

🤖 Generated with Claude Code