feat(universe_returns): key on trading days, decouple from signal folders #39

Merged
cipher813 merged 1 commit into main from feat/universe-returns-keyed-by-trading-day on Apr 15, 2026

@cipher813
Owner

Summary

Decouples `universe_returns` from signal folder names. The collector now enumerates NYSE trading days directly (holiday-aware, via `trading_calendar.is_trading_day`) and populates one row per ticker per trading day within a configurable rolling window (default 90 trading days).

Builds on `#38` (merged earlier today): `INSERT OR REPLACE` + NULL-aware `_get_existing_dates` remain, so reprocessed dates actually overwrite stale NULL-return rows.
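
The overwrite behaviour described above can be sketched with an in-memory SQLite table. This is a minimal illustration, not the repository's code: the `return_5d` column name, the primary key, and the `get_existing_dates` / `upsert` helpers are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE universe_returns (
        ticker    TEXT NOT NULL,
        eval_date TEXT NOT NULL,
        return_5d REAL,              -- hypothetical column name
        PRIMARY KEY (ticker, eval_date)
    )
    """
)


def get_existing_dates(conn):
    """NULL-aware: a date counts as populated only if it has a non-NULL return."""
    rows = conn.execute(
        "SELECT DISTINCT eval_date FROM universe_returns "
        "WHERE return_5d IS NOT NULL"
    )
    return {row[0] for row in rows}


def upsert(conn, ticker, eval_date, return_5d):
    """INSERT OR REPLACE swaps out any row with the same (ticker, eval_date)."""
    conn.execute(
        "INSERT OR REPLACE INTO universe_returns (ticker, eval_date, return_5d) "
        "VALUES (?, ?, ?)",
        (ticker, eval_date, return_5d),
    )


# First pass: the forward price was unavailable, so a NULL return is stored.
upsert(conn, "AAPL", "2026-04-07", None)
stale_seen = get_existing_dates(conn)  # the NULL row does not count as populated

# Second pass: the date is reprocessed and the stale row is overwritten.
upsert(conn, "AAPL", "2026-04-07", 0.0123)  # made-up return value
fresh_seen = get_existing_dates(conn)
row_count = conn.execute("SELECT COUNT(*) FROM universe_returns").fetchone()[0]
```

Because the NULL row is not reported as existing, the date is enqueued again, and the primary key makes the second write replace the first rather than add a duplicate.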

Why

The backtester's `_scanner_lift` / `_team_lift` / `_cio_lift` join on `(ticker, eval_date)`. Previously `universe_returns.eval_date` came from signal folder names, which were sporadic, weekly, and subject to whatever stamping convention research was using (Saturday today, next_trading_day Monday, etc.). Every change to research's cadence or stamping rippled through the join semantics.

After this PR, `universe_returns` has the grain it should have had from the start: "5d forward returns, one row per ticker per trading day." Evaluator joins succeed regardless of how often research runs or how it chose to stamp a given run.
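
To make the join problem concrete, here is a small sketch using in-memory SQLite. The table layouts, the `passed` and `return_5d` columns, the two `returns_by_*` table names, and all values are invented for illustration; only the `(ticker, eval_date)` join key comes from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scanner_evaluations (ticker TEXT, eval_date TEXT, passed INTEGER);
    CREATE TABLE returns_by_folder   (ticker TEXT, eval_date TEXT, return_5d REAL);
    CREATE TABLE returns_by_day      (ticker TEXT, eval_date TEXT, return_5d REAL);
    """
)

# Research stamps its run with the next trading day (a Monday).
conn.execute("INSERT INTO scanner_evaluations VALUES ('AAPL', '2026-04-06', 1)")

# Old grain: one eval_date per signal folder, here stamped with the Saturday run date.
conn.execute("INSERT INTO returns_by_folder VALUES ('AAPL', '2026-04-04', 0.01)")

# New grain: one row per ticker per trading day, independent of research stamping.
conn.executemany(
    "INSERT INTO returns_by_day VALUES ('AAPL', ?, 0.01)",
    [("2026-04-01",), ("2026-04-02",), ("2026-04-06",), ("2026-04-07",)],
)


def matched_rows(returns_table):
    """Count evaluation rows that find a return on (ticker, eval_date)."""
    # returns_table is one of the two fixed names above, never user input.
    query = f"""
        SELECT COUNT(*)
        FROM scanner_evaluations AS s
        JOIN {returns_table} AS r
          ON r.ticker = s.ticker AND r.eval_date = s.eval_date
    """
    return conn.execute(query).fetchone()[0]


folder_matches = matched_rows("returns_by_folder")
day_matches = matched_rows("returns_by_day")
```

With folder-derived dates the Monday-stamped evaluation finds no partner; with the trading-day grain any trading-day `eval_date` matches.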

Changes

  • `_trading_days_to_process(today, max_lookback, existing)`: new enumerator. Walks backward from today across `max_lookback` trading days and emits dates whose 5d forward window has closed and that aren't already populated. Uses the repo-root `trading_calendar.is_trading_day` (handles NYSE holidays through 2030).
  • `collect()` now uses the enumerator. `signals_prefix` arg is kept in the signature for call-site compat (`weekly_collector.py` still passes it) but is ignored.
  • New `max_lookback_trading_days` arg, default 90.
  • Removed `_list_signal_dates` (dead) and the local `_is_trading_day` helper (superseded).
  • Removed the `deferred` concept from the return dict — the enumerator guarantees every enqueued date has a closed window.
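
The enumerator described in the list above could look roughly like the sketch below. It is not the merged implementation. The local `is_trading_day` is a stand-in for `trading_calendar.is_trading_day` that only knows weekends and Good Friday 2026, and `add_trading_days` is a hypothetical helper. Three details are assumptions chosen because they reproduce the smoke-test figures in the Test plan: today counts toward the lookback, the window is measured in trading days, and it is treated as closed only when its end falls strictly before today.

```python
from datetime import date, timedelta

# Stand-in calendar (assumption): weekends plus Good Friday 2026 only.
_HOLIDAYS = {date(2026, 4, 3)}


def is_trading_day(d):
    return d.weekday() < 5 and d not in _HOLIDAYS


def add_trading_days(d, n):
    """Return the date n trading days after d (hypothetical helper)."""
    while n > 0:
        d += timedelta(days=1)
        if is_trading_day(d):
            n -= 1
    return d


def trading_days_to_process(today, max_lookback, existing, horizon=5):
    """Walk back over max_lookback trading days; keep closed, unpopulated dates."""
    out = []
    d = today
    seen = 0
    while seen < max_lookback:
        if is_trading_day(d):
            seen += 1
            window_closed = add_trading_days(d, horizon) < today
            if window_closed and d not in existing:
                out.append(d)
        d -= timedelta(days=1)
    return sorted(out)


days = trading_days_to_process(date(2026, 4, 15), max_lookback=15, existing=set())
partial = trading_days_to_process(
    date(2026, 4, 15), max_lookback=15, existing={date(2026, 4, 7)}
)
```

Under these assumptions `days` spans 2026-03-25 through 2026-04-07 and omits 2026-04-03. Passing an already-populated date in `existing` drops it from the result, so a rerun does not re-enqueue it.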

Test plan

  • Unit suite: 46/46
  • Enumerator smoke test with `today=2026-04-15, max_lookback=15` → returns 9 trading days (3/25 through 4/7, correctly skipping Good Friday 2026-04-03 and recent dates whose 5d window hasn't closed)
  • Next Saturday Step Function populates `universe_returns` for the last 90 trading days
  • Backtester `_scanner_lift` returns non-null `n_passing/n_universe` on the next grading run

Context

Discussion in 2026-04-15 session on the `eval_date` semantics mismatch that left `grading.json`'s `research.scanner` component stuck at `n_passing=0`. Three layered bugs; this is the third and most architecturally correct fix.

🤖 Generated with Claude Code

…ders

Rewrites the collector enumeration from "list signal folder dates in
S3 and process each" to "walk backwards through NYSE trading days and
process any whose 5d forward window has closed." The table's grain is
now "one row per ticker per trading day" — the natural grain for
downstream evaluation.

Why (discussion on 2026-04-15):

The backtester's _scanner_lift / _team_lift / _cio_lift join on
(ticker, eval_date) between scanner_evaluations etc. and
universe_returns. Before this change, universe_returns eval_dates
came from signal folder names — sporadic and timestamped with
whatever research happened to stamp (run date, next trading day,
etc.). The join's behaviour then depended on whether research
happened to run that week and on which date-stamping convention
was in effect.

After this change, universe_returns stores a row per ticker for every
NYSE trading day in a rolling 90-trading-day window, regardless of
research cadence. The evaluator can match any trading-day eval_date on the
scanner/team/cio side; schedule drift or research misfires no
longer blow holes in the grade surface.

Implementation:
- Added _trading_days_to_process(today, max_lookback, existing) that
  walks backwards through trading days (via the repo-root
  trading_calendar.is_trading_day for holiday awareness) and yields
  dates whose +5-business-day window has closed and which are not
  already populated.
- collect() now uses the enumerator directly. The signals_prefix arg
  is kept in the signature for call-site compatibility (weekly_collector.py
  passes it) but is unused.
- Added max_lookback_trading_days arg (default 90) — enough for all
  rolling evaluator windows with plenty of slack.
- Removed _list_signal_dates (dead) and the local _is_trading_day
  helper (superseded by the holiday-aware shared module function).
- Removed the "deferred" concept from the return dict; the enumerator
  guarantees every enqueued date has a closed window.

Builds on PR #38's still-valid changes (INSERT OR REPLACE +
NULL-aware _get_existing_dates) which let reprocessed trading days
actually overwrite stale NULL-return rows.

Test suite: 46/46 unit tests pass. Smoke-tested _trading_days_to_process
with today=2026-04-15, max_lookback=15 → returns 9 trading days
(3/25 through 4/7, correctly skipping Good Friday 2026-04-03 and
recent dates whose 5d window hasn't closed).
@cipher813 cipher813 merged commit 38df4e9 into main Apr 15, 2026
1 check passed
@cipher813 cipher813 deleted the feat/universe-returns-keyed-by-trading-day branch April 15, 2026 23:34