
feat(freshness-monitor): historical-mode probe — daily gap detection across all artifacts #339

Merged
cipher813 merged 1 commit into main from feat/freshness-monitor-historical-mode
May 28, 2026

@cipher813
Owner

Summary

Closes the gap surfaced 2026-05-28: current-state probe answers "is it present now?" but operators need "are there gaps in the producer's history?" The historical probe walks each artifact's last N cycles, HEAD-checks them, and writes _freshness_monitor/history.json for page 26 to surface per-row history + gap counts.
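A minimal sketch of the walk described above, under stated assumptions: `build_history` and its `head_check` callable are hypothetical names standing in for the Lambda's S3 HEAD logic, and only a subset of the output fields listed under Architecture is shown.

```python
from datetime import date
from typing import Callable, Optional


def build_history(
    key_template: str,
    cycle_dates: list[date],
    head_check: Callable[[str], Optional[dict]],
) -> dict:
    """Probe one artifact's cycles and summarize its gaps.

    `head_check` is a hypothetical stand-in for an S3 HEAD request: it
    returns object metadata (e.g. {"size": 123}) or None when absent.
    """
    history = []
    for d in cycle_dates:
        key = key_template.format(date=d.isoformat())
        meta = head_check(key)
        entry = {"date": d.isoformat(), "present": meta is not None}
        if meta is not None:
            entry["size"] = meta["size"]
        history.append(entry)
    # A gap is any probed cycle whose key was not found.
    gap_count = sum(1 for e in history if not e["present"])
    return {
        "s3_key_template": key_template,
        "lookback_cycles": len(cycle_dates),
        "gap_count": gap_count,
        "history": history,
    }
```

Passing a plain dict's `.get` as `head_check` is enough to exercise the gap counting without touching S3.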

Architecture

  • New event['mode']='historical' dispatch in handler(). Same Lambda, different event branch — separate code path; current-state probe stays fast (15-min cadence).
  • New EB cron alpha-engine-freshness-monitor-historical-cron (daily 04:00 UTC, off-peak) wired in deploy.sh --bootstrap with target Input {"mode":"historical"}.
  • Default lookback: 12 saturday_sf cycles (~3 months) + 30 weekday_sf/eod_sf cycles (~6 weeks). continuous skipped (current-state already covers it). Tunable via event['lookback'] override.
  • Output shape per artifact: {cadence, severity, owner_repo, s3_key_template, is_latest_pointer, lookback_cycles, gap_count, continuous, history:[{date, present, size?, last_modified?, error_code?}, ...]}.
  • 403/404/NoSuchKey normalization: S3 returns 403 (not 404) for missing keys when Lambda lacks s3:ListBucket. Both treated as cleanly-absent (no error_code in output) so page 26 doesn't show spurious "403 errors".
  • Calendar-naive by design — NYSE holidays may surface as false-positive absent days; operator interprets in context. Calendar-aware probe is a P3 if noise becomes worth the dependency.
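The dispatch and lookback defaults above could look roughly like the following. This is a sketch, not the Lambda's actual code: `resolve_lookback`, the placeholder return values, and the assumption that `event['lookback']` is a dict keyed by cadence are all illustrative.

```python
# Defaults taken from the PR text; continuous artifacts are skipped entirely.
DEFAULT_LOOKBACK = {"saturday_sf": 12, "weekday_sf": 30, "eod_sf": 30}


def resolve_lookback(event: dict) -> dict:
    """Merge event['lookback'] overrides over the defaults.

    Assumes the override is a dict keyed by cadence name.
    """
    overrides = event.get("lookback") or {}
    return {**DEFAULT_LOOKBACK, **overrides}


def handler(event: dict, context=None) -> dict:
    """Route on event['mode']; anything else keeps the current-state path."""
    if (event or {}).get("mode") == "historical":
        # Placeholder for the historical walk over the registry.
        return {"mode": "historical", "lookback": resolve_lookback(event)}
    # Placeholder for the existing 15-minute current-state probe.
    return {"mode": "current"}
```

Keeping the branch at the top of `handler()` means the 15-minute invocations never pay for the historical walk.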

Test plan

  • 9 new unit tests (saturday/weekday/eod cycle-date resolution, continuous skip, zero-count short-circuit, date/trading_day/no-placeholder template rendering, handler mode-dispatch)
  • Full suite: 21 passed (12 prior + 9 new)
  • Live smoke (manual invoke after deploy): n_artifacts=51, n_cycles_probed=474, duration=10.08s
  • 403→absent normalization verified live (no error_code in output after redeploy)

Real finding surfaced by this probe

research_signals is registered with s3_key_template: signals/{date}/signals.json + cadence: saturday_sf, but the producer writes mostly to Friday trading-day keys (2026-05-22, 2026-05-15, 2026-05-08...). The historical probe correctly reports the Saturday keys as absent, which IS the right answer given the registry template. ROADMAP follow-up: audit all registry templates for calendar-vs-trading-day mismatch. Multiple artifacts probably have the same issue.
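To make the mismatch concrete, here is a small illustration. `render_key` is a hypothetical helper, the `written` set contains only the Friday keys named above, and `latest.json` is an invented placeholder-free template.

```python
from datetime import date


def render_key(template: str, cal_date: date, trading_day: date) -> str:
    """Fill whichever placeholder the template uses; others are ignored."""
    return template.format(
        date=cal_date.isoformat(),
        trading_day=trading_day.isoformat(),
    )


# Keys the producer actually wrote (Friday trading days, from the PR text).
written = {
    "signals/2026-05-22/signals.json",
    "signals/2026-05-15/signals.json",
}

saturday, friday = date(2026, 5, 23), date(2026, 5, 22)
probed = render_key("signals/{date}/signals.json", saturday, friday)
aligned = render_key("signals/{trading_day}/signals.json", saturday, friday)

print(probed in written)   # Saturday key: reported absent
print(aligned in written)  # Friday key: found
```

The probe is correct in both cases; only the template decides which question it asks.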

This is exactly the bug class the probe was designed to catch. Composes with [[feedback_observe_mode_unconditional_gates_govern_cutover]].

Follow-on PR

alpha-engine-dashboard PR for page 26 per-row history expander + gap count summary (reads history.json written by this Lambda).

🤖 Generated with Claude Code

…across all artifacts

Closes the gap surfaced 2026-05-28: current-state probe answers
'is the artifact present now?' but operators also need 'did it
land last weekend? are there gaps in the producer's history?'
Filed per the same feedback memory observe_mode_unconditional_gates:
absence-of-artifact is the failure mode, and a single-cycle
absence could be a false positive, whereas a multi-cycle gap is a
real producer regression.

Adds:

- event['mode']='historical' dispatch in handler(). Routes to a
  new _handle_historical(s3, now, started_at, lookback_overrides)
  path that walks the registry, probes the last N cycles per
  artifact, and writes _freshness_monitor/history.json (page 26
  will surface per-row history expanders + gap counts).
- New EB cron alpha-engine-freshness-monitor-historical-cron
  (daily 04:00 UTC, off-peak) wired in deploy.sh --bootstrap.
- Default lookback: 12 saturday_sf cycles (~3 months) + 30
  weekday_sf/eod_sf cycles (~6 weeks). continuous skipped
  (current-state covers).
  Tunable via event['lookback'] override.
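
A calendar-naive sketch of the cycle-date resolution, for illustration only. The function names are hypothetical, and whether the real probe counts the invocation day itself is an assumption made here.

```python
from datetime import date, timedelta


def last_n_saturdays(today: date, n: int) -> list[date]:
    """Most recent n Saturdays on or before `today`, newest first."""
    # weekday(): Monday=0 ... Saturday=5, Sunday=6
    latest = today - timedelta(days=(today.weekday() - 5) % 7)
    return [latest - timedelta(weeks=i) for i in range(n)]


def last_n_weekdays(today: date, n: int) -> list[date]:
    """Most recent n Mon-Fri dates on or before `today`, newest first."""
    out, d = [], today
    while len(out) < n:
        if d.weekday() < 5:  # skip Saturday/Sunday; holidays are NOT skipped
            out.append(d)
        d -= timedelta(days=1)
    return out
```

Both helpers return an empty list for n=0, which gives the zero-count short-circuit for free.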

403/404/NoSuchKey normalization: S3 returns 403 (not 404) for
missing keys when the Lambda lacks s3:ListBucket. Treat both as
cleanly-absent (no error_code in output) so page 26 doesn't show
spurious '403 errors' on legitimately-absent historical cycles.
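
One way the normalization could be expressed, as a hedged sketch: `probe_entry` and `ABSENT_CODES` are hypothetical names, and the broad `except` is only there to keep the example free of a boto3 dependency.

```python
ABSENT_CODES = {"403", "404", "NoSuchKey"}


def probe_entry(head_object, key: str) -> dict:
    """HEAD-check one key; fold 403/404/NoSuchKey into a clean absence.

    `head_object` is any callable taking a key. In a boto3-based Lambda it
    would wrap s3.head_object, and the except clause would be narrowed to
    botocore.exceptions.ClientError, whose response["Error"]["Code"]
    carries the code read below.
    """
    try:
        meta = head_object(key)
    except Exception as exc:  # assumption: narrowed to ClientError in real code
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code in ABSENT_CODES:
            return {"present": False}  # cleanly absent: no error_code
        return {"present": False, "error_code": code or type(exc).__name__}
    return {"present": True, "size": meta.get("ContentLength")}
```

Any other code (throttling, 5xx) still surfaces as error_code, so genuine probe failures remain visible.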

9 new unit tests cover: saturday/weekday/eod cycle-date
resolution, continuous skip, zero-count short-circuit,
date/trading_day/no-placeholder template rendering, and handler
mode-dispatch.

Live smoke (post-deploy + manual invoke):
  n_artifacts=51, n_cycles_probed=474, duration=10.08s

Surfaced 1 real finding for follow-up: several artifacts use
calendar-vs-trading-day-anchored templates that don't match
producer behavior. research_signals registered as
signals/{date}/signals.json with cadence=saturday_sf, but
producer writes mostly to Friday trading-day keys (2026-05-22,
2026-05-15, etc.). The historical probe correctly reports the
Saturday keys as absent — which IS the right answer given the
registry template. ROADMAP follow-up filed separately to audit
all registry templates for calendar-vs-trading-day mismatch.

Calendar-naive by design — NYSE holidays surface as
false-positive absent days but operators can interpret in
context. Calendar-aware backfill is a P3 follow-up if the
noise becomes worth the dependency lift.

Composes with the OBSERVATION_REGISTRY arc (#349/#351/#352/#355
+ #135/#136/#137).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 merged commit 46882e9 into main on May 28, 2026
1 check passed
cipher813 deleted the feat/freshness-monitor-historical-mode branch on May 28, 2026 15:46
cipher813 added a commit that referenced this pull request May 28, 2026
…or historical probe (#341)

Closes the calendar-vs-trading-day mismatch surfaced by PR #339's
historical probe — research_signals was registered as
signals/{date}/signals.json + cadence=saturday_sf but the producer
writes to Friday trading-day keys (signals/2026-05-22/, /05-15/,
/05-08/...). The pre-fix probe correctly reported the Saturday keys
absent — but the operator-facing display showed 9/12 gaps when the
true gap rate was lower; the registry just wasn't asking the right
question.

Restructure:

- _iter_sf_firing_dates: returns the SF cron's calendar firing dates
  (last N Saturdays for saturday_sf, last N Mon-Fri for weekday_sf /
  eod_sf). No change from prior calendar-naive behavior at this layer.
- _resolve_axis_dates: NEW. Translates firing dates to the date-axis
  the template actually uses. {date} → calendar firing date; {trading_day}
  → previous_trading_day(firing_date) for saturday_sf/weekday_sf, or
  firing_date itself for eod_sf (EOD writes today's close).
  alpha_engine_lib.trading_calendar.previous_trading_day IS NYSE-
  holiday-aware, so Memorial-Day-style holidays resolve correctly.
- _iter_historical_cycle_dates: now takes optional template arg and
  composes the two helpers. Backward-compat: callers omitting template
  get calendar-axis (pre-PR behavior).
- _handle_historical: passes spec.s3_key_template to the resolver.
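
A rough sketch of the axis translation described in the list above. Two assumptions are made loudly here: `previous_weekday` is a weekend-only stand-in for the holiday-aware alpha_engine_lib.trading_calendar.previous_trading_day, and `resolve_axis_dates` with this signature is illustrative rather than the module's real helper.

```python
from datetime import date, timedelta


def previous_weekday(d: date) -> date:
    """Weekend-only stand-in for a holiday-aware previous_trading_day."""
    d -= timedelta(days=1)
    while d.weekday() >= 5:  # skip Saturday/Sunday
        d -= timedelta(days=1)
    return d


def resolve_axis_dates(
    firing_dates: list[date], cadence: str, template: str = ""
) -> list[date]:
    """Map SF firing dates onto the date axis the template uses."""
    if "{trading_day}" not in template:
        return list(firing_dates)  # {date} or no template: calendar axis
    if cadence == "eod_sf":
        return list(firing_dates)  # EOD writes the firing day's own close
    return [previous_weekday(d) for d in firing_dates]
```

Unlike the library call, this stand-in would map Tuesday 2026-05-26 to Monday 2026-05-25 (Memorial Day), which is exactly the case the holiday-aware calendar exists to handle.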

6 new unit tests cover the trading_day-axis path for all three
cadences + NYSE-holiday skip via the lib + backward-compat for the
template-less call signature. Full suite: 27 passed (21 prior + 6
new).

Live smoke (post-deploy + historical invoke after companion
alpha-engine-config registry flip):
  research_signals: 9/12 gaps (under {date}) → 5/12 gaps (under
  {trading_day}). 4 cycles correctly recovered.

Composes with alpha-engine-config registry PR (this branch's
sibling): flips research_signals, research_consolidated_morning,
scanner_candidates_json from {date} → {trading_day}. 4 backtest_*
entries documented + held at {date} pending producer-side audit
(backtester writes to ad-hoc current-date-of-write, neither {date}
nor {trading_day}).

Per the system's now_dual() convention codified in
alpha-engine-docs/private/DATE_CONVENTIONS.md — trading_day =
last_closed_trading_day(now); this PR brings the registry +
historical probe into compliance with that convention.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request May 28, 2026
…342)

* feat(freshness-monitor): template-aware trading-day-axis resolution for historical probe

* refactor(email): delegate SMTP+SES dispatch to alpha_engine_lib.email_sender (L4356)

`emailer.py::send_step_email` now builds subject/body/HTML and delegates
the Gmail SMTP + SES fallback dispatch to
`alpha_engine_lib.email_sender.send_email` (the L4356 chokepoint).

Drops the local 50-line dispatch boilerplate + the `smtplib`/`boto3`
imports. Same secret-resolution semantics
(`EMAIL_SENDER`/`EMAIL_RECIPIENTS`/`GMAIL_APP_PASSWORD`/`AWS_REGION`),
same Gmail-primary-SES-fallback ordering, same never-raises contract.

ROADMAP: **L4356 part 1/5** — sibling PRs follow in
alpha-engine-backtester, alpha-engine, alpha-engine-research, and
alpha-engine-predictor.

Suite: 1675 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request May 28, 2026
)

deploy.sh --bootstrap was tripping on the put-targets shorthand when
the Input contains JSON. From PR #339 the historical cron's target
carries Input={"mode":"historical"}; the shorthand form
Id=N,Arn=...,Input={...} confuses the AWS CLI's shorthand parser on
the embedded quotes + comma.

Caught live 2026-05-28 when a --bootstrap re-run only partially completed:
  ParamValidation: Error parsing parameter '--targets': Expected:
  '=', received: '"' for input

Switch to a temp-file JSON spec via file://. The shorthand for the
original 15-min cron stays unchanged because that target has no Input
payload.
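
For illustration, the same temp-file idea expressed in Python rather than the shell that deploy.sh uses. `write_targets_spec` is a hypothetical helper and the ARN passed to it is the caller's; the returned URI is what would be handed to `aws events put-targets --targets`.

```python
import json
import tempfile


def write_targets_spec(target_arn: str, target_id: str = "1") -> str:
    """Write an EventBridge targets spec to a temp file; return a file:// URI."""
    targets = [{
        "Id": target_id,
        "Arn": target_arn,
        # Input must be a JSON *string*, so the payload is encoded twice:
        # once here, and once more when the whole spec is dumped below.
        "Input": json.dumps({"mode": "historical"}),
    }]
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as fh:
        json.dump(targets, fh)
        return f"file://{fh.name}"
```

Because the CLI reads the file as JSON, the quotes and commas inside Input never reach the shorthand parser.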

The historical EB cron is wired live in AWS — target attached
manually via direct CLI during this session with the same JSON.
First daily firing at 04:00 UTC. This commit ensures the next
operator who runs --bootstrap from main doesn't trip the same trap.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>