feat(freshness): Phase 5 PR 8 — Artifact Freshness console page + System Health KPI strip #134
Merged
Phase 5 PR 8 (final PR in the arc, modulo Phase 6 soak + cutover) of
the artifact-freshness-monitor arc (plan doc:
~/Development/alpha-engine-docs/private/artifact-freshness-monitor-260527.md).
Closes the operator-surface dimension of the arc — Phase 3's
freshness-monitor Lambda writes the artifacts; this PR is the
consumer.
Changes:
- pages/26_Artifact_Freshness.py (new) — dedicated console page for
per-artifact red/yellow/green at a glance. Reads:
* s3://alpha-engine-research/_freshness_monitor/heartbeat.json
(Lambda self-heartbeat — last run, aggregate counts, alerts_enabled)
* s3://alpha-engine-research/_freshness_monitor/check_results.json
(per-spec rows: state, last-modified, SLA-breach minutes, reason)
Surface:
* Top KPI strip — last-run age + per-state counts + mode
(OBSERVE-only vs alerts-live).
* OBSERVE-mode banner when alerts_enabled=false, with the
cutover command spelled out.
* Filters by owner_repo / cadence / severity / state.
* Sortable table with color-coded state badges
(probe_failed > missing > stale > grace > fresh; severity-bumped
within state).
* Operator runbook in an expander — common causes for probe_failed
/ missing, plus the force-invoke recipe.
- pages/4_System_Health.py — new "Artifact Freshness Monitor" section
at top of page (right under page caption). Same heartbeat-derived
KPI strip + a link to the dedicated /Artifact_Freshness page for
drill-down. Gracefully no-ops when heartbeat is absent (Lambda
hasn't deployed yet, or live registry is empty).
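
The table ordering described above (worst state first, severity breaking ties within a state) could be sketched as a plain sort key. This is an illustration only, not the page's actual code: the `severity` labels are hypothetical, and the state strings are taken from the ordering listed above, so the real values in `check_results.json` may differ.

```python
from typing import Any, Mapping

# Worst-first state order as listed in the PR description. The exact state
# strings in check_results.json may differ (for example "grace_period").
STATE_RANK = {"probe_failed": 0, "missing": 1, "stale": 2, "grace": 3, "fresh": 4}

# Hypothetical severity labels; a lower rank sorts earlier within a state.
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


def sort_key(row: Mapping[str, Any]) -> tuple[int, int]:
    """Order rows by state first, then by severity within the same state."""
    # Unknown states or severities sort after every known value.
    state = STATE_RANK.get(row.get("state"), len(STATE_RANK))
    severity = SEVERITY_RANK.get(row.get("severity"), len(SEVERITY_RANK))
    return (state, severity)


rows = [
    {"id": "a", "state": "fresh", "severity": "critical"},
    {"id": "b", "state": "missing", "severity": "info"},
    {"id": "c", "state": "missing", "severity": "critical"},
    {"id": "d", "state": "probe_failed", "severity": "warning"},
]
ordered = [r["id"] for r in sorted(rows, key=sort_key)]
print(ordered)
```

Keeping the ranks in dictionaries means a new state or severity only needs one new entry, and unrecognised values fall to the bottom rather than raising.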
Companion to:
- alpha-engine-lib #83 (merged) — substrate (ArtifactSpec, check_freshness,
resolve_dedup_key) at v0.40.0
- alpha-engine-config #344 (merged) — registry SoT (48 entries, 27
grandfathered prefixes) + PR-time validator
- alpha-engine-data #335 (merged) — freshness-monitor Lambda + EB cron
- alpha-engine-data #336 (merged) — producer-side CI guard
- alpha-engine-research #243, alpha-engine-predictor #204,
alpha-engine-backtester #256 (open) — producer-side CI guards
(Phase 4 cascade complete)
Phase 6 cutover (operator-driven, no PR): ≥2 weekly cycles in OBSERVE
mode → env-var flip via
`aws lambda update-function-configuration --environment
'Variables={MNEMON_FRESHNESS_MONITOR_ENABLED=true,...}'` —
mirrors the mnemon 0.7.0rc4 pattern from 2026-05-24.
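
One caveat with the env-var flip: `--environment` replaces the function's whole variable map, so any variable left out of `Variables={...}` is dropped. A read-modify-write sketch in Python, assuming boto3 and AWS credentials are available, might look like the following. The helper names are hypothetical, and the accepted flag values (`"true"` / `"false"`) are an assumption based on the command above.

```python
from typing import Mapping

FLAG = "MNEMON_FRESHNESS_MONITOR_ENABLED"


def merged_environment(current: Mapping[str, str], enabled: bool) -> dict[str, str]:
    """Return a copy of the current variables with only the flag changed."""
    merged = dict(current)
    merged[FLAG] = "true" if enabled else "false"
    return merged


def flip_flag(function_name: str, enabled: bool) -> None:
    """Read-modify-write the Lambda environment (needs boto3 + credentials)."""
    import boto3  # imported lazily so the pure helper above has no dependencies

    client = boto3.client("lambda")
    config = client.get_function_configuration(FunctionName=function_name)
    current = config.get("Environment", {}).get("Variables", {})
    client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": merged_environment(current, enabled)},
    )


# The pure helper can be checked without touching AWS.
preview = merged_environment({"LOG_LEVEL": "INFO"}, enabled=True)
print(preview)
```

Calling `flip_flag("<function-name>", True)` would then preserve every existing variable while flipping the flag.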
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deploying nousergon-marketing with
| Latest commit: | f63cd4b |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://29d1fe8c.nousergon-marketing.pages.dev |
| Branch Preview URL: | https://feat-artifact-freshness-page.nousergon-marketing.pages.dev |
cipher813 added a commit that referenced this pull request on May 28, 2026
… PyArrow-backed null columns (#136)

Page 26 crashed with `TypeError: fromisoformat: argument must be str` once the freshness-monitor Phase 6 bootstrap landed live data this afternoon. Stack pointed at `filtered["last_modified"].apply(_format_age)` (line 208).

Root cause: pandas reads `check_results.json`'s mixed null/string `last_modified` column with the PyArrow backend, which represents JSON nulls as `pd.NA` rather than Python `None`. The function's `if not iso_ts:` truthiness check doesn't reliably bail on `pd.NA` for every dtype path, AND `datetime.fromisoformat(pd.NA)` raises `TypeError` rather than `ValueError`, so the existing `except ValueError` doesn't catch it and the page crashes.

This was a latent bug since page 26 shipped (alpha-engine-dashboard PR #134, 2026-05-27 follow-on session), but it didn't surface until this afternoon's Phase 6 bootstrap landed live `check_results.json` with 49/51 null `last_modified` values (grace_period entries that hadn't been probed yet on the cold-start cycle).

Fix: explicit `isinstance(iso_ts, str)` type-check at function entry + broaden the except clause to `(ValueError, TypeError)`. Tested against all input shapes: pd.NA / None / empty str / valid ISO / garbage str / datetime object; all behave correctly.

Per [[feedback_observe_mode_unconditional_gates_govern_cutover]] this is exactly the bug class the freshness-monitor arc exists to catch structurally: surfaced at first real load, not at deploy time.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
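
The fix described in this commit message (type-check at entry, broader except clause) can be sketched roughly as below. This is not the actual `_format_age` from page 26: the `"n/a"` placeholder, the `m`/`h`/`d` buckets, and treating naive timestamps as UTC are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def format_age(iso_ts: object, now: Optional[datetime] = None) -> str:
    """Render an ISO-8601 timestamp as a coarse age, or "n/a" when unusable."""
    # Type-check first: pd.NA, None, NaN and datetime objects all bail out here
    # instead of reaching fromisoformat, which raises TypeError on non-strings.
    if not isinstance(iso_ts, str) or not iso_ts:
        return "n/a"
    try:
        # Older Python versions reject a trailing "Z", so normalise it.
        parsed = datetime.fromisoformat(iso_ts.replace("Z", "+00:00"))
    except (ValueError, TypeError):
        return "n/a"
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    now = now or datetime.now(timezone.utc)
    minutes = max(0, int((now - parsed).total_seconds() // 60))
    if minutes < 60:
        return f"{minutes}m"
    if minutes < 24 * 60:
        return f"{minutes // 60}h"
    return f"{minutes // (24 * 60)}d"


fixed_now = datetime(2026, 5, 28, 12, 0, tzinfo=timezone.utc)
samples = [None, "", "garbage", fixed_now, "2026-05-28T11:30:00+00:00"]
print([format_age(s, fixed_now) for s in samples])
```

The `isinstance` guard is what makes the function independent of how pandas represents nulls, and the wider `except` is a second line of defence for anything that slips past it.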
cipher813 added a commit that referenced this pull request on May 28, 2026
Surfaces the per-cycle history for each artifact, reading the new
_freshness_monitor/history.json written daily at 04:00 UTC by the
freshness-monitor Lambda's historical mode (alpha-engine-data PR #339).
Closes the gap surfaced 2026-05-28: 'are there gaps in the producer's
history?' — operators want to know not just current-cycle state but
whether last weekend / last month had silent absences.
Changes:
- _load_history loader (TTL 300s — refreshes once/day, not 15min)
- New History (12wk) column on the main table:
✅ N/N continuous — clean history
⚠️ G/N gaps — gappy producer
✅ exists (latest) — latest-pointer present
❌ absent (latest) — latest-pointer missing
— — historical probe hasn't covered this id yet
(continuous-cadence artifacts skip historical mode)
- New 'Per-artifact history drill-down' section below the main
table. Each artifact in the filtered view gets an expander
showing the per-cycle sequence (date / present / size /
last_modified / error_code). Sort: gappy first, continuous last;
latest-pointer absent at top, latest-pointer present at bottom.
First 3 worst-offender entries auto-expand.
- Graceful-degrade: if history.json doesn't exist yet, page shows
a single info box explaining the daily cron + manual-invoke
instructions.
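
The History (12wk) cell values listed above could be derived from a per-cycle presence sequence along these lines. The input shape (a list of booleans plus a `latest_pointer` flag) is hypothetical; the actual layout of `history.json` is not shown in this PR.

```python
from typing import Optional, Sequence

# The dash shown for ids the historical probe has not covered yet.
NOT_COVERED = "\u2014"


def history_label(present: Optional[Sequence[bool]], latest_pointer: bool = False) -> str:
    """Summarise per-cycle presence flags as a History-column cell."""
    if not present:
        return NOT_COVERED
    if latest_pointer:
        # Latest-pointer artifacts have a single current object, so only the
        # most recent probe result matters.
        return "✅ exists (latest)" if present[-1] else "❌ absent (latest)"
    total = len(present)
    gaps = sum(1 for p in present if not p)
    if gaps == 0:
        return f"✅ {total}/{total} continuous"
    return f"⚠️ {gaps}/{total} gaps"


print(history_label([True] * 12))
print(history_label([True, False, True, False]))
print(history_label([False], latest_pointer=True))
print(history_label(None))
```

Because the probe is calendar-naive, a `False` entry here may simply be a market holiday, so a gap count is a prompt to look at the drill-down rather than proof of a missed run.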
Operator caveat: calendar-naive. NYSE holidays may render as
false-positive ❌ absent cells. Calendar-aware probe is a future
enhancement (P3 in the Lambda PR).
Composes with alpha-engine-data PR #339 (historical-mode Lambda)
+ the prior page 26/27 work in #134/#135/#136/#137.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Phase 5 PR 8 (final code PR in the arc; Phase 6 is operator-driven soak + env-var cutover) of the artifact-freshness-monitor arc (plan doc at ~/Development/alpha-engine-docs/private/artifact-freshness-monitor-260527.md). Closes the operator-surface dimension: Phase 3's freshness-monitor Lambda writes the artifacts; this PR is the console consumer.

Changes

- `pages/26_Artifact_Freshness.py` (new): dedicated console page. Reads:
  * `s3://alpha-engine-research/_freshness_monitor/heartbeat.json` (Lambda self-heartbeat: last run, aggregate counts, alerts_enabled)
  * `s3://alpha-engine-research/_freshness_monitor/check_results.json` (per-spec rows: state, last-modified, SLA-breach minutes, reason)
  Surface:
  * Top KPI strip: last-run age + per-state counts + mode (OBSERVE-only vs alerts-live)
  * OBSERVE-mode banner when `alerts_enabled=false`, with the cutover command spelled out
  * Filters by `owner_repo` / `cadence` / `severity` / `state`
  * Sortable table with color-coded state badges
  * Operator runbook in an expander: common causes for `probe_failed` / `missing`, plus the force-invoke recipe
- `pages/4_System_Health.py`: new "Artifact Freshness Monitor" section at top of page. Same heartbeat-derived KPI strip + a link to `/Artifact_Freshness` for drill-down. Gracefully no-ops when heartbeat is absent (Lambda hasn't deployed yet, or registry is empty).

Arc-wide status (post-merge of this PR)

- `artifact_freshness` substrate v0.40.0

Phase 6 cutover (operator-driven, no PR): ≥2 weekly cycles in OBSERVE mode (earliest cutover ~2026-06-13; more realistically ~2026-06-20). Cutover via:

    aws lambda update-function-configuration \
      --function-name alpha-engine-freshness-monitor \
      --environment 'Variables={LOG_LEVEL=INFO,MNEMON_FRESHNESS_MONITOR_ENABLED=true}'

Mirrors the mnemon 0.7.0rc4 pattern from 2026-05-24: env-var flip without redeploy.

Test plan

- `python3 -c "import ast; ast.parse(open('pages/26_Artifact_Freshness.py').read())"`: syntax OK
- `python3 -c "import ast; ast.parse(open('pages/4_System_Health.py').read())"`: syntax OK
- `/Artifact_Freshness` after first cron firing; verify KPI strip + table render with real data
- `/System_Health`; verify new Artifact Freshness Monitor section appears at top

Deploy

Manual code-only deploy after merge:

    ae-dashboard "sudo systemctl start boot-pull && sudo systemctl restart dashboard && sudo systemctl restart nous-ergon-public"

Per CLAUDE.md `## Dashboard` section: `boot-pull.sh` only auto-restarts services whose `.service` unit file changed, so code-only PRs need explicit restarts on both `dashboard` (console) and `nous-ergon-public` services.

🤖 Generated with Claude Code