Fetch macro tickers daily via polygon → FRED → yfinance #34

Merged
cipher813 merged 1 commit into main from fix/daily-macro-fetch
Apr 15, 2026

@cipher813
Owner

Summary

  • Fixes the P1 incident documented in docs/incident-2026-04-15-macro-stale.md: the daily pipeline was never wired to update the ArcticDB macro library
  • _run_daily() now passes macro ETFs + ^-prefix indices to daily_closes.collect(); builders/daily_append.py Section 5 then writes them to the ArcticDB macro library every weekday
  • Adds a FRED fallback layer between polygon and yfinance for the 4 index tickers (VIX, VIX3M, TNX, IRX) not available on polygon free tier
  • Scale validated: FRED VIXCLS/VXVCLS/DGS10/DTB3 publish in the same units as yfinance ^-prefix tickers, so no downstream feature changes

Source priority per ticker

  1. polygon.io grouped-daily — ETFs (SPY, GLD, USO, XL*)
  2. FRED — indices (VIX, VIX3M, TNX, IRX)
  3. yfinance — whatever remains

Each source logs a captured/missed count so failures are observable.
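
The priority order above can be sketched as a simple fallback chain. This is a minimal illustration, not the code in collectors/daily_closes: the `Fetcher` signature, the function name `collect_with_fallback`, and the logger name are all assumptions made for the example.

```python
import logging
from typing import Callable, Dict, Iterable, List, Tuple

logger = logging.getLogger("daily_closes_sketch")

# Hypothetical fetcher shape: given tickers, return {ticker: close} for
# whatever the source could capture. The real collectors may differ.
Fetcher = Callable[[List[str]], Dict[str, float]]


def collect_with_fallback(
    tickers: Iterable[str],
    sources: List[Tuple[str, Fetcher]],
) -> Dict[str, float]:
    """Try each source in priority order, passing only still-missing tickers."""
    remaining = list(dict.fromkeys(tickers))  # de-duplicate, keep order
    closes: Dict[str, float] = {}
    for name, fetch in sources:
        if not remaining:
            break
        try:
            got = fetch(remaining)
        except Exception:
            # A failing source should not abort the chain; log and fall through.
            logger.exception("%s fetch failed", name)
            got = {}
        captured = {
            t: v for t, v in got.items() if t in remaining and v is not None
        }
        closes.update(captured)
        remaining = [t for t in remaining if t not in captured]
        # "missed" is the number of tickers still unresolved after this source.
        logger.info(
            "%s: captured=%d missed=%d", name, len(captured), len(remaining)
        )
    return closes
```

Passing `[("polygon", fetch_polygon), ("fred", fetch_fred), ("yfinance", fetch_yfinance)]` (hypothetical fetchers) reproduces the order listed above; each later source only sees tickers the earlier ones missed.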

Test plan

  • Unit tests pass (61/61)
  • FRED smoke test captures all 4 indices with expected scales (VIX=18.36, TNX=4.3, IRX=3.62, VIX3M=20.82)
  • Post-merge: ae-dashboard "cd ~/alpha-engine-data && git pull"
  • Re-run today's weekday Step Function
  • Verify ArcticDB macro/SPY + VIX last dates match run date
  • PredictorInference Lambda completes without preflight RuntimeError

🤖 Generated with Claude Code

The daily pipeline was never wired to update the ArcticDB macro library.
macro_keys in builders/daily_append.py Section 5 always hit closes.get(key)
is None, since _run_daily() passed only S&P constituents to
daily_closes.collect(). Macro series (SPY/VIX/TNX/IRX/GLD/USO + XL* sector
ETFs) only refreshed on Saturday Phase 1, which left them 4–10+ days stale
by mid-week and tripped PredictorPreflight on 2026-04-15.
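
To make the staleness arithmetic concrete, here is a rough sketch of a freshness check of the kind a preflight might run. It is not the PredictorPreflight implementation: the function names, the `max_stale` threshold, and the weekday-only calendar (market holidays are ignored) are assumptions for illustration.

```python
from datetime import date, timedelta
from typing import Dict


def business_days_between(last: date, today: date) -> int:
    """Count weekdays strictly after `last`, up to and including `today`."""
    days = 0
    d = last
    while d < today:
        d += timedelta(days=1)
        if d.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
    return days


def check_fresh(
    last_dates: Dict[str, date], today: date, max_stale: int = 1
) -> None:
    """Raise RuntimeError if any series lags `today` by more than `max_stale`."""
    stale = {
        key: business_days_between(last, today)
        for key, last in last_dates.items()
    }
    stale = {key: n for key, n in stale.items() if n > max_stale}
    if stale:
        raise RuntimeError(f"stale macro series (business days): {stale}")
```

With a last write dated Friday 2026-04-10 and a run on Wednesday 2026-04-15, `business_days_between` counts three missed weekdays, so any threshold tighter than that would raise.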

Changes:
- weekly_collector._run_daily: extend tickers with MACRO_DAILY_TICKERS
  (ETFs + ^-prefix indices) before daily_closes.collect.
- collectors/daily_closes: insert FRED fallback between polygon and yfinance
  for the 4 index tickers not on polygon free tier (VIX, VIX3M, TNX, IRX).
  FRED series VIXCLS/VXVCLS/DGS10/DTB3 publish in the same scale as yfinance
  ^-prefix tickers (raw level for VIX/VIX3M, percent for TNX/IRX), so
  downstream feature scaling is unchanged.

Source priority per ticker: polygon grouped-daily (ETFs) → FRED (indices) →
yfinance (whatever remains). Each source logs a miss count so failures are
observable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit 7f7564c into main Apr 15, 2026
1 check passed
@cipher813 cipher813 deleted the fix/daily-macro-fetch branch April 15, 2026 15:14
cipher813 added a commit that referenced this pull request May 9, 2026
Saturday SF DataPhase1 PARTIAL run on 2026-05-09 fired no per-failure
Flow Doctor alert. The arcticdb backfill regression was stored in the
result dict but never logged at ERROR level — only main()'s generic
"Weekly collection finished with non-ok status=partial" summary fires,
which produces a single dedup signature across every partial run and
contains no actual error text for Flow Doctor's LLM diagnose pipeline.

Fix: _finalize() now calls
alpha_engine_lib.collector_results.report_collector_errors(), which
emits one logger.error() per error-status entry with the collector name
+ original message. Each emitted record carries a distinct dedup
signature, restoring per-failure alert granularity.
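
A minimal sketch of the per-entry reporting idea follows. It is not the alpha-engine-lib helper: the result-dict shape (`{collector: {"status": ..., "error": ...}}`), the function name, and the message format are assumptions.

```python
import logging
from typing import Any, Mapping

logger = logging.getLogger("collector_results_sketch")


def report_collector_errors_sketch(
    results: Mapping[str, Mapping[str, Any]],
) -> int:
    """Emit one ERROR record per collector whose status is 'error'.

    Returns the number of records emitted. Because each record names the
    collector and carries its original message, distinct failures produce
    distinct log lines rather than one shared summary.
    """
    emitted = 0
    for name, entry in results.items():
        if entry.get("status") != "error":
            continue
        logger.error(
            "collector %s failed: %s", name, entry.get("error", "<no message>")
        )
        emitted += 1
    return emitted
```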

- Pin alpha-engine-lib v0.5.1 → v0.6.2 (helper landed in lib PR #34)
- Wire call inside _finalize after status computation, before manifest
  write — fires for every code path that finalizes (phase 1, phase 2,
  daily, morning enrich) and runs even if postflight raises afterward
- New wiring test test_collector_error_visibility.py pins the call so
  a future refactor can't silently drop it

552 unit tests pass locally.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request May 27, 2026
…58) (#327)

Closes the mocked-test scope-shape gap for the three external services
alpha-engine-data depends on. Mirrors the morning-signal #34 pattern
shipped 2026-05-26 for the same class of bug: unit tests mock the
external client, so payload-shape drift (field renames, schema
deprecations, status-code semantics) is invisible to CI until
production fires.

Each smoke is its own paths-filtered workflow + skip-on-no-credentials
script, so PRs that don't touch the relevant module skip the workflow
entirely and forks without secrets get a clean skip rather than a
failing CI status.

Smokes:
- Polygon: get_grouped_daily for the most recent US weekday; asserts
  every bar carries the {open, high, low, close, volume, vwap} keys
  the consumer reads. ~$0.01/run, gated on POLYGON_API_KEY.
- FRED: fetch_fred_history("DGS2", period_years=1); asserts >=50
  observations and "value" column present. Free tier, gated on
  FRED_API_KEY.
- ArcticDB: read tail of SPY from universe library; asserts the
  canonical OHLCV_COLS + PROVENANCE_COL schema. Read-only, no writes.
  Gated on OIDC role assumption (github-actions-lambda-deploy).
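
The skip-on-no-credentials shape can be sketched roughly as below, using the Polygon bar-key assertion as the example. The `run_smoke` and `missing_keys` names and the injected `fetch_bars` callable are hypothetical; the real scripts call the actual clients.

```python
import os
from typing import Callable, Iterable, List, Mapping, Tuple

REQUIRED_BAR_KEYS = frozenset(
    {"open", "high", "low", "close", "volume", "vwap"}
)


def missing_keys(
    bars: Iterable[Mapping[str, object]],
) -> List[Tuple[int, List[str]]]:
    """Return (index, sorted missing keys) for every bar lacking a key."""
    problems = []
    for i, bar in enumerate(bars):
        missing = REQUIRED_BAR_KEYS - set(bar)
        if missing:
            problems.append((i, sorted(missing)))
    return problems


def run_smoke(
    fetch_bars: Callable[[str], Iterable[Mapping[str, object]]],
    env: Mapping[str, str] = os.environ,
) -> int:
    """Return an exit code: 0 on pass or skip, 1 on payload-shape drift."""
    api_key = env.get("POLYGON_API_KEY")
    if not api_key:
        # Forks without secrets get a clean skip, not a failing status.
        print("SKIP: POLYGON_API_KEY not set")
        return 0
    bars = list(fetch_bars(api_key))
    if not bars:
        print("FAIL: no bars returned")
        return 1
    problems = missing_keys(bars)
    if problems:
        print(f"FAIL: bars missing keys: {problems[:5]}")
        return 1
    print(f"OK: {len(bars)} bars carry the expected keys")
    return 0
```

A script would typically end with `sys.exit(run_smoke(real_fetcher))`, so a missing secret exits 0 and a renamed field exits 1.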

IAM grant: adds two scoped Statements to
infrastructure/iam/github-actions-lambda-deploy.json:
- ArcticDBSmokeReadObject: s3:GetObject on arcticdb/* (read-only)
- ArcticDBSmokeListBucket: s3:ListBucket with prefix condition

Operator-step on merge: `./infrastructure/iam/apply.sh
github-actions-lambda-deploy` to push the new policy to AWS. The
IAM drift check will fail until that runs.

Secrets to add in GHA repo settings: POLYGON_API_KEY, FRED_API_KEY.

Composes with morning-signal #34, alpha-engine-lib #78 (anthropic_payload
chokepoint), and the L258 P0-retrospective entry in ROADMAP.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request May 27, 2026
…342 PR 2) (#330)

Migrates infrastructure/spot_data_weekly.sh off SSH+SCP onto the
alpha-engine-lib v0.35.0+ `ssm_dispatcher` chokepoint
(`python -m alpha_engine_lib.ssm_dispatcher run`). Closes the (i)
alive-SSH-path finding from the 2026-05-24 audit; PR 2 of the 5-PR
ROADMAP L342 arc.

Transport changes:
- Wait-for-SSH loop → wait-for-SSM-Online (`aws ssm describe-instance-
  information` polling, 180s budget, mirrors predictor #168 pattern)
- `run_remote "..."` (ssh-based) → `run_ssm "<desc>" <timeout> <<HEREDOC`
  (lib CLI via --script-stdin)
- SCP config upload → S3 staging: dispatcher uploads
  alpha-engine-config/data/config.yaml to
  tmp/spot_data_weekly/<run_id>/config.yaml; spot pulls via existing
  alpha-engine-executor-profile IAM role's s3:GetObject grant
- REMOTE_PYTHON captured via SSH → PYTHON_BIN resolved inline per SSM
  step (`command -v python3.12 || command -v python3`)
- KEY_FILE / SSH_OPTS removed; KEY_NAME kept ONLY as launch attribute
  for alpha_engine_lib.ec2_spot's --key-name flag (break-glass operator
  SSH only — port-22 SG revoke is PR 5 of the arc)
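
The wait-for-SSM-Online step in the list above is a bounded polling loop. A sketch of that loop in Python, under stated assumptions: `probe` is a hypothetical callable standing in for whatever reads the instance's PingStatus (the script itself uses the AWS CLI), and the 5-second interval is illustrative; only the 180s budget comes from the description.

```python
import time
from typing import Callable


def wait_until_online(
    probe: Callable[[], str],
    budget_s: float = 180.0,
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns "Online" or the time budget is spent."""
    deadline = clock() + budget_s
    while True:
        if probe() == "Online":
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline, so the budget is a hard bound.
        sleep(min(interval_s, remaining))
```

Injecting `clock` and `sleep` keeps the loop testable without real waiting; production callers can rely on the defaults.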

Why pipe the heredoc via --script-stdin instead of mirroring predictor's inline
`"$(cat <<HEREDOC ... HEREDOC)"` pattern: the data path's RAG-secrets
block contains `aws ssm get-parameter --query 'Parameter.Value' ...`
inside `$(...)`. The outer command-substitution scanner sees the inner
single quotes and breaks. The lib CLI's --script-stdin flag reads the
body verbatim, so the dispatcher's bash parser never scans the inner
script for quote/paren balance. Future PRs adopting the lib chokepoint
(ssm_dispatcher) should prefer --script-stdin for any non-trivial
spot-side script body.
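
The same idea, delivering a script body on stdin so that no intermediate parser scans it, can be illustrated in Python. This is a sketch only: the receiver here is a stand-in child process that echoes stdin, not the lib CLI, and the script body is an invented example containing quotes and parentheses.

```python
import subprocess
import sys
from typing import List


def run_with_script_on_stdin(
    argv: List[str], script: str, timeout_s: float = 60.0
) -> subprocess.CompletedProcess:
    """Run `argv` with `script` on stdin; no shell ever parses the body."""
    return subprocess.run(
        argv,
        input=script,
        text=True,
        capture_output=True,
        timeout=timeout_s,
        check=True,
    )


# Stand-in receiver: echoes whatever arrives on stdin, so the example can
# show that the body is delivered byte-for-byte.
ECHO_STDIN = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read())",
]

# Illustrative body with nested quotes and parentheses inside $(...).
SCRIPT = """GREETING=$(printf '%s' "it's (fine)")
echo "$GREETING"
"""

if __name__ == "__main__":
    result = run_with_script_on_stdin(ECHO_STDIN, SCRIPT)
    print(result.stdout == SCRIPT)
```

Because `argv` is a list and the body travels as data, quote and parenthesis balance inside the script is irrelevant to the caller.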

Cleanup: dispatcher trap also removes the S3 staging prefix on EXIT
(belt-and-suspenders — S3 lifecycle on tmp/ is the authoritative
purger).

CI guards (tests/test_spot_data_weekly_ssm_transport.py, 8 new tests):
- test_spot_data_weekly_script_exists — script presence
- test_no_top_level_ssh_invocation — no `ssh -X` / `ssh ` outside comments
- test_no_top_level_scp_invocation — no `scp -X` / `scp ` outside comments
- test_no_ssh_keyscan_invocation — no `ssh-keyscan github.com` re-introduce
- test_uses_lib_ssm_dispatcher_chokepoint — `alpha_engine_lib.ssm_dispatcher`
  present (catches a regression that replaces it with `aws ssm send-command`)
- test_no_inline_aws_ssm_send_command — no direct `aws ssm send-command`
  (the predictor #168 pre-lift pattern L342 explicitly lifts to lib)
- test_stages_config_via_s3 — `aws s3 cp ... config.yaml` present
- test_no_residual_key_file_dispatch_use — no $KEY_FILE / $SSH_OPTS in
  non-comment lines (KEY_NAME stays as launch attribute, allow-listed)
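
A guard of this kind can be sketched as a text scan over the script. The helper names and the regular expression below are assumptions, not the contents of tests/test_spot_data_weekly_ssm_transport.py; the sketch only treats full-line `#` comments as comments.

```python
import re
from typing import List

# Match `ssh` or `scp` as a standalone command word followed by whitespace.
# The lookbehind rejects names like `openssh` or `wait-for-ssh`; requiring
# whitespace after the word rejects `ssh-keyscan`.
_SSH_CALL = re.compile(r"(?<![\w-])(ssh|scp)\s")


def non_comment_lines(script_text: str) -> List[str]:
    """Drop blank lines and full-line `#` comments (shebang included)."""
    return [
        line
        for line in script_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


def find_ssh_invocations(script_text: str) -> List[str]:
    """Return every non-comment line that appears to invoke ssh or scp."""
    return [
        line for line in non_comment_lines(script_text) if _SSH_CALL.search(line)
    ]
```

A test would read the script text and assert that `find_ssh_invocations(text)` returns an empty list, which fails as soon as a refactor reintroduces an SSH or SCP call.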

Test fixture updates (no behavior change):
- test_spot_env_source_aws_region.py — multi-line `read -r -d ''
  ENV_SOURCE <<'ENV_EOF' ... ENV_EOF` recognized in addition to the
  single-line `ENV_SOURCE="..."` shape. The semantic invariant
  (ENV_SOURCE exports AWS_REGION + AWS_DEFAULT_REGION) is unchanged.
- test_preflight_only_dry_path.py — accept `run_ssm "workloads"` as
  the workloads opener in addition to the pre-migration
  `run_remote bash -s <<WORKLOADS`.

Suite: 1618 → 1626 passed (+8).

Operator notes:
- The spot's IAM profile (alpha-engine-executor-profile) already
  grants AmazonSSMManagedInstanceCore via the predictor #168 migration
  (lib pin v0.35.0+ ships in same profile); no IAM changes needed
  here.
- Saturday SF first exercise: next Saturday SF firing
  (alpha-engine-saturday) runs MorningEnrich / DataPhase1 / RAGIngestion
  through the new transport. If any step fails for transport-shape
  reasons, recover via SF redrive (operator-launched, NOT cron) which
  invokes the same script.
- Port-22 inbound on sg-03cd3c4bd91e610b0 stays open until PR 5
  (post-soak revoke). Manual operator SSH via key file remains as
  break-glass only.
- PRs 3-4 will follow this pattern: alpha-engine-backtester
  spot_backtest.sh + alpha-engine-predictor spot_train.sh
  (predictor's existing inline run_ssm bash helper is what the arc
  exists to replace at the chokepoint level).

Composes with morning-signal #34 (lib chokepoint adoption precedent),
alpha-engine-lib v0.35.0 (ssm_dispatcher module), and
[[feedback_lift_invariants_to_chokepoint_after_second_recurrence]]
(this is the second adopter — predictor was first, backtester will be
third).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>