Fetch macro tickers daily via polygon → FRED → yfinance #34
Merged
The daily pipeline was never wired to update the ArcticDB macro library. macro_keys in builders/daily_append.py Section 5 always hit closes.get(key) is None, since _run_daily() passed only S&P constituents to daily_closes.collect(). Macro series (SPY/VIX/TNX/IRX/GLD/USO + XL* sector ETFs) only refreshed on Saturday Phase 1, which left them 4–10+ days stale by mid-week and tripped PredictorPreflight on 2026-04-15.

Changes:
- weekly_collector._run_daily: extend tickers with MACRO_DAILY_TICKERS (ETFs + ^-prefix indices) before daily_closes.collect.
- collectors/daily_closes: insert FRED fallback between polygon and yfinance for the 4 index tickers not on polygon free tier (VIX, VIX3M, TNX, IRX). FRED series VIXCLS/VXVCLS/DGS10/DTB3 publish in the same scale as yfinance ^-prefix tickers (raw level for VIX/VIX3M, percent for TNX/IRX), so downstream feature scaling is unchanged.

Source priority per ticker: polygon grouped-daily (ETFs) → FRED (indices) → yfinance (whatever remains). Each source logs a miss count so failures are observable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
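The source priority described above can be sketched as a simple fallback chain. This is a minimal illustration, not the code in collectors/daily_closes: the function name `collect_with_fallback`, the fetcher signature, the `^`-prefixed key spelling in `FRED_SERIES`, and the three `fake_*` fetchers are all assumptions made for the example. Only the source order and the ticker-to-FRED-series pairing come from the commit message.

```python
import logging
from typing import Callable, Dict, List, Optional, Sequence, Tuple

logger = logging.getLogger("daily_closes_sketch")

# FRED series ids named in the commit message, keyed by index ticker.
# The "^" key spelling is an assumption for this sketch.
FRED_SERIES = {"^VIX": "VIXCLS", "^VIX3M": "VXVCLS", "^TNX": "DGS10", "^IRX": "DTB3"}

# A fetcher receives the still-missing tickers and returns {ticker: close or None}.
Fetcher = Callable[[List[str]], Dict[str, Optional[float]]]


def collect_with_fallback(
    tickers: Sequence[str], sources: Sequence[Tuple[str, Fetcher]]
) -> Tuple[Dict[str, float], List[str]]:
    """Try each source in priority order, passing on only what is still missing."""
    closes: Dict[str, float] = {}
    remaining = list(tickers)
    for name, fetch in sources:
        if not remaining:
            break
        got = {t: v for t, v in fetch(remaining).items() if v is not None}
        closes.update(got)
        remaining = [t for t in remaining if t not in got]
        # One log line per source keeps captures and misses observable.
        logger.info("%s: captured=%d missed=%d", name, len(got), len(remaining))
    return closes, remaining


# Hypothetical stand-ins for the real polygon / FRED / yfinance clients.
def fake_polygon(ts):
    return {t: 100.0 for t in ts if not t.startswith("^")}


def fake_fred(ts):
    return {t: 20.0 for t in ts if t in FRED_SERIES}


def fake_yfinance(ts):
    return {t: 1.0 for t in ts}


closes, missed = collect_with_fallback(
    ["SPY", "^VIX", "^GSPC"],
    [("polygon", fake_polygon), ("fred", fake_fred), ("yfinance", fake_yfinance)],
)
```

Passing only the remaining tickers to each later source means the lower-priority sources are never asked for data an earlier source already supplied.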
cipher813 added a commit that referenced this pull request on May 9, 2026
Saturday SF DataPhase1 PARTIAL run on 2026-05-09 fired no per-failure Flow Doctor alert. The arcticdb backfill regression was stored in the result dict but never logged at ERROR level. Only main()'s generic "Weekly collection finished with non-ok status=partial" summary fires, which produces a single dedup signature across every partial run and contains no actual error text for Flow Doctor's LLM diagnose pipeline.

Fix: _finalize() now calls alpha_engine_lib.collector_results.report_collector_errors(), which emits one logger.error() per error-status entry with the collector name + original message. Each emitted record carries a distinct dedup signature, restoring per-failure alert granularity.

- Pin alpha-engine-lib v0.5.1 → v0.6.2 (helper landed in lib PR #34)
- Wire call inside _finalize after status computation, before manifest write. It fires for every code path that finalizes (phase 1, phase 2, daily, morning enrich) and runs even if postflight raises afterward
- New wiring test test_collector_error_visibility.py pins the call so a future refactor can't silently drop it

552 unit tests pass locally.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
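The per-entry logging idea can be illustrated with a short sketch. This is not the implementation of alpha_engine_lib.collector_results.report_collector_errors(); the function name `report_errors_sketch` and the assumed result shape (`{"collector": {"status": ..., "error": ...}}`) are hypothetical and chosen only to show why one record per failure gives each failure its own message text.

```python
import logging

logger = logging.getLogger("collector_errors_sketch")


def report_errors_sketch(results: dict) -> int:
    """Emit one ERROR record per collector entry whose status is 'error'.

    Assumed (hypothetical) shape: {name: {"status": str, "error": str}}.
    Returns the number of records emitted.
    """
    emitted = 0
    for name, entry in sorted(results.items()):
        if isinstance(entry, dict) and entry.get("status") == "error":
            # Collector name and original message both land in the record,
            # so two different failures never collapse into one summary line.
            logger.error("collector %s failed: %s", name, entry.get("error", "<no message>"))
            emitted += 1
    return emitted
```

A single summary line such as "status=partial" is identical on every partial run, whereas the records above differ whenever the failing collector or its message differs.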
cipher813 added a commit that referenced this pull request on May 27, 2026
…58) (#327)

Closes the mocked-test scope-shape gap for the three external services alpha-engine-data depends on. Mirrors the morning-signal #34 pattern shipped 2026-05-26 for the same class of bug: unit tests mock the external client, so payload-shape drift (field renames, schema deprecations, status-code semantics) is invisible to CI until production fires.

Each smoke is its own paths-filtered workflow + skip-on-no-credentials script, so PRs that don't touch the relevant module skip the workflow entirely and forks without secrets get a clean skip rather than a failing CI status.

Smokes:
- Polygon: get_grouped_daily for the most recent US weekday; asserts every bar carries the {open, high, low, close, volume, vwap} keys the consumer reads. ~$0.01/run, gated on POLYGON_API_KEY.
- FRED: fetch_fred_history("DGS2", period_years=1); asserts >=50 observations and "value" column present. Free tier, gated on FRED_API_KEY.
- ArcticDB: read tail of SPY from universe library; asserts the canonical OHLCV_COLS + PROVENANCE_COL schema. Read-only, no writes. Gated on OIDC role assumption (github-actions-lambda-deploy).

IAM grant: adds two scoped Statements to infrastructure/iam/github-actions-lambda-deploy.json:
- ArcticDBSmokeReadObject: s3:GetObject on arcticdb/* (read-only)
- ArcticDBSmokeListBucket: s3:ListBucket with prefix condition

Operator-step on merge: `./infrastructure/iam/apply.sh github-actions-lambda-deploy` to push the new policy to AWS. The IAM drift check will fail until that runs.

Secrets to add in GHA repo settings: POLYGON_API_KEY, FRED_API_KEY.

Composes with morning-signal #34, alpha-engine-lib #78 (anthropic_payload chokepoint), and the L258 P0-retrospective entry in ROADMAP.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
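The skip-on-no-credentials shape plus the payload-key assertion for the Polygon smoke might look roughly like the sketch below. The function names `missing_bar_keys` and `run_smoke`, the injected `fetch_bars` callable, and the exit-code convention are assumptions for illustration; the real smoke scripts are not shown in this PR. Only the required key set and the POLYGON_API_KEY gate come from the commit message.

```python
import os
import sys

# Keys the commit message says every grouped-daily bar must carry.
REQUIRED_BAR_KEYS = {"open", "high", "low", "close", "volume", "vwap"}


def missing_bar_keys(bars):
    """Return {bar index: sorted missing keys} for bars that lack required keys."""
    problems = {}
    for i, bar in enumerate(bars):
        missing = REQUIRED_BAR_KEYS - set(bar)
        if missing:
            problems[i] = sorted(missing)
    return problems


def run_smoke(fetch_bars, env=os.environ):
    """Return a process exit code: 0 on pass or clean skip, 1 on shape drift.

    fetch_bars is a hypothetical zero-argument callable wrapping the live client.
    """
    if not env.get("POLYGON_API_KEY"):
        # Forks without secrets get a clean skip instead of a red CI status.
        print("SKIP: POLYGON_API_KEY not set")
        return 0
    problems = missing_bar_keys(fetch_bars())
    if problems:
        print(f"FAIL: bars missing keys: {problems}", file=sys.stderr)
        return 1
    print("OK")
    return 0
```

Because the check runs against a live response rather than a mock, a renamed or dropped field surfaces in CI instead of in production.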
cipher813 added a commit that referenced this pull request on May 27, 2026
…342 PR 2) (#330)

Migrates infrastructure/spot_data_weekly.sh off SSH+SCP onto the alpha-engine-lib v0.35.0+ `ssm_dispatcher` chokepoint (`python -m alpha_engine_lib.ssm_dispatcher run`). Closes the (i) alive-SSH-path finding from the 2026-05-24 audit; PR 2 of the 5-PR ROADMAP L342 arc.

Transport changes:
- Wait-for-SSH loop → wait-for-SSM-Online (`aws ssm describe-instance-information` polling, 180s budget, mirrors predictor #168 pattern)
- `run_remote "..."` (ssh-based) → `run_ssm "<desc>" <timeout> <<HEREDOC` (lib CLI via --script-stdin)
- SCP config upload → S3 staging: dispatcher uploads alpha-engine-config/data/config.yaml to tmp/spot_data_weekly/<run_id>/config.yaml; spot pulls via existing alpha-engine-executor-profile IAM role's s3:GetObject grant
- REMOTE_PYTHON captured via SSH → PYTHON_BIN resolved inline per SSM step (`command -v python3.12 || command -v python3`)
- KEY_FILE / SSH_OPTS removed; KEY_NAME kept ONLY as launch attribute for alpha_engine_lib.ec2_spot's --key-name flag (break-glass operator SSH only; port-22 SG revoke is PR 5 of the arc)

Why pipe the heredoc via --script-stdin instead of mirroring predictor's inline `"$(cat <<HEREDOC ... HEREDOC)"` pattern: the data path's RAG-secrets block contains `aws ssm get-parameter --query 'Parameter.Value' ...` inside `$(...)`. The outer command-substitution scanner sees the inner single quotes and breaks. The lib CLI's --script-stdin flag reads the body verbatim, so the dispatcher's bash parser never scans the inner script for quote/paren balance. Future PRs adopting the lib chokepoint (ssm_dispatcher) should prefer --script-stdin for any non-trivial spot-side script body.

Cleanup: dispatcher trap also removes the S3 staging prefix on EXIT (belt-and-suspenders; S3 lifecycle on tmp/ is the authoritative purger).

CI guards (tests/test_spot_data_weekly_ssm_transport.py, 8 new tests):
- test_spot_data_weekly_script_exists: script presence
- test_no_top_level_ssh_invocation: no `ssh -X` / `ssh ` outside comments
- test_no_top_level_scp_invocation: no `scp -X` / `scp ` outside comments
- test_no_ssh_keyscan_invocation: no reintroduction of `ssh-keyscan github.com`
- test_uses_lib_ssm_dispatcher_chokepoint: `alpha_engine_lib.ssm_dispatcher` present (catches a regression that replaces it with `aws ssm send-command`)
- test_no_inline_aws_ssm_send_command: no direct `aws ssm send-command` (the predictor #168 pre-lift pattern L342 explicitly lifts to lib)
- test_stages_config_via_s3: `aws s3 cp ... config.yaml` present
- test_no_residual_key_file_dispatch_use: no $KEY_FILE / $SSH_OPTS in non-comment lines (KEY_NAME stays as launch attribute, allow-listed)

Test fixture updates (no behavior change):
- test_spot_env_source_aws_region.py: multi-line `read -r -d '' ENV_SOURCE <<'ENV_EOF' ... ENV_EOF` recognized in addition to the single-line `ENV_SOURCE="..."` shape. The semantic invariant (ENV_SOURCE exports AWS_REGION + AWS_DEFAULT_REGION) is unchanged.
- test_preflight_only_dry_path.py: accept `run_ssm "workloads"` as the workloads opener in addition to the pre-migration `run_remote bash -s <<WORKLOADS`.

Suite: 1618 → 1626 passed (+8).

Operator notes:
- The spot's IAM profile (alpha-engine-executor-profile) already grants AmazonSSMManagedInstanceCore via the predictor #168 migration (lib pin v0.35.0+ ships in same profile); no IAM changes needed here.
- Saturday SF first exercise: next Saturday SF firing (alpha-engine-saturday) runs MorningEnrich / DataPhase1 / RAGIngestion through the new transport. If any step fails for transport-shape reasons, recover via SF redrive (operator-launched, NOT cron) which invokes the same script.
- Port-22 inbound on sg-03cd3c4bd91e610b0 stays open until PR 5 (post-soak revoke). Manual operator SSH via key file remains as break-glass only.
- PRs 3-4 will follow this pattern: alpha-engine-backtester spot_backtest.sh + alpha-engine-predictor spot_train.sh (predictor's existing inline run_ssm bash helper is what the arc exists to replace at the chokepoint level).

Composes with morning-signal #34 (lib chokepoint adoption precedent), alpha-engine-lib v0.35.0 (ssm_dispatcher module), and [[feedback_lift_invariants_to_chokepoint_after_second_recurrence]] (this is the second adopter: predictor was first, backtester will be third).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
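A guard in the spirit of test_no_top_level_ssh_invocation and test_no_top_level_scp_invocation could be sketched as a plain text scan. The helper name `transport_violations` and the regular expression are assumptions; the actual tests in tests/test_spot_data_weekly_ssm_transport.py are not reproduced here. The sketch skips only whole-line comments, so an `ssh ` mention inside a trailing inline comment would still be flagged.

```python
import re

# Matches a bare `ssh ` or `scp ` command word. The lookbehind rejects matches
# glued to a preceding word character or hyphen (for example "my-ssh ").
# `ssh-keyscan` is not matched because no whitespace follows "ssh".
_SSH_OR_SCP = re.compile(r"(?<![\w-])(?:ssh|scp)\s")


def transport_violations(script_text):
    """Return [(line number, stripped line)] for non-comment lines invoking ssh/scp."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and whole-line comments are allowed
        if _SSH_OR_SCP.search(stripped):
            hits.append((lineno, stripped))
    return hits
```

A pytest wrapper would read infrastructure/spot_data_weekly.sh and assert that the returned list is empty, which fails loudly with the offending line numbers if an SSH call is reintroduced.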
Summary
- docs/incident-2026-04-15-macro-stale.md: daily pipeline was never wired to update the ArcticDB macro library
- _run_daily() now passes macro ETFs + ^-prefix indices to daily_closes.collect(); builders/daily_append.py Section 5 then writes them to the ArcticDB macro library every weekday
- FRED series VIXCLS/VXVCLS/DGS10/DTB3 publish in the same units as yfinance ^-prefix tickers, so no downstream feature changes

Source priority per ticker
Each source logs a captured/missed count so failures are observable.
Test plan
ae-dashboard "cd ~/alpha-engine-data && git pull"

🤖 Generated with Claude Code