fix(features): label technical features as polygon (not yfinance) #157

Closed
cipher813 wants to merge 1 commit into main from fix/feature-registry-polygon-source

Conversation

@cipher813
Owner

Summary

The Feature Catalog page on the dashboard was showing 32 technical features as source="yfinance" — the pre-T+1 label. Per the 2026-04-24 polygon migration (PRs #90/#91 split EOD/morning by source), the morning enrichment overwrites the prior-day yfinance close with polygon grouped-daily (+ VWAP) at ~5:30 AM PT every weekday. Polygon is the stable canonical source for daily OHLCV-derived features; yfinance is the same-day EOD fallback.
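The overwrite-with-fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the function name `morning_enrich`, the row shape, and the `source` tag are all assumptions made for the example.

```python
from typing import Dict, Mapping

Row = Dict[str, object]


def morning_enrich(
    eod_rows: Mapping[str, Row],
    polygon_rows: Mapping[str, Row],
) -> Dict[str, Row]:
    """Overwrite prior-day EOD rows with polygon rows where available.

    Hypothetical helper: both inputs map ticker -> row of price fields.
    Tickers missing from the polygon batch keep their yfinance row.
    Tickers present only in the polygon batch are ignored in this sketch.
    """
    enriched: Dict[str, Row] = {}
    for ticker, eod_row in eod_rows.items():
        polygon_row = polygon_rows.get(ticker)
        if polygon_row is not None:
            # Polygon grouped-daily value replaces the yfinance close.
            enriched[ticker] = {**polygon_row, "source": "polygon"}
        else:
            # Fallback: keep the same-day EOD value from yfinance.
            enriched[ticker] = {**eod_row, "source": "yfinance"}
    return enriched


# Illustrative data only; tickers and prices are made up.
eod = {"AAA": {"close": 10.0}, "BBB": {"close": 20.0}}
polygon = {"AAA": {"close": 10.1, "vwap": 10.05}}
result = morning_enrich(eod, polygon)
```

Here `result["AAA"]` carries the polygon close and VWAP, while `result["BBB"]` keeps the yfinance close because polygon returned nothing for it.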

What changed

Flips 32 technical entries (original 26 + 6 v3.1 additions) from source="yfinance" to source="polygon".

Also widens the FeatureEntry.source docstring comment to include polygon.
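As a rough picture of the change, the sketch below models a registry entry and the relabel. Only the `FeatureEntry` name, its `source` field, and the allowed source values come from this PR; the other fields, the `relabel_technical` helper, the `rsi_14` feature name, and the sample `source` values are assumptions for illustration.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class FeatureEntry:
    name: str
    group: str   # assumed field: "technical" | "macro" | "alternative"
    source: str  # polygon | yfinance | fmp | computed


def relabel_technical(
    entries: List[FeatureEntry], old: str = "yfinance", new: str = "polygon"
) -> List[FeatureEntry]:
    """Return a copy of the registry with technical entries relabelled."""
    return [
        replace(e, source=new) if e.group == "technical" and e.source == old else e
        for e in entries
    ]


# Illustrative registry; "rsi_14" is a hypothetical technical feature name.
REGISTRY = [
    FeatureEntry("rsi_14", "technical", "yfinance"),
    FeatureEntry("vix_level", "macro", "yfinance"),
    FeatureEntry("put_call_ratio", "alternative", "yfinance"),
]
updated = relabel_technical(REGISTRY)
```

Only the technical entry changes; macro and alternative entries pass through untouched, matching the scope described in the next section.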

What's intentionally unchanged

  • Macro features (vix_level, yield_10y, yield_curve_slope, gold_mom_5d, oil_mom_5d, vix_term_slope) — daily_closes.py:61 notes both yfinance (^VIX, ^TNX) and FRED (VIXCLS, DGS10) publish these. The macro collector uses FRED as canonical with yfinance for index ETFs (GLD, USO). Worth a follow-up labeling pass to split FRED-canonical vs ETF-yfinance.
  • Alternative options features (put_call_ratio, iv_rank, iv_vs_rv) — yfinance is correct (polygon free tier doesn't expose options chains).

Downstream effect

The dashboard /Feature_Store page reads features/registry.json from S3, which is regenerated weekly by upload_registry(). Once the next weekly run lands, the catalog will show the corrected labels — no dashboard change needed.

Test plan

  • Registry parses and generate_registry_json() still produces valid JSON
  • Next weekly run uploads the updated registry; Feature Store catalog displays "polygon" for technical features
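The first test-plan item could be checked with something like the sketch below. It does not call the real `generate_registry_json()`; `generate_registry_json_sketch` is a stand-in with an assumed output shape (`{"features": [...]}`), and `check_registry` is a hypothetical validator.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

ALLOWED_SOURCES = {"polygon", "yfinance", "fmp", "computed"}


@dataclass(frozen=True)
class FeatureEntry:
    name: str
    source: str  # polygon | yfinance | fmp | computed


def generate_registry_json_sketch(entries: List[FeatureEntry]) -> str:
    """Stand-in serializer; the real output shape may differ."""
    return json.dumps({"features": [asdict(e) for e in entries]}, sort_keys=True)


def check_registry(payload: str) -> List[str]:
    """Parse the payload and return a list of problems (empty means OK)."""
    doc = json.loads(payload)  # raises json.JSONDecodeError on invalid JSON
    problems = []
    for feature in doc["features"]:
        if feature["source"] not in ALLOWED_SOURCES:
            problems.append(f"{feature['name']}: unknown source {feature['source']!r}")
    return problems


# "rsi_14" is a hypothetical feature name used only for this example.
payload = generate_registry_json_sketch(
    [FeatureEntry("rsi_14", "polygon"), FeatureEntry("put_call_ratio", "yfinance")]
)
problems = check_registry(payload)
bad_payload = generate_registry_json_sketch([FeatureEntry("rsi_14", "polgon")])
bad_problems = check_registry(bad_payload)
```

A misspelled label such as `"polgon"` shows up in `bad_problems`, which is the kind of regression a widened source set makes easier to introduce.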

The Feature Catalog page on the dashboard was showing 32 technical
features as `source="yfinance"`, which is the pre-T+1 label. Per the
2026-04-24 polygon migration (data PRs #90/#91 split EOD/morning by
source), the morning enrichment overwrites the prior-day yfinance
close with the polygon grouped-daily price (with VWAP) every weekday
at ~5:30 AM PT. The stable value the predictor consumes for daily
OHLCV-derived features is polygon, with yfinance as the same-day EOD
fallback.

Flips 32 technical entries (the original 26 + 6 v3.1 additions) from
yfinance → polygon. Macro and alternative entries are intentionally
unchanged for now:

  • Macro (vix_level, yield_10y, yield_curve_slope, gold_mom_5d,
    oil_mom_5d, vix_term_slope) — daily_closes.py:61 notes both
    yfinance (^VIX, ^TNX) and FRED (VIXCLS, DGS10) publish these.
    Per the macro collector, FRED is the canonical path with a
    yfinance fallback for index ETFs (GLD, USO). Worth a follow-up
    pass to label by canonical FRED source vs ETF-yfinance source.
  • Alternative options features (put_call_ratio, iv_rank, iv_vs_rv)
    — yfinance is correct (polygon free tier doesn't expose options
    chains).

The dashboard `/Feature_Store` page will pick up the corrected labels
on the next weekly registry regeneration via
`features/registry.py:upload_registry()`.

Also widens the FeatureEntry source docstring comment from
`yfinance | fmp | computed` to `polygon | yfinance | fmp | computed`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813
Owner Author

Closing without merging. Per Decision 11 (source from runtime measurement, not static intent labels), the source field in registry.py is conceptually wrong as a per-feature static descriptor: it can drift from runtime ingestion reality (e.g., when polygon returns a 403 and yfinance fills in as the fallback, the static label still says polygon but the data is yfinance).

Replacing one stale label with another doesn't fix the underlying issue. The right move is upstream-side: drop the static source column from the dashboard's catalog rendering and add a runtime-attribution panel that reads data_manifest/*.json (already carries per-source ingestion counts) for the latest-snapshot truth. That work lives in alpha-engine-dashboard, not here.
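A runtime-attribution panel along these lines might compute its numbers as sketched below. The manifest schema is an assumption: each `data_manifest/*.json` file is taken to hold an ISO `date` string and a `source_counts` mapping; the real files may be shaped differently, and both function names are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict


def load_latest_manifest(manifest_dir: Path) -> dict:
    """Return the manifest with the greatest ISO 'date' (assumed field)."""
    manifests = [json.loads(p.read_text()) for p in sorted(manifest_dir.glob("*.json"))]
    if not manifests:
        raise FileNotFoundError(f"no manifests under {manifest_dir}")
    # ISO-8601 dates sort correctly as plain strings.
    return max(manifests, key=lambda m: m["date"])


def source_shares(manifest: dict) -> Dict[str, float]:
    """Turn per-source ingestion counts into fractions of the snapshot."""
    counts = manifest["source_counts"]  # assumed key
    total = sum(counts.values())
    if total == 0:
        return {source: 0.0 for source in counts}
    return {source: n / total for source, n in counts.items()}
```

With a latest manifest of `{"polygon": 3, "yfinance": 1}`, the panel would report 75% polygon and 25% yfinance for that snapshot, which reflects a fallback day that a static label cannot show.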

Branch fix/feature-registry-polygon-source stays around if we ever decide to keep source as an explicit "canonical/expected source" label alongside runtime attribution.

@cipher813 cipher813 closed this May 5, 2026
@cipher813 cipher813 deleted the fix/feature-registry-polygon-source branch May 18, 2026 15:33