PR1b: feature compute price+macro → ArcticDB (universe+macro libs, slim fallback, parity) #268
Merged
… parity)

PR1b of the Wave-4 predictor/price_cache_slim deletion arc: the riskier consumer (feeds the ENTIRE feature-compute pipeline + _extract_macro). Consumes lib v0.20.0 (PR0a-2: load_macro_series).

features/compute._load_prices_and_macro's price source moves from a single load_slim_cache read to a composed ArcticDB read via the new _load_price_source() helper:
- universe lib (load_universe_ohlcv) -> equities + SPY
- macro lib (load_macro_series) -> VIX/VIX3M/TNX/IRX/GLD/USO + XL* sector ETFs (discovered via open_macro_lib().list_symbols(), filtered startswith XL; the heterogeneous non-price 'features' key is excluded by the explicit-symbols contract)
- union = the slim-cache equivalent that _extract_macro + the feature pipeline consume unchanged.

slim cache RETAINED as a whole-set fallback: feature compute cannot run blind (returns None only if BOTH sources fail -> empty, preserving the existing no-data contract).

SOTA observation: while both exist, every run dual-reads + emits reconcile as_metrics() JSON (grep WAVE4_PARITY_METRIC compute). require_ticker_match=False: slim legitimately carries symbols the universe lib does not, so set asymmetry is logged for visibility while passed reflects value fidelity over the overlap. Dual-read + slim removed in PR4.

Pin bump 0.19.0 -> 0.20.0 (requirements.txt + Dockerfile lockstep). No Dockerfile-extra change needed: features/compute.py is NOT Lambda-packaged (Dockerfile does not COPY features/); runs only via weekly_collector on EC2 spot, whose requirements already carry the [arcticdb] extra.

+5 tests (compose / slim fallback / parity emit / both-fail / empty); full data suite 1380 passing; Wave-4 anti-drift guard still holds (compute.py keeps slim fallback -> stays in WAVE4_INVENTORY until PR4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
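The composed read and whole-set fallback described in the commit message can be sketched roughly as below. This is an illustration under stated assumptions, not the code in features/compute.py: the loaders are injected as plain callables because the real signatures of load_universe_ohlcv, load_macro_series and load_slim_cache are not shown here, the names MACRO_KEYS, macro_symbols and load_price_source are hypothetical, and treating an empty result from either lib as a failure is an assumption.

```python
from typing import Callable, Dict, List, Optional

# symbol -> price frame (in practice probably a pandas DataFrame; left opaque here)
Frames = Dict[str, object]

# Macro keys named in the commit message; the constant name is hypothetical.
MACRO_KEYS = ["VIX", "VIX3M", "TNX", "IRX", "GLD", "USO"]


def macro_symbols(available: List[str]) -> List[str]:
    """Build the explicit symbol list: fixed macro keys plus discovered XL* ETFs.

    Anything else in the macro lib (e.g. a non-price 'features' key) is never
    requested, because only explicitly listed symbols are read.
    """
    sector_etfs = sorted(s for s in available if s.startswith("XL"))
    return MACRO_KEYS + sector_etfs


def load_price_source(
    load_universe: Callable[[], Frames],
    list_macro_symbols: Callable[[], List[str]],
    load_macro: Callable[[List[str]], Frames],
    load_slim: Callable[[], Optional[Frames]],
) -> Optional[Frames]:
    """Composed read with a whole-set fallback; None only if both sources fail."""
    try:
        universe = load_universe()
        macro = load_macro(macro_symbols(list_macro_symbols()))
        # Assumption: an empty result from either lib counts as a failed read.
        if universe and macro:
            return {**universe, **macro}
    except Exception:
        pass  # fall through to the slim cache as a whole set
    try:
        slim = load_slim()
    except Exception:
        slim = None
    return slim or None
```

Falling back to the slim cache as a whole, rather than patching individual missing symbols, keeps the returned dict internally consistent: every series in it comes from one source.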
The riskiest consumer migration of the Wave-4 arc (predictor/price_cache_slim/ deletion). Consumes lib v0.20.0 (PR0a-2 lib #51, merged: load_macro_series + shared read core).

Why this one is risky
features/compute._load_prices_and_macro feeds the entire feature-compute pipeline and _extract_macro. The slim cache historically carried equities + SPY + index/macro series (VIX/VIX3M/TNX/IRX/GLD/USO) + all XL* sector ETFs in one flat dict. Those tenants are split across two ArcticDB libs, so a naive load_universe_ohlcv swap would silently drop every macro/ETF series → _extract_macro near-empty → SPY/VIX/VIX3M-derived features degrade across the whole universe. (This is the audit finding that justified the separate PR + the PR0a-2 load_macro_series helper.)

Change: composed read
New _load_price_source():
- load_universe_ohlcv(bucket) → equities + SPY
- load_macro_series(bucket, syms) → _MACRO_SLIM_KEYS ∪ XL* (discovered via open_macro_lib().list_symbols(), startswith("XL"); the heterogeneous non-price features key is excluded by load_macro_series's explicit-symbols contract)
- union = the slim-cache equivalent that _extract_macro + the pipeline consume unchanged
- slim cache retained as a whole-set fallback: returns None (→ empty, existing no-data contract) only if both sources fail
- while both sources exist, every run dual-reads and emits reconcile(...).as_metrics() JSON (grep WAVE4_PARITY_METRIC compute). require_ticker_match=False because slim legitimately carries symbols the universe lib does not; set asymmetry is logged in the metric fields for visibility while passed reflects value fidelity over the overlap. Dual-read + slim removed in PR4.

Safety
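- features/compute.py is not Lambda-packaged: the Dockerfile doesn't COPY features/; it runs only via weekly_collector on EC2 spot, whose requirements.txt already carries [arcticdb,flow_doctor,rag].
- Pin bump 0.19.0 → 0.20.0 (requirements.txt + Dockerfile, lockstep guard enforced).
- +5 tests (compose / slim fallback / parity emit / both-fail / empty); full data suite 1380 passing; Wave-4 anti-drift guard still holds (compute.py keeps slim fallback → stays in WAVE4_INVENTORY until PR4).

Sequence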
This completes the PR1 (consumer migration) tier on the data side: both data-repo data-read consumers (macro.py PR1a #267 merged; features/compute.py here) now ArcticDB-primary + parity-observed. Next: PR2 (backtester exit_timing.py), PR3 (dashboard health_checker.py retire), PR4 (delete writer+prefix, gated on one clean Saturday-SF zero-diff ParityReport across the WAVE4_PARITY_METRIC streams).

🤖 Generated with Claude Code
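The dual-read parity observation (require_ticker_match=False: pass or fail on value fidelity over the overlap, with set asymmetry logged only for visibility) might look roughly like this. It is a minimal stand-in, not the lib's reconcile or ParityReport: parity_metrics, emit, the metric field names, the tolerance, and the exact log line layout are all assumptions made for illustration, and series are modelled as plain lists of floats.

```python
import json
import math
from typing import Dict, List


def parity_metrics(
    primary: Dict[str, List[float]],
    slim: Dict[str, List[float]],
    tol: float = 1e-9,
) -> dict:
    """Compare two symbol -> values maps on their shared symbols only."""
    overlap = sorted(set(primary) & set(slim))
    mismatched = [
        s
        for s in overlap
        if len(primary[s]) != len(slim[s])
        or any(
            not math.isclose(a, b, rel_tol=0.0, abs_tol=tol)
            for a, b in zip(primary[s], slim[s])
        )
    ]
    return {
        # Verdict depends only on values over the overlap.
        "passed": not mismatched,
        "overlap": len(overlap),
        "mismatched": mismatched,
        # Set asymmetry is reported but does not affect "passed".
        "only_in_primary": sorted(set(primary) - set(slim)),
        "only_in_slim": sorted(set(slim) - set(primary)),
    }


def emit(metrics: dict, stream: str = "compute") -> str:
    """Print one greppable line; the 'WAVE4_PARITY_METRIC <stream> <json>' layout is assumed."""
    line = f"WAVE4_PARITY_METRIC {stream} {json.dumps(metrics, sort_keys=True)}"
    print(line)
    return line
```

Note that with this shape an empty overlap passes vacuously, so a real gate would probably also want to look at the overlap count before treating a run as clean.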