feat(research): wire factor-profiles producer into the graph (un-orphan compute_and_write_factor_profiles) by cipher813 · Pull Request #203 · cipher813/alpha-engine-research

cipher813 · 2026-05-18T23:31:29Z

What

Splice a new compute_factor_profiles_node into graph/research_graph.py so that the previously orphaned producer scoring.factor_scoring.compute_and_write_factor_profiles actually runs in every Saturday SF research run. Until now the producer had zero production callers (test-only), so s3://alpha-engine-research/factors/ is empty in prod.

Splice point + why

fetch_data → load_regime_substrate_node → macro_economist_node
  → compute_factor_profiles_node → compute_focus_list_node → dispatch_sectors_and_exit → … → score_aggregator

Spliced on the macro_economist_node → compute_focus_list_node edge (one re-route: that edge becomes macro_economist_node → compute_factor_profiles_node, plus a new compute_factor_profiles_node → compute_focus_list_node edge):

The producer needs sector_map + run_date, both populated in fetch_data and not mutated by load_regime_substrate_node / macro_economist_node.
It must write factors/profiles/{run_date}/by_ticker.json + latest.json before both consumers' existing read_factor_profiles_from_s3(): compute_focus_list_node (~:1198) and score_aggregator (~:1322, downstream of the dispatch that hangs off compute_focus_list_node). This one splice satisfies both.
The fetch_data → load_regime_substrate_node edge was not used because the Stage-C serial chain is a pinned topology invariant (tests/test_regime_stage_b_graph_topology.py). The macro→focus-list edge is the cleanest splice preserving every pinned edge and the regime / macro / focus-list / dispatch chain + conditional edges.
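The re-route described above can be sketched on a plain adjacency map. This is only an illustration under stated assumptions: the node names come from this PR, but `splice_node`, `serial_order`, and the dict representation are hypothetical helpers, not the graph API that graph/research_graph.py actually uses.

```python
from typing import Dict, List

# Hypothetical adjacency map; the real graph object in research_graph.py is not shown here.
Edges = Dict[str, List[str]]


def splice_node(edges: Edges, src: str, dst: str, new: str) -> Edges:
    """Return a copy of `edges` in which src -> dst becomes src -> new -> dst."""
    if dst not in edges.get(src, []):
        raise ValueError(f"no edge {src} -> {dst} to splice on")
    out = {node: list(targets) for node, targets in edges.items()}
    # Re-route the one existing edge, then add the single new edge.
    out[src] = [new if target == dst else target for target in out[src]]
    out.setdefault(new, []).append(dst)
    return out


def serial_order(edges: Edges, start: str) -> List[str]:
    """Follow the first successor of each node from `start` (acyclic chains only)."""
    order, node = [start], start
    while edges.get(node):
        node = edges[node][0]
        order.append(node)
    return order


# Serial prefix of the graph as listed in this PR (dispatch and later nodes omitted).
before: Edges = {
    "fetch_data": ["load_regime_substrate_node"],
    "load_regime_substrate_node": ["macro_economist_node"],
    "macro_economist_node": ["compute_focus_list_node"],
    "compute_focus_list_node": [],
}

after = splice_node(
    before,
    "macro_economist_node",
    "compute_focus_list_node",
    "compute_factor_profiles_node",
)
print(" → ".join(serial_order(after, "fetch_data")))
```

The two pinned edges (fetch_data → load_regime_substrate_node and load_regime_substrate_node → macro_economist_node) are untouched in `after`, which is the property the splice-point choice is meant to preserve.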

No SF/infra change required — it is a plain in-graph serial node.

Graceful-degrade

The node wraps the producer so that any failure (missing run_date, missing or short features/{run_date}/*.parquet, S3 error, compute exception) is caught and logged where flow-doctor can see it (logger.warning/logger.error), and the node returns cleanly with {"factor_profiles_written": False, "factor_profiles_s3_key": ""}. The graph continues; the consumers degrade exactly as they do today when the substrate is absent (they already do `if not factor_profiles: skip`), which is no worse than the prior orphaned state. The weekly research run is never hard-failed on this new dependency. Profiles are not threaded through state: consumers read from S3 by design, and only a small observability delta (factor_profiles_written / factor_profiles_s3_key) flows.
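A minimal sketch of the catch/log/continue wrapper described above. The two delta keys are taken from this PR; everything else is an assumption made for illustration: the factory `make_factor_profiles_node`, the producer signature `(run_date, sector_map) -> S3 key`, the success values of the delta, and the stub producers.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("factor_profiles_node_sketch")

State = Dict[str, Any]


def make_factor_profiles_node(producer: Callable[..., str]) -> Callable[[State], State]:
    """Wrap `producer` so the node never raises (assumed producer signature)."""

    def node(state: State) -> State:
        degraded = {"factor_profiles_written": False, "factor_profiles_s3_key": ""}
        run_date = state.get("run_date")
        if not run_date:
            logger.warning("factor profiles skipped: run_date missing from state")
            return degraded
        try:
            key = producer(run_date=run_date, sector_map=state.get("sector_map", {}))
        except Exception:
            # Any producer failure is logged with its traceback and swallowed.
            logger.error("factor profiles producer failed", exc_info=True)
            return degraded
        # Success values are assumed; only the key names appear in the PR.
        return {"factor_profiles_written": True, "factor_profiles_s3_key": key}

    return node


# Stub producers standing in for compute_and_write_factor_profiles.
def _ok_producer(run_date: str, sector_map: Dict[str, str]) -> str:
    return f"factors/profiles/{run_date}/by_ticker.json"


def _failing_producer(run_date: str, sector_map: Dict[str, str]) -> str:
    raise RuntimeError("simulated S3 error")


state = {"run_date": "2026-05-16", "sector_map": {"AAPL": "Technology"}}  # illustrative values
ok_delta = make_factor_profiles_node(_ok_producer)(state)
failed_delta = make_factor_profiles_node(_failing_producer)(state)
no_date_delta = make_factor_profiles_node(_ok_producer)({"sector_map": {}})
```

Only the small delta is returned; the profiles themselves stay in S3, matching the "not threaded through state" design stated above.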

Behavior-safety

No flag is flipped. config.FACTOR_BLEND_ENABLED and config.FOCUS_LIST_GATING_ENABLED stay default-false. This is substrate-only: it makes the factor substrate exist/ready and lets the focus-list shadow audit populate scanner_evaluations.focus_*. No scoring or agent behavior changes.

Closes / unblocks

Closes ROADMAP P1 "Wire the orphaned factor-profiles producer into the Saturday SF".
Unblocks the FOCUS_LIST P0's real gate — its shadow audit now sees a populated factor substrate each run.

Tests

tests/test_factor_profiles_node.py (new): (a) node calls compute_and_write_factor_profiles with the state's run_date + sector_map and returns the observability delta on success; (b) producer exception / missing run_date → node logs + returns cleanly, no raise; (c) static-AST graph-wiring assertions (mirroring test_regime_stage_b_graph_topology.py) that the node is registered, runs after fetch_data via the macro chain, and strictly before compute_focus_list_node AND score_aggregator, without altering the sector dispatch.
tests/test_dry_run.py: fixed a pre-existing order-dependent test-isolation bug surfaced (not caused) by the new test file. TestGraphModuleGuard.test_skips_late_bound_patches_when_graph_absent and TestInstallRestore setup/teardown evicted/replaced the real sys.modules["graph.research_graph"] without restoring it; a later re-import created a second module object so other test modules' collection-time-bound _build_signals_payload no longer saw their monkeypatch.setattr("graph.research_graph.<FLAG>", …) (the exact leak the test_regime_stage_b_graph_topology.py docstring documents and previously only "sidestepped" via filename ordering). Both sites now snapshot + restore the real module.
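The static-AST wiring assertions in (c) could look roughly like the sketch below. It parses source text without importing it, collects edges, and checks ordering by reachability. The `graph.add_edge("src", "dst")` call shape and the inline `GRAPH_SOURCE` stand-in are assumptions; the real tests read graph/research_graph.py and may match a different pattern.

```python
import ast
from collections import defaultdict
from typing import Dict, Set

# Hypothetical stand-in for graph/research_graph.py. The dispatch fan-out between
# compute_focus_list_node and score_aggregator is collapsed to one edge for brevity.
GRAPH_SOURCE = '''
graph.add_edge("fetch_data", "load_regime_substrate_node")
graph.add_edge("load_regime_substrate_node", "macro_economist_node")
graph.add_edge("macro_economist_node", "compute_factor_profiles_node")
graph.add_edge("compute_factor_profiles_node", "compute_focus_list_node")
graph.add_edge("compute_focus_list_node", "score_aggregator")
'''


def static_edges(source: str) -> Dict[str, Set[str]]:
    """Collect (src, dst) pairs from <obj>.add_edge("src", "dst") calls via the AST."""
    edges: Dict[str, Set[str]] = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "add_edge"
            and len(node.args) == 2
            and all(isinstance(a, ast.Constant) and isinstance(a.value, str) for a in node.args)
        ):
            src, dst = (a.value for a in node.args)
            edges[src].add(dst)
    return edges


def reachable(edges: Dict[str, Set[str]], start: str) -> Set[str]:
    """Return every node reachable from `start` (excluding `start` unless on a cycle)."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


edges = static_edges(GRAPH_SOURCE)
downstream = reachable(edges, "compute_factor_profiles_node")
# The new node must sit upstream of both consumers and downstream of fetch_data.
assert {"compute_focus_list_node", "score_aggregator"} <= downstream
assert "compute_factor_profiles_node" in reachable(edges, "fetch_data")
```

Working on the AST keeps the check free of import-time side effects, which matters here because importing the graph module is exactly what the test-isolation bug in tests/test_dry_run.py interfered with.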

Full suite: 1366 passed (~/Development/alpha-engine-research/.venv/bin/python -m pytest -q).
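For the tests/test_dry_run.py fix, the snapshot + restore idea can be sketched as a context manager. This is an assumed shape, not the code in the PR: `preserved_module` is a hypothetical helper, and the module name below is a throwaway stand-in for "graph.research_graph".

```python
import sys
import types
from contextlib import contextmanager
from typing import Iterator

_MISSING = object()  # sentinel: the module was not in sys.modules at snapshot time


@contextmanager
def preserved_module(name: str) -> Iterator[None]:
    """Snapshot sys.modules[name] and put the very same object back on exit."""
    original = sys.modules.get(name, _MISSING)
    try:
        yield
    finally:
        if original is _MISSING:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = original


NAME = "graph_research_graph_sketch"  # hypothetical stand-in module name
real = types.ModuleType(NAME)
sys.modules[NAME] = real

with preserved_module(NAME):
    # What the old setup/teardown did: evict the real module and install a fake.
    sys.modules.pop(NAME)
    sys.modules[NAME] = types.ModuleType(NAME)

# Identity matters: functions bound at collection time hold a reference to the
# original module object, so monkeypatch.setattr must land on that same object.
restored_same_object = sys.modules[NAME] is real
```

Restoring by identity rather than re-importing is the point: a re-import would create a second module object, which is the leak described above.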

🤖 Generated with Claude Code

…an compute_and_write_factor_profiles)

Splice a new `compute_factor_profiles_node` into graph/research_graph.py between `macro_economist_node` and `compute_focus_list_node`:

fetch_data → load_regime_substrate_node → macro_economist_node
  → compute_factor_profiles_node → compute_focus_list_node → dispatch …

Splice point + why:

- The producer `scoring.factor_scoring.compute_and_write_factor_profiles` needs `sector_map` + `run_date`, both populated in `fetch_data` and NOT mutated by load_regime_substrate_node or macro_economist_node.
- It must land `factors/profiles/{run_date}/by_ticker.json` + `latest.json` in S3 BEFORE both consumers do their existing `read_factor_profiles_from_s3()`: `compute_focus_list_node` (~:1198) and `score_aggregator` (~:1322, downstream of the dispatch off compute_focus_list_node). Splicing on the macro→compute_focus_list_node edge satisfies both with one re-route.
- This edge was chosen over `fetch_data → load_regime_substrate_node` because the Stage-C serial chain (fetch_data → substrate loader → macro) is a pinned topology invariant (tests/test_regime_stage_b_graph_topology.py); the macro→focus-list edge is the cleanest splice that preserves every pinned edge and the regime / macro / focus-list / dispatch chain + conditional edges.

Graceful-degrade: the node wraps the producer so ANY failure (missing run_date, missing/short `features/{run_date}/*.parquet`, S3 error, compute exception) is caught, logged flow-doctor-visibly (warning/error), and the node returns cleanly ({"factor_profiles_written": False, "factor_profiles_s3_key": ""}) so the graph continues. The consumers then degrade exactly as they do today when the substrate is absent (they already `if not factor_profiles: skip`), i.e. no worse than the prior orphaned state. The weekly research run is never hard-failed on this new dependency.

Profiles are NOT threaded through state: consumers read from S3 by design; only a small observability delta flows (`factor_profiles_written` / `factor_profiles_s3_key`).

Behavior-safety: NO flag is flipped. `config.FACTOR_BLEND_ENABLED` and `config.FOCUS_LIST_GATING_ENABLED` stay default-false. This is substrate-only: it makes `s3://alpha-engine-research/factors/` exist (it is empty in prod today since the producer was orphaned / test-only) and lets the focus-list shadow audit populate `scanner_evaluations.focus_*`. No scoring/agent behavior changes.

Closes ROADMAP P1 "Wire the orphaned factor-profiles producer into the Saturday SF" and unblocks the FOCUS_LIST P0's real gate (its shadow audit now sees a populated factor substrate each run).

Tests:

- tests/test_factor_profiles_node.py (new): (a) node calls compute_and_write_factor_profiles with the state's run_date + sector_map and returns the observability delta on success; (b) producer exception / missing run_date → node logs + returns cleanly, no raise (graph continues); (c) static-AST graph-wiring assertions (mirroring test_regime_stage_b_graph_topology.py) that the node is registered, runs after fetch_data via the macro chain, and strictly before compute_focus_list_node AND score_aggregator, without altering the sector dispatch.
- tests/test_dry_run.py: fixed a pre-existing order-dependent test-isolation bug surfaced (not caused) by the new test file. `TestGraphModuleGuard.test_skips_late_bound_patches_when_graph_absent` and `TestInstallRestore` setup/teardown evicted/replaced the real `sys.modules["graph.research_graph"]` WITHOUT restoring it; a later re-import created a second module object so other test modules' collection-time-bound `_build_signals_payload` no longer saw their `monkeypatch.setattr("graph.research_graph.<FLAG>", …)` (the leak the test_regime_stage_b_graph_topology.py docstring documents). Now both snapshot + restore the real module.

Full suite: 1366 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ollow-up to #203) (#204)

#203 wired the producer with graceful-degrade (catch/log/continue). Per Brian + feedback_no_silent_fails: that recreates the exact orphaned-producer silent-failure class this wiring exists to fix. A failing producer would log a warning nobody reads while focus-list + factor-blend silently go inert again.

compute_factor_profiles_node now RAISES on any failure (missing run_date, producer exception) → the Research SF state fails loudly + alerts.

Not spuriously fragile: features/{run_date}/*.parquet is produced by DataPhase1 UPSTREAM in the same Saturday SF, so its absence is already an incident (DataPhase1 should have failed). This surfaces real breakage and never fails a healthy run. Matches the system's fail-loud norm (DataPhase2 populated-ratio gate; optimizer PR5 empty-order-book-not-legacy-fallback).

Still substrate-only: no flag flipped, no scoring change. Docstring + 2 tests flipped graceful-return → pytest.raises. Suite 1366 passed.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cipher813 merged commit b4f60f3 into main May 18, 2026
1 check passed

cipher813 deleted the feat/wire-factor-profiles-producer branch May 18, 2026 23:35

cipher813 mentioned this pull request May 18, 2026

fix(research): factor-profiles node HARD-FAILS — no silent degrade (follow-up to #203) #204

Merged

cipher813 merged 1 commit into main from feat/wire-factor-profiles-producer
