feat(cost-telemetry): runaway-cost circuit breaker on news pipeline (Phase 4 #1)#309
Merged
Merged
Conversation
…Phase 4 #1) Mirrors alpha-engine-research's ``llm_cost_tracker.RunBudgetExceededError`` pattern at the news-pipeline cost site (Phase 0.2 wiring, PR #308): ``S3CostBuffer.record()`` now tracks cumulative cost across the run and raises ``CostBudgetExceededError`` after recording the offending row if the per-run total exceeds ``ALPHA_ENGINE_RUN_BUDGET_USD`` (default $100, shared env var with research + executor — one operator knob ceilings cost across all SF entry points). **Failure shape:** row is recorded BEFORE the raise so per-call detail is preserved on S3 when the breaker fires. The pipeline-side try/finally in ``run_news_pipeline.py`` ensures the buffer flushes even when the breaker raises mid-loop — rows up to and including the breach call land on S3 so operators can diagnose what broke the budget without re-running. **Posture:** breaker propagates through the client proxy (``_CostTrackingMessages.create``) — generic record errors still get swallowed (event extraction's primary deliverable must survive a malformed-response hiccup), but ``CostBudgetExceededError`` is explicitly re-raised since swallowing it would defeat the safety net. **Operator-facing fields on the error:** ``run_id``, ``agent_id``, ``cumulative_cost_usd``, ``ceiling_usd``. Message tells operator how to adjust the env var if the ceiling needs raising. **Tests:** 4 ``TestRunBudgetCeilingResolution`` (default / env / zero disables / malformed-returns-zero) + 5 ``TestCostBudgetBreaker`` (under-ceiling no raise, breach raises after recording row, zero disables, proxy propagates, ceiling defaults from env). Suite 1484 → 1493 passing, zero regressions. **Composes with** PR #308 (the Phase 0.2 wiring) — the breaker is a small additive surface on the buffer; the production wiring path unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
cipher813
added a commit
that referenced
this pull request
May 25, 2026
…or (#310) Applies the standing rule per ``[[preference_llm_calls_confined_to_research_module]]`` — LLM calls live in alpha-engine-research. The news pipeline's Haiku-backed event extractor is removed and replaced with a deterministic classifier that uses two zero-cost signals already on the wire: 1. **Vendor tags** (``NewsArticle.tags``). Polygon emits keywords, GDELT emits structured event codes, Benzinga emits Channels. The ``alpha_engine_lib.sources.protocols.NewsArticle.tags`` docstring explicitly names this as "a soft signal for downstream event-flag extraction" — we were paying Haiku to re-derive what Polygon / GDELT already tagged. 2. **Title-keyword regex**. Backstop for sources that don't populate tags (Yahoo RSS). 17 pattern → category mappings against the ``DEFAULT_EVENT_CATEGORIES`` closed taxonomy. **Why this is the right answer, not a kill-switch:** code audit found the Haiku per-article structured output was aggregated to 5 scalar / list columns (``event_count``, ``event_severity_max/mean``, ``event_categories``, ``top_event_descriptions``) before any research consumer touched it. The "zero-shot novel-event detection" capability was mostly wasted — research only sees per-ticker rollups. Tag-based + keyword-based classification produces equivalent rollups deterministically. **Cost impact:** retires the largest previously-untracked LLM cost slice in the system per the original Phase 0 audit estimate ($20–60/mo). Actual spend on the deleted call site goes to $0; the research consumer sees identical EventFlag shape (extractor slug changes from ``"anthropic_haiku"`` to ``"rule_based"``) and identical aggregate columns in ``news_aggregates/{date}.parquet``. **Substrate cleanup:** retires three files added earlier this session: - ``collectors/nlp/event_extraction.py`` (the Anthropic extractor itself) - ``rag/pipelines/_cost_telemetry.py`` (Phase 0.2 cost-telemetry buffer, PR #308 + Phase 4 #1 runaway-cost breaker, PR #309 — both retired with the LLM call site they instrumented) - ``tests/test_news_cost_telemetry.py`` (mirrored tests) ``DEFAULT_EVENT_CATEGORIES`` moves into the new ``collectors/nlp/rule_based_event_extraction.py`` so the closed taxonomy stays accessible to downstream consumers. **Protocol contract:** ``EventExtractor.extract`` gains an optional ``article_tags: tuple[str, ...] = ()`` kwarg (back-compat default). The pipeline plumbs the tag union across article variants. Any future EventExtractor implementation (FinBERT, spaCy, reactivated LLM via research module) consumes the same shape. **Severity convention:** rule-based flags emit ``severity=0.5`` uniformly (the EventFlag protocol's documented default). The Haiku severity was a free-floating judgment never tuned by any operator alert. Per-category severity tuning can be added via YAML if a downstream surface needs it. **Tests:** ``TestRuleBasedEventExtractor`` (10 tests) covers empty-text short-circuit, no-match returns empty, title-keyword classification per category (earnings / M&A / FDA), tag-based classification (Polygon/GDELT shape), tag+title union, multi-category emission, deterministic ordering per ``DEFAULT_EVENT_CATEGORIES``, zero-LLM-dependency contract, title-as-description shape. Suite 1493 → 1479 net (retired the 9 cost-telemetry tests + 7 Anthropic extractor tests; added 10 rule-based tests). **Composes with:** - ``[[preference_llm_calls_confined_to_research_module]]`` — the rule this PR enforces - alpha-engine #212 (executor EOD narrative kill switch) — sibling application of the same rule. Two non-research LLM call sites; this PR retires data's entirely, executor's keeps the kill switch substrate (default off) since the LLM path may be operator-reactivated. - Retires the substrate from data #308 + data #309 (Phase 0.2 + Phase 4 #1 cost-telemetry buffer + breaker) — both became dead code with the LLM call site they instrumented. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Mirrors alpha-engine-research's `llm_cost_tracker.RunBudgetExceededError` pattern at the news-pipeline cost site (Phase 0.2 wiring from PR #308). `S3CostBuffer.record()` now tracks cumulative cost across the run and raises `CostBudgetExceededError` after recording the offending row when the per-run total exceeds `ALPHA_ENGINE_RUN_BUDGET_USD` (default $100, shared env var with research + executor — one operator knob across all SF entry points).
Failure shape
Row recorded BEFORE raise so per-call detail is preserved on S3 when the breaker fires. Pipeline-side `try/finally` in `run_news_pipeline.py` ensures the buffer flushes even when the breaker raises mid-loop — rows up to and including the breach call land on S3 so operators diagnose without re-running.
Posture
Breaker propagates through the client proxy (`_CostTrackingMessages.create`):
Test plan
Composes with PR #308 (Phase 0.2 wiring) — small additive surface on the buffer; production path unchanged.
🤖 Generated with Claude Code