RAG hardening — remove silent fails + add preflight#31

Merged: cipher813 merged 1 commit into main from feat/rag-hardening on Apr 14, 2026

@cipher813
Owner

Summary

Applies the PR #24/#25/#28 pattern to the Saturday RAG ingestion path. The shell script was silently swallowing failures in all 5 pipelines and then sending a completion email with a hardcoded `status: ok` — meaning "RAG Weekly Ingestion Complete" was a lie whenever any step failed.

Changes

`rag/pipelines/run_weekly_ingestion.sh`

  • Removed `|| echo "WARNING: ... (non-fatal)"` from all 5 ingestion steps, the CloudWatch heartbeat, and the completion email. `set -e` was already active but these swallowers defeated it.
  • Removed the runtime `if [ -n "$FINNHUB_API_KEY" ]; ... else echo SKIPPED fi` branch. All required env vars are hard-failed by preflight before any ingestion runs.
  • Added `Step 0/5: python -m rag.preflight`.
  • The hardcoded `status: ok` completion email is now truthful rather than aspirational — with `set -e` active and no swallowers, reaching the email means all 5 pipelines actually succeeded.
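
The interaction between `set -e` and the removed swallowers can be reproduced in isolation. The sketch below is illustrative only: the two inline scripts are stand-ins, not excerpts from `run_weekly_ingestion.sh`, and `false` plays the role of a failing ingestion step. It assumes `bash` is on the PATH.

```python
import subprocess

# Stand-in for the old pattern: the `|| echo` makes the failing command part of
# an OR-list, so `set -e` does not abort and the script reaches the final line.
SWALLOWED = """
set -e
false || echo "WARNING: step failed (non-fatal)"
echo "status: ok"
"""

# Stand-in for the hardened pattern: the bare failing command trips `set -e`,
# so the script exits before the "status: ok" line is ever reached.
HARDENED = """
set -e
false
echo "status: ok"
"""


def run_script(script: str) -> subprocess.CompletedProcess:
    """Run an inline bash script and capture its exit code and output."""
    return subprocess.run(["bash", "-c", script], capture_output=True, text=True)


if __name__ == "__main__":
    for name, script in (("swallowed", SWALLOWED), ("hardened", HARDENED)):
        result = run_script(script)
        reported_ok = "status: ok" in result.stdout
        print(f"{name}: exit={result.returncode} reported_ok={reported_ok}")
```

In the swallowed variant the script exits 0 and still reports success; in the hardened variant it exits non-zero and never emits the success line, which is the property the completion email now relies on.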

`rag/preflight.py` (new)

  • `RAGPreflight(BasePreflight)` — composes `check_env_vars` (AWS_REGION, VOYAGE_API_KEY, FINNHUB_API_KEY, EDGAR_IDENTITY, RAG_DATABASE_URL) + `check_s3_bucket`.
  • `main()` uses alpha-engine-lib's `setup_logging` with the shared `flow-doctor.yaml` path, so a preflight failure fires email + issue via the existing dispatch.
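
A minimal sketch of the fail-fast env-var check follows. It does not use alpha-engine-lib: `PreflightError`, `find_missing`, and this standalone `check_env_vars` are hypothetical stand-ins, since the real `BasePreflight` API is not shown in this PR. The S3 bucket check and the flow-doctor logging setup are omitted. Only the list of variable names comes from the PR itself.

```python
import os
import sys
from typing import List, Mapping, Optional, Sequence

# Variable names as listed in this PR.
REQUIRED_ENV_VARS = (
    "AWS_REGION",
    "VOYAGE_API_KEY",
    "FINNHUB_API_KEY",
    "EDGAR_IDENTITY",
    "RAG_DATABASE_URL",
)


class PreflightError(RuntimeError):
    """Hypothetical error type: raised when a preflight check fails."""


def find_missing(required: Sequence[str], env: Mapping[str, str]) -> List[str]:
    """Return the required names that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


def check_env_vars(
    required: Sequence[str] = REQUIRED_ENV_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> None:
    """Raise PreflightError naming every missing variable, or return None."""
    env = os.environ if env is None else env
    missing = find_missing(required, env)
    if missing:
        raise PreflightError("missing required env vars: " + ", ".join(missing))


if __name__ == "__main__":
    try:
        check_env_vars()
    except PreflightError as exc:
        # A non-zero exit is what lets `set -e` stop the shell script at step 0.
        print(f"preflight failed: {exc}", file=sys.stderr)
        sys.exit(1)
```

Reporting all missing names in one error, rather than stopping at the first, is a design choice of this sketch; it saves a re-run per missing variable.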

`rag/db.py`

  • `is_available()`: `log.debug` → `log.warning` for the exception path.

`rag/pipelines/ingest_8k_filings.py`

  • Per-URL download failure: `log.debug` → `log.warning`. Aggregate counts are still reported upstream, so no behavior change — failures are just visible now.
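
The "log, return `None`, let the caller count it" shape described above might look roughly like this. The function names, the injected `fetch` callable, and the logger name are assumptions made for illustration; they are not taken from `ingest_8k_filings.py`.

```python
import logging
from typing import Callable, Dict, Iterable, Optional

log = logging.getLogger("rag.sketch.ingest_8k")  # hypothetical logger name


def download_filing(url: str, fetch: Callable[[str], str]) -> Optional[str]:
    """Return the filing body, or None if the download fails."""
    try:
        return fetch(url)
    except Exception as exc:
        # WARNING rather than DEBUG: with Python's default logging setup,
        # DEBUG records are dropped, while WARNING records reach stderr.
        log.warning("download failed for %s: %s", url, exc)
        return None


def ingest(urls: Iterable[str], fetch: Callable[[str], str]) -> Dict[str, int]:
    """Attempt every URL; treat None as 'skip this filing' and count outcomes."""
    counts = {"ok": 0, "failed": 0}
    for url in urls:
        body = download_filing(url, fetch)
        counts["ok" if body is not None else "failed"] += 1
    return counts
```

Control flow is the same at either log level; the change only determines whether the per-URL failure appears in the default log output.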

Dead code flagged (no change in this PR)

  • `rag/db.py::is_available` — zero callers inside alpha-engine-data. Keep for now; defer deletion until cross-repo audit confirms no usage in predictor/research/backtester.

Out of scope (tracked)

  • Adopt alpha-engine-lib `setup_logging` in each ingestion script's `main()` for consistent log formatting + flow-doctor capture of per-pipeline errors. Currently only `preflight.py` uses the lib. Minor follow-up.
  • Date-parsing `except ValueError: continue` patterns in the ingest modules. Reviewed case-by-case — all are legitimate "skip this malformed entry" flows with aggregate counts upstream. Not silent fails.
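
For contrast with a silent fail, a skip-and-count loop of the kind described in the last bullet could be sketched as below. The `%Y-%m-%d` format and the function name are assumptions; the ingest modules' actual date formats are not shown in this PR.

```python
from datetime import date, datetime
from typing import Iterable, List, Tuple


def parse_entry_dates(raw_dates: Iterable[str]) -> Tuple[List[date], int]:
    """Parse ISO-style dates, skipping malformed entries but counting them."""
    parsed: List[date] = []
    skipped = 0
    for raw in raw_dates:
        try:
            parsed.append(datetime.strptime(raw, "%Y-%m-%d").date())
        except ValueError:
            # Skipping is deliberate, and the count is returned so the caller
            # can report it in its aggregate totals.
            skipped += 1
            continue
    return parsed, skipped
```

What separates this from the removed shell swallowers is that the skip is scoped to one malformed entry and remains visible in the returned count.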

Test plan

  • `pytest tests/ --ignore=tests/integration -q` — 41 pass
  • Syntax check on all modified Python files
  • `bash -n run_weekly_ingestion.sh`
  • Next Saturday Step Function run exercises the hardened path.
  • Forced failure test on EC2: unset `FINNHUB_API_KEY` → preflight (step 0) must hard-fail, not silently skip step 3.

🤖 Generated with Claude Code

Applies the PR #24/#25/#28 pattern to the Saturday RAG ingestion path.
The shell script was silently swallowing failures in all 5 pipelines,
making "RAG Weekly Ingestion Complete" a lie whenever any step failed.

Shell script (rag/pipelines/run_weekly_ingestion.sh)
- Removed `|| echo "WARNING: ... (non-fatal)"` from all 5 ingestion
  steps, the CloudWatch heartbeat, and the completion email. set -e
  was already active but these swallowers defeated it.
- Removed the runtime `if [ -n "$FINNHUB_API_KEY" ]; then ... else
  echo SKIPPED fi` branch. All required env vars are now hard-failed
  by preflight (step 0) before any ingestion runs. A silently-skipped
  earnings transcript step defeats the purpose of having transcripts
  at all.
- Added `Step 0/5: python -m rag.preflight` at the top.
- The hardcoded 'status: ok' completion email is now truthful rather
  than aspirational — with set -e active and no swallowers, reaching
  the email means all 5 pipelines actually succeeded.

New file: rag/preflight.py
- RAGPreflight(BasePreflight) subclass — composes check_env_vars
  (AWS_REGION, VOYAGE_API_KEY, FINNHUB_API_KEY, EDGAR_IDENTITY,
  RAG_DATABASE_URL) + check_s3_bucket.
- main() uses alpha-engine-lib's setup_logging with the shared
  flow-doctor.yaml path, so a preflight failure fires email + issue
  via the existing dispatch.

rag/db.py
- is_available(): log.debug → log.warning for the exception path.
  The function was otherwise unchanged — it's a non-raising probe for
  future retrieval-side consumers. Flagged as unused inside
  alpha-engine-data (zero callers); defer deletion until cross-repo
  audit completes, since predictor / research / backtester may import
  from it.

rag/pipelines/ingest_8k_filings.py
- Per-URL download failure: log.debug → log.warning. Caller still
  treats None as "skip this filing" (aggregate counts are reported),
  so no behavior change; the failure rate is just visible now.

Dead code flagged (no change in this PR)
- rag/db.py::is_available — zero local callers. Keep for now, flag
  for future cross-repo sweep.

Out of scope (tracked)
- Adopt alpha-engine-lib setup_logging in each ingestion script's
  main() for consistent log formatting + flow-doctor capture of
  per-pipeline errors. Currently only preflight.py uses the lib;
  ingestion scripts still use Python's default root logger. Minor
  follow-up.
- Date-parsing `except ValueError: continue` patterns in
  ingest_sec_filings, ingest_8k_filings, ingest_theses,
  ingest_earnings_transcripts. Reviewed case-by-case — all are
  legitimate "skip this malformed entry" flows with aggregate counts
  upstream. Not silent fails.

Test plan
- [x] pytest tests/ — 41 pass
- [x] Syntax check on all modified Python files
- [x] bash -n on run_weekly_ingestion.sh
- [ ] Next Saturday Step Function run exercises the hardened path.
  Forced failure test: unset FINNHUB_API_KEY on EC2 and re-run —
  must fail at preflight (step 0), not silently skip step 3.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit 43184c4 into main Apr 14, 2026
1 check passed
@cipher813 cipher813 deleted the feat/rag-hardening branch April 14, 2026 16:55