ERCOT Grid Stress & Price Spike Prediction System

822 hours of $100+ electricity prices that the grid said were fine. This system catches them.

ERCOT's Physical Responsive Capability (PRC) measures system-wide reserve levels — but West Texas (HB_WEST) prices spike from local congestion even when reserves are comfortable. A 200 MW Bitcoin miner paying $100+/MWh during those hours loses $61M over 5 years. This system predicts both grid-wide stress and local price spikes, triggering curtailment before the cost hits.

Runs end-to-end on Databricks: medallion architecture, 97 features, models versioned in MLflow, scored predictions written to Delta.

The Edge

Most approaches predict either grid reserves or electricity price. Neither alone works:

PRC-only catches system-wide emergencies (Uri, statewide heat) but misses congestion. During our 17-month test window, PRC-only saved $973K (5.4% of oracle). The price model saved $18.0M (99.7%). Most high-price hours are driven by local congestion, not system-wide scarcity. PRC is the safety net for Uri-type events; price is the daily P&L driver.
Price-only catches congestion but can't see system-wide emergencies 24h ahead.

This system runs both in parallel, and either signal triggers curtailment. In backtest, it captured 99.7% of perfect-foresight savings over this test window for a 200 MW mining operation.

How It Makes Money

Economic Backtest (Jul 2024 – Dec 2025, 12,057 hours)

Mining (200 MW, $40/MWh revenue baseline):

Strategy	What it is	Savings	% Oracle
DAM-only @$40	Read DAM prices, curtail when high	$14.3M	79.2%
RT lag-react @$40	Watch last hour's price, react	$15.8M	87.4%
Combined system @$40	Our 1h PRC + price models	$18.0M	99.7%
Oracle	Perfect foresight	$18.1M	100%

Datacenter (200 MW, 65% critical / 35% flexible, $50/MWh SLA penalty):

Strategy	Savings	% Oracle
DAM-only @$60	$1.6M	65.5%
RT lag-react @$60	$1.9M	79.3%
Combined system @$60	$2.3M	96.1%
Oracle @>$143	$2.4M	100%

Incremental value vs what operators already do:

vs DAM-only: +$3.7M over 17 months (26% improvement)
vs reacting to last hour's RT price: +$2.2M (14% improvement)
Combined system has 91 false positive hours ($28K lost) vs 911 false positives for DAM-only ($2.0M lost) — 10x fewer bad calls

Assumptions: the signal is received at the start of the hour, curtailment can be executed within 15 minutes, and ramp constraints are not modeled.
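One way to read the savings and false-positive figures above is as a per-hour comparison against never curtailing. The following is a minimal sketch, not the project's backtest code. It assumes a binary on/off load, assumes that curtailing for an hour avoids paying the RT price but forgoes the revenue baseline, and ignores the 15-minute execution delay. The names `backtest_curtailment` and `BacktestResult` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BacktestResult:
    savings: float             # net $ saved vs. never curtailing
    oracle_savings: float      # $ saved with perfect foresight
    false_positive_hours: int  # hours curtailed when price was below breakeven
    false_positive_cost: float # $ lost in those hours

    @property
    def pct_of_oracle(self) -> float:
        if not self.oracle_savings:
            return 0.0
        return 100.0 * self.savings / self.oracle_savings


def backtest_curtailment(rt_prices, curtail_flags, load_mw=200.0, breakeven=40.0):
    """Score hourly curtailment decisions against realized RT prices.

    Assumption: curtailing one hour changes P&L by (price - breakeven) * load_mw,
    which is positive above breakeven and negative below it.
    """
    savings = oracle = fp_cost = 0.0
    fp_hours = 0
    for price, curtailed in zip(rt_prices, curtail_flags):
        margin = (price - breakeven) * load_mw
        oracle += max(margin, 0.0)      # the oracle curtails only when it pays
        if curtailed:
            savings += margin
            if margin < 0:              # curtailed an hour that was profitable to run
                fp_hours += 1
                fp_cost += -margin
    return BacktestResult(savings, oracle, fp_hours, fp_cost)
```

For example, `backtest_curtailment([25.0, 120.0, 35.0, 400.0], [False, True, True, True])` counts one false-positive hour (the $35 hour) and reports savings slightly below the oracle figure.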

Winter Storm Uri — Unseen Event Detection

Models were trained with February 2021 removed from the training data, then tested on Uri:

PRC model: 100% scarcity recall — caught all 72 hours of grid emergency
Price model: 99.6% recall at $40 — missed 1 of 224 expensive hours
Combined: ~$200M saved for a 200 MW miner in 16 days (400:1 savings-to-cost ratio)

The models detected an event type they had never seen. Regression with thresholds extrapolates from continuous features, whereas classification fails on novel extremes (the classifier scored 0/72).
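The recall figures above come from thresholding a continuous prediction. As a rough illustration (the helper `threshold_recall` is hypothetical and not the project's evaluation code), recall is the share of true event hours in which the predicted value also crossed the threshold:

```python
def threshold_recall(actual, predicted, threshold, event_is_below=False):
    """Recall of a regression-plus-threshold detector.

    An hour counts as an event when the actual value crosses the threshold:
    below it for reserves such as PRC, above it for prices.
    Returns None when there are no event hours (recall is undefined).
    """
    def crosses(value):
        return value < threshold if event_is_below else value > threshold

    event_hours = [(a, p) for a, p in zip(actual, predicted) if crosses(a)]
    if not event_hours:
        return None
    caught = sum(1 for _, p in event_hours if crosses(p))
    return caught / len(event_hours)
```

With made-up PRC values `actual=[2500, 2800, 6000, 2900]` and `predicted=[2700, 3100, 5800, 2950]` at a 3,000 MW threshold (`event_is_below=True`), three hours are events and two of them are caught.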

System Architecture

1h LAYER — DECISION (triggers curtailment)
  ├─ PRC model:   pred < 3,000 MW → CURTAIL  |  pred < 5,000 MW → REDUCE
  └─ Price model: pred > $40/MWh  → REDUCE (mining)
                  pred > $50/MWh  → REDUCE (datacenter)
      ↳ Either signal fires → take the more aggressive action

4CP LAYER — TRANSMISSION (June-Sept, avoids annual charges)
  └─ 4CP model:   P(peak day) > threshold → CURTAIL afternoon window
      ↳ Reduces curtailment from ~50 afternoons to ~10 (work in progress)

24h LAYER — ADVISORY (informs scheduling, no auto-curtailment)
  ├─ PRC model:         next-day reserve risk
  ├─ RT-DAM spread:     price deviation early warning
  └─ DAM price (free):  strongest 24h input, no model needed

The 1h layer drives real-time curtailment. The 4CP layer drives summer transmission avoidance. The 24h layer informs planning.
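The 1h decision rule in the diagram can be sketched as a small function. The thresholds are taken from the diagram; the `Action` enum, the function name `decide_1h`, and the `use_case` keys are assumptions made for illustration.

```python
from enum import IntEnum


class Action(IntEnum):
    # Ordered so that a larger value is the more aggressive action.
    NONE = 0
    REDUCE = 1
    CURTAIL = 2


# Thresholds copied from the 1h layer of the architecture diagram.
PRC_CURTAIL_MW = 3_000
PRC_REDUCE_MW = 5_000
PRICE_REDUCE = {"mining": 40.0, "datacenter": 50.0}  # $/MWh


def decide_1h(prc_pred_mw: float, price_pred: float, use_case: str) -> Action:
    """Combine the 1h PRC and price signals; the more aggressive action wins."""
    if prc_pred_mw < PRC_CURTAIL_MW:
        prc_action = Action.CURTAIL
    elif prc_pred_mw < PRC_REDUCE_MW:
        prc_action = Action.REDUCE
    else:
        prc_action = Action.NONE

    price_action = (
        Action.REDUCE if price_pred > PRICE_REDUCE[use_case] else Action.NONE
    )
    return max(prc_action, price_action)
```

Using an ordered enum keeps the "either signal fires" rule to a single `max` call, so adding a third signal later would not change the combination logic.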

Model Performance

Model	Type	Key Metric	Role
1h PRC	LR + LGBM ensemble	538 MW MAE, 100% Uri recall	Safety net for grid emergencies
1h Price	LGBM regressor	88% recall @ $100, 89% precision	Primary economic driver
4CP Risk	LGBM classifier	Day-level peak risk (in progress)	Transmission charge avoidance
24h PRC	Linear Regression	663 MW MAE	Next-day reserve planning
24h Spread	LGBM regressor	Predicts RT-DAM deviation	Early warning (advisory)

Post-RTC+B validation (Jan–Apr 2026): The price model holds up on the new market structure (92.7% recall at $100). The PRC model degraded (877 MW vs 538 MW MAE) because the market redesign shifted reserve dynamics; a retrain is scheduled.

Use Cases

Bitcoin Mining (200 MW, binary on/off, $40/MWh revenue): The 1h price model is the primary real-time decision signal (+$3.7M vs the DAM-only baseline). 4CP avoidance (~$9M/year potential) is in development: with only 12 training examples, reliable ML prediction is difficult, so a conservative rule-based approach is currently more cost-effective.

AI Datacenter (200 MW, 65% critical / 35% flexible, $50/MWh SLA penalty): Two-layer workflow — read DAM prices + 24h PRC to pre-migrate flexible workloads overnight, then let the 1h layer handle real-time surprises. +$745K vs DAM-only over 17 months.

Battery Storage (planned): Not curtailment — arbitrage. Buy low, sell high, bid ancillary services. Same model inputs, different decision layer (state-of-charge optimization).

Key Design Decisions

Regression > Classification for rare events. Regression extrapolates from continuous features (PRC dropping → predict lower). Classification needs to have seen the pattern. Validated via Uri holdout: regression caught 72/72 scarcity hours; classifier caught 0/72.

24h model predicts RT-DAM spread, not absolute price. DAM clears day-ahead — operators already know it. The question is "will RT exceed DAM?" not "what will RT be?" Reframing tripled recall (34.9% vs 11.5% at $100).
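A minimal sketch of the reframing, assuming the spread is defined as RT price minus DAM price and that the advisory flag compares the implied RT price to the $100 level quoted above. Both function names are hypothetical.

```python
def spread_target(rt_price: float, dam_price: float) -> float:
    """Training target: how far RT settled above (or below) the DAM price."""
    return rt_price - dam_price


def flag_expensive_hour(dam_price: float, predicted_spread: float,
                        threshold: float = 100.0) -> bool:
    """Advisory flag: known DAM price plus predicted spread exceeds the threshold."""
    return dam_price + predicted_spread > threshold
```

Because the DAM price is already known when the 24h prediction is made, the model only has to learn the deviation, and the absolute RT estimate is recovered by adding the spread back.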

log1p target transform. RT prices range from -$39 to $5,000. Without compression, the model overfits to Uri-level outliers and underfits the $40–200 range where curtailment decisions happen.
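Note that log1p(x) is undefined for x ≤ -1, so negative prices need some handling before the transform. The source does not say which approach is used. The sketch below shows one common option, a signed log1p with its inverse, purely as an assumption.

```python
import math


def signed_log1p(price: float) -> float:
    """Compress magnitude with log1p while preserving the sign of the price."""
    return math.copysign(math.log1p(abs(price)), price)


def signed_expm1(value: float) -> float:
    """Inverse of signed_log1p; maps a model output back to $/MWh."""
    return math.copysign(math.expm1(abs(value)), value)
```

On this scale a $5,000 price maps to roughly 8.5 and a -$39 price to roughly -3.7, so extreme hours no longer dominate a squared-error loss.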

Thresholds optimized per use case. Mining threshold = $40 (revenue breakeven). Datacenter threshold = $50 (SLA penalty breakeven). Different economics → different optimal points.

Data Pipeline

Six data sources, automated ingestion, 97 engineered features:

ERCOT MIS (CSV)     → load, wind, solar, outages, RT/DAM prices (2021–2026)
GridStatus.io API   → PRC, ORDC price adders (post-RTC+B, Dec 2025+)
gridstatus (open)   → DAM system price for congestion spread
Open-Meteo API      → weather (temperature, humidity, wind, precipitation)
EIA API             → Waha Hub natural gas daily spot price
ERCOT xlsx archives → pre-RTC+B PRC + ORDC adders (2021–Nov 2025)

Databricks Deployment

The full pipeline runs on Databricks with medallion architecture:

BRONZE (raw ingestion)
  └─ CSV/xlsx/parquet files → Delta tables via Unity Catalog

SILVER (cleaning + transformation)
  ├─ 01_bronze_to_silver_ingest    → raw files to Delta tables
  └─ 02_silver_cleaning_backfill   → timestamp alignment, HB_WEST filtering,
                                      hour-ending fixes, ORDC/PRC extraction,
                                      weather API pull, deduplication

GOLD (ML-ready)
  ├─ 03_gold_merge                 → join all silver tables on timestamp
  ├─ 04_gold_features              → 97 features (lags, rolling stats, spreads,
  │                                   regime labels, 4CP indicators)
  └─ ercot.gold.model_ready        → final Delta table for training/scoring

MODELS (MLflow Registry — 5 production models)
  ├─ 05_model_training             → train all models, log to MLflow
  │   ├─ ercot-prc-1h-lr           → 1h PRC Linear Regression
  │   ├─ ercot-prc-1h-lgbm        → 1h PRC LGBM residual
  │   ├─ ercot-prc-24h            → 24h PRC Linear Regression
  │   ├─ ercot-price-1h           → 1h Price LGBM
  │   └─ ercot-spread-24h         → 24h RT-DAM Spread LGBM
  ├─ 06_backtest                   → realistic baseline comparison
  └─ 07_scoring                    → production scoring pipeline
        └─ ercot.gold.predictions  → scored output Delta table

PLANNED: 4CP peak prediction (not yet production-ready — 12 training examples,
         settlement-adjusted load not reproducible in real-time)

Models are versioned in MLflow with parameters, metrics, and artifacts tracked per run. Scoring loads models from the registry and writes predictions back to Delta.
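The lag and rolling-statistic features built in 04_gold_features could look roughly like the following. This is a dependency-free sketch for illustration only: the function name, the default 24-hour window, and the feature names are assumptions, and a production pipeline would more likely use Spark or pandas window functions.

```python
from collections import deque
from statistics import fmean


def lag_and_rolling(values, lag=1, window=24):
    """Return per-row lag and trailing rolling-mean features.

    Only past observations are used for each row, so the features cannot
    leak the value being predicted. Rows without enough history get None.
    `values` must be an indexable sequence ordered by timestamp.
    """
    history = deque(maxlen=window)  # the most recent `window` past values
    rows = []
    for i, value in enumerate(values):
        rows.append({
            f"lag_{lag}h": values[i - lag] if i >= lag else None,
            f"roll_mean_{window}h": (
                fmean(history) if len(history) == window else None
            ),
        })
        history.append(value)       # appended after use: excludes the current hour
    return rows
```

Appending to the history only after the row is built is the design choice that matters here: it keeps the current hour out of its own rolling mean.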

How to Run

On Databricks: Run notebooks 01→07 in sequence, or schedule as a Workflow.

Locally:

python -c "from src.data.gridstatus_ingest import pull_new_data; pull_new_data('2026-04-01', '2026-04-07')"
python src/data/preprocess.py
python src/features/feature_engineering.py
python -m src.models.predict

Requires a .env file with GRIDSTATUS_API_KEY and EIA_API_KEY set.

Roadmap

Tech Stack
