U.S. Drought Exposure Monitor

County-level uninsured drought exposure map

A Streamlit dashboard that surfaces county-level uninsured drought exposure across U.S. row crops — built on USDA NASS planted acres, RMA Summary of Business insured acres, the U.S. Drought Monitor, and NOAA's CPC seasonal outlook.

Live: https://drought.mcmillinanalytics.com

The "uninsured exposure" score flags counties where current drought severity is high and federal crop-insurance coverage is thin — i.e., where loss exposure stays with the producer rather than shifting to the federal indemnity backstop.

Example. Parmer County, TX: 324K acres of wheat, currently in severe drought (USDM 3.3/5), CPC forecasts persistence through July, only 11% of those acres federally insured. That kind of concentration is sitting in public datasets but invisible until they're stitched together on a county FIPS backbone.

Crops tracked: corn, soybeans, wheat (winter + spring), cotton, grain sorghum.

Audience: ag bankers, crop-insurance underwriters, commodity desks, market-intelligence shops, and anyone who needs a county-level read on where weather risk is not hedged.

What you can look at

Drought × Insurance Gap scatter — one marker per county, sized by planted acres. Top-right is the watch zone: high drought severity, large coverage gap. Color encodes the composite Exposure score.

The dashboard is organized as View × Layer:

| View | Layer | What's shown |
|---|---|---|
| Now | Exposure | Current drought severity vs insurance coverage by county. Where uninsured drought risk sits today. |
| Now | Drought Severity | Current USDM score (0 = none, 5 = exceptional). What's drying right now, regardless of coverage. |
| Forecast (3-mo) | Exposure | 3-month forecast severity vs current coverage. Where uninsured exposure is heading. |
| Forecast (3-mo) | CPC Outlook | NOAA CPC's categorical 3-month forecast: develops / persists / improves / removes / no drought. |
| vs Last Year | Drought Change | Current USDM minus same-week-last-year USDM. Red = drier than a year ago, blue = recovered. |

Loss ratio (indemnity ÷ premium for the most recent settled crop year) is surfaced as a KPI and as a column in the watch list, but is not its own map layer — a single national context number is more useful than a diverging-at-1.0 choropleth.

Headline math

exposure                  = drought_severity × (1 − coverage)
coverage                  = clip(RMA insured acres ÷ NASS planted acres, 0, 1)
drought_severity_score    = (D0·1 + D1·2 + D2·3 + D3·4 + D4·5) / 100   # 0–5
loss_ratio                = indemnity ÷ premium                         # KPI + table only
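A minimal Python sketch of the formulas above, for readers who want to check a number by hand. The function names are illustrative and are not taken from pipeline/join.py. It assumes D0 through D4 are non-cumulative percent-of-area values (each acre counted once, in its worst category), and that a county with no planted acres gets no coverage estimate.

```python
def severity_score(d0, d1, d2, d3, d4):
    """Weighted USDM severity on a 0-5 scale.

    Assumes each argument is the percent of county area whose worst
    category is D0..D4 (non-cumulative), so the five values sum to <= 100.
    """
    return (d0 * 1 + d1 * 2 + d2 * 3 + d3 * 4 + d4 * 5) / 100


def coverage(insured_acres, planted_acres):
    """RMA insured acres / NASS planted acres, clipped to [0, 1]."""
    if planted_acres <= 0:
        return None  # assumption: no denominator means no coverage estimate
    return min(max(insured_acres / planted_acres, 0.0), 1.0)


def exposure(severity, cov):
    """Severity scaled by the uninsured share of planted acres."""
    return severity * (1 - cov)


# Hypothetical county: 20% of area in D1, 50% in D2, 30% in D3.
score = severity_score(0, 20, 50, 30, 0)   # (40 + 150 + 120) / 100
cov = coverage(11_000, 100_000)            # 11% of planted acres insured
print(round(score, 2), round(exposure(score, cov), 3))
```

The clip matters in practice: insured acres can exceed planted acres when the two agencies count slightly different things, and an unclipped ratio above 1 would produce negative exposure.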

Forecast severity blends current USDM with the CPC outlook's directional category:

Persistence / No drought  →  leave current value
Improvement               →  current − 1.0   (clipped at 0)
Removal                   →  0.0
Development               →  max(current, 2.0)

Forecast composite = forecast_severity × (1 − coverage). Counties without a CPC polygon assignment are gray ("no forecast available").
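The blend rules above can be sketched as a small lookup. This is an illustration under stated assumptions, not the code in pipeline/join.py: the category strings are hypothetical labels, and a county with no CPC polygon is represented as None.

```python
def forecast_severity(current, outlook):
    """Adjust the current 0-5 severity by the CPC outlook category.

    `outlook` uses hypothetical labels; None means the county has no
    CPC polygon assignment and therefore no forecast.
    """
    if outlook is None:
        return None
    if outlook in ("persistence", "no_drought"):
        return current                  # leave current value
    if outlook == "improvement":
        return max(current - 1.0, 0.0)  # one point better, floored at 0
    if outlook == "removal":
        return 0.0
    if outlook == "development":
        return max(current, 2.0)        # at least 2.0
    raise ValueError(f"unknown outlook category: {outlook!r}")


def forecast_composite(current, outlook, cov):
    """forecast_severity x (1 - coverage), or None when no forecast exists."""
    severity = forecast_severity(current, outlook)
    return None if severity is None else severity * (1 - cov)


print(forecast_composite(3.0, "improvement", 0.25))  # 2.0 * 0.75
```

Returning None rather than 0.0 for unassigned counties keeps "no forecast available" distinct from "forecast says no exposure", which is what lets the map render those counties gray.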

A county lands on the watch list when it has data from all sources and coverage < 0.70. The 70% threshold is set in pipeline/join.py:PENETRATION_THRESHOLD.
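As a rough sketch of that rule: the dict keys below are hypothetical, and the sketch assumes "data from all sources" means planted acres, insured acres, a USDM severity, and a CPC category are all present. Only the constant name and its 0.70 value come from the text above.

```python
PENETRATION_THRESHOLD = 0.70  # named in pipeline/join.py; value per the text

# Hypothetical field names, one per source (NASS, RMA, USDM, CPC).
REQUIRED = ("planted_acres", "insured_acres", "severity", "cpc_outlook")


def on_watch_list(county):
    """True when every source has data and coverage is below the threshold."""
    if any(county.get(key) is None for key in REQUIRED):
        return False
    if county["planted_acres"] <= 0:
        return False
    cov = min(max(county["insured_acres"] / county["planted_acres"], 0.0), 1.0)
    return cov < PENETRATION_THRESHOLD


example = {
    "planted_acres": 100_000,
    "insured_acres": 11_000,
    "severity": 3.3,
    "cpc_outlook": "persistence",
}
print(on_watch_list(example))
```

The comparison is strict, so a county sitting at exactly 70% coverage stays off the list.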

Coverage smoothing

RMA publishes the SoB Coverage file monthly, so the open crop year under-reports per-county insured acres until late in the season. When a county has 3+ years of historical penetration (coverage), we substitute the 5-year mean if the open-year observation is missing or falls more than 10 percentage points (pp) below that mean. The Advanced expander in the sidebar lets you switch between:

  • Adjusted (default) — observed unless the lag-detection rule triggers, then the 5-yr mean.
  • Observed — raw current year as published. Useful to see the lag.
  • Smoothed — always use the 5-yr mean when the county is stable (closed-year stdev < 5pp).
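The three modes can be sketched as one function. This is a reading of the rules above, not the code in pipeline/coverage.py. Two points are assumptions: the stability check uses the sample standard deviation, and an unstable county in Smoothed mode falls back to the Adjusted rule (the text does not say what happens there).

```python
from statistics import mean, stdev

LAG_DROP = 0.10      # open year more than 10pp below the mean counts as lag
STABLE_STDEV = 0.05  # closed-year stdev below 5pp counts as stable
MIN_YEARS = 3        # history needed before any substitution


def resolve_coverage(observed, history, mode="adjusted"):
    """Pick the coverage value to use for the open crop year.

    observed: open-year coverage (0-1) or None if missing.
    history:  closed-year coverage values, oldest first.
    """
    if mode not in ("adjusted", "observed", "smoothed"):
        raise ValueError(f"unknown mode: {mode!r}")
    recent = history[-5:]
    if mode == "observed" or len(recent) < MIN_YEARS:
        return observed
    hist_mean = mean(recent)
    if mode == "smoothed" and stdev(recent) < STABLE_STDEV:
        return hist_mean
    # Adjusted rule (also the assumed fallback for unstable counties).
    if observed is None or observed < hist_mean - LAG_DROP:
        return hist_mean
    return observed


closed_years = [0.80, 0.82, 0.78, 0.81, 0.79]
print(resolve_coverage(0.60, closed_years))              # lag rule triggers
print(resolve_coverage(0.60, closed_years, "observed"))  # raw value kept
```

Comparing the Adjusted and Observed results for the same county is a quick way to see how much of a coverage gap is reporting lag rather than a real drop in insured acres.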

Sources

| # | Source | Endpoint / file | Used for |
|---|---|---|---|
| 1 | USDA NASS Quick Stats | https://quickstats.nass.usda.gov/api/api_GET/ | County-level AREA PLANTED by crop, by year (denominator for coverage). |
| 2 | USDA RMA Summary of Business (Coverage report) | https://www.rma.usda.gov/tools-reports/summary-of-business/state-county-crop-summary-business | Insured acres, liability, premium, indemnity, loss ratio at state/county/crop grain. We use 6 years (2021–2026). |
| 3 | U.S. Drought Monitor | https://usdmdataservices.unl.edu/api/CountyStatistics/GetDroughtSeverityStatisticsByAreaPercent | Percent of county area in each drought category (D0–D4). We pull the most recent Tuesday and the same week one year ago. |
| 4 | NOAA CPC Seasonal Drought Outlook | https://ftp.cpc.ncep.noaa.gov/GIS/droughtlook/sdo_polygons_latest.zip | Categorical 3-month forecast issued the 3rd Thursday of each month. Polygon shapefile → county centroid point-in-polygon assignment. |
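For the USDM pull, the two snapshot dates can be derived from today's date. The sketch below is one way to do it and is not taken from pipeline/usdm.py. It assumes "same week one year ago" means exactly 52 weeks earlier, which keeps both dates on a Tuesday, and it ignores any lag between a map's valid date and its publication.

```python
from datetime import date, timedelta


def usdm_snapshot_dates(today):
    """Return (most recent Tuesday on or before today, 52 weeks earlier)."""
    # date.weekday(): Monday is 0, Tuesday is 1.
    days_since_tuesday = (today.weekday() - 1) % 7
    current = today - timedelta(days=days_since_tuesday)
    year_ago = current - timedelta(weeks=52)  # assumption: 52 weeks, not 365 days
    return current, year_ago


current, year_ago = usdm_snapshot_dates(date(2025, 6, 12))
print(current.isoformat(), year_ago.isoformat())
```

Stepping back by whole weeks rather than by calendar year avoids landing on a non-Tuesday date that has no USDM release.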

All sources join on 5-digit county FIPS (state FIPS zero-padded to 2 + county FIPS zero-padded to 3).
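A short sketch of that key construction, plus the inner join it enables. The helper names are illustrative, and the per-source dicts stand in for whatever tables the pipeline actually uses.

```python
def make_fips(state_code, county_code):
    """Build the 5-digit key: 2-digit state + 3-digit county, zero-padded."""
    return f"{int(state_code):02d}{int(county_code):03d}"


def counties_in_all(*sources):
    """FIPS codes present in every source (each source is keyed by FIPS)."""
    keys = set(sources[0])
    for source in sources[1:]:
        keys &= set(source)
    return sorted(keys)


print(make_fips(1, 1))         # ints lose their leading zeros: restored here
print(make_fips("48", "369"))  # string inputs work too

# Toy per-source lookups keyed by FIPS (values are placeholders).
nass = {"01001": 1, "48369": 2}
rma = {"48369": 3}
usdm = {"01001": 4, "48369": 5}
print(counties_in_all(nass, rma, usdm))
```

Padding before joining matters because sources that store codes as integers drop the leading zero for states 01 through 09, which silently breaks string-keyed merges.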

References / methodological lineage

The framing borrows from the empirical literature on drought exposure and the role of federal crop insurance as a buffer between weather losses and producers:

  • Pogach, Polson, and Heil (FDIC, 2024). Drought Exposure and Agricultural Community Banks.
  • Rodziewicz and Dice (Federal Reserve Bank of Kansas City, 2020). Drought Risk to the Agricultural Sector.

This dashboard is not a replication of either; it borrows the "drought-severity × insurance-coverage" decomposition as a market-intelligence visualization rather than as an econometric model.

Setup

# Windows PowerShell (on macOS/Linux: source .venv/bin/activate, and cp instead of copy)
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements-dev.txt
copy .env.example .env
# edit .env and set NASS_API_KEY (free key at https://quickstats.nass.usda.gov/api)

Pre-fetch data (one-time, then re-run as needed)

python -m pipeline.nass
python -m pipeline.usdm
python -m pipeline.cpc
python -m pipeline.rma     # needs sobcov_YYYY.txt files in data/rma_raw/

For RMA, drop sobcov_2021.txt through sobcov_2026.txt (extracted from the zips on the SoB page above) into data/rma_raw/. The pipeline auto-detects them by year and uses the union as historical context.

Run the dashboard

streamlit run app.py

Run the tests

pytest -v

CI runs the same suite on every push to main and every PR.

Deploy (Render + Cloudflare)

The repo ships with a Dockerfile and render.yaml for one-click deploy to Render. Pre-built parquet caches are checked in (data/*.parquet ≈ 2 MB total + ~2.2 MB CONUS counties GeoJSON) so cold starts come up immediately without re-pulling NASS / USDM / CPC.

  1. Push the repo to GitHub (already at https://github.com/mcmillinanalytics/drought-exposure-tool).
  2. On Render → New → Web Service → connect your repo. Render reads render.yaml and builds via the Dockerfile. Pick the Starter plan ($7/mo) for always-on.
  3. Add a single environment variable in the Render dashboard:
    • NASS_API_KEY — your free NASS Quick Stats key (only needed for the weekly auto-refresh; the shipped caches work without it).
  4. After first deploy you'll get a URL like drought-exposure-tool.onrender.com.
  5. In Cloudflare DNS → add a CNAME drought → drought-exposure-tool.onrender.com (proxied / orange cloud).
  6. Render → Settings → Custom Domains → add drought.mcmillinanalytics.com. Render auto-issues a Let's Encrypt cert.

Auto-refresh

.github/workflows/refresh-data.yml runs every Tuesday at 14:00 UTC and pulls fresh NASS, USDM, and CPC data. Requires a NASS_API_KEY GitHub secret (Settings → Secrets → Actions → New repository secret). On push, Render auto-redeploys. RMA stays manual — the raw sobcov_* files are gitignored, so refreshing means dropping the new year's file into data/rma_raw/ locally, running python -m pipeline.rma, and committing the updated data/rma_sob.parquet.

Project layout

data/
  *.parquet                  cached datasets (NASS, RMA, USDM, CPC) — checked in
  counties.geojson           CONUS counties (Alaska/Hawaii/territories stripped)
  rma_raw/                   drop the SoB Coverage .txt files here (gitignored)
  cpc_raw/                   downloaded CPC SDO shapefile (gitignored)
pipeline/
  nass.py                    NASS Quick Stats pulls (multi-year, 5 crops)
  rma.py                     RMA SoB Coverage parser (multi-year long format)
  usdm.py                    USDM API pull (current + year-ago snapshots)
  cpc.py                     CPC SDO shapefile → per-county category
  coverage.py                Multi-year smoothing + lag imputation
  join.py                    Merges all sources, computes Exposure + Forecast
tests/
  test_logic.py              Pure-function unit tests
  test_schemas.py            Validate committed parquets against expected shapes
  test_pipeline.py           End-to-end smoke against build_dataset
  test_smoke.py              Module-import + app.py compile checks
app.py                       Streamlit dashboard
.github/workflows/
  refresh-data.yml           Weekly Tuesday data refresh + auto-commit
  ci.yml                     Pytest on every push and PR
Dockerfile
render.yaml
requirements.txt             prod deps
requirements-dev.txt         pytest on top of prod deps

Known limitations / edge cases

  • Prevented-plant acres are not subtracted from the denominator. NASS AREA PLANTED does not net out prevented-plant acres reported later to FSA, so coverage in flood/PP-heavy years is biased downward.
  • Wheat is pooled, not type-split, and the two sides of the ratio do not match exactly. RMA's SoB Coverage summary doesn't split wheat by Type, so the RMA numerator covers all wheat, durum included. We pair it with NASS winter + spring (excl. durum). Durum is ~3% of US wheat, concentrated in ND / MT, and is left out of the denominator because NASS doesn't expose WHEAT, DURUM - ACRES PLANTED at county level. Coverage in durum-heavy counties is therefore overstated by a few points.
  • Cotton: upland only. Pima / extra-long-staple cotton (~2% of US cotton, AZ / CA / NM specialty) is excluded.
  • Sorghum: grain only. Silage sorghum and hybrid seed sorghum are excluded as non-grain commodities.
  • Quantity unit filter (RMA). RMA reports some plans in non-acre units (yield-protection bushels). Only acre-denominated rows are summed into insured_acres.
  • FIPS join edge cases. Independent cities (VA), county-equivalents in Alaska/Hawaii, and city-county consolidations can produce mismatches between NASS, RMA, USDM, and CPC FIPS sets. Counties missing data in any source are excluded from the composite.
  • CPC coverage gaps. Counties whose centroid falls outside every SDO polygon (rare, but possible at the edges) get no forecast and render as gray on forecast layers.
  • Loss ratio is volatile year-to-year. A single closed year can push a county above 5.0 due to one weather event. Treat the loss ratio as a flag, not a verdict.

License

Sources are public-domain US government data. Code is provided as-is under the MIT license.
