USDA NASS county-level corn, soybean, and wheat yield data — a free static JSON API served from the jsDelivr CDN, refreshed weekly from the NASS bulk file. Per-county point lookups in 2-22 KB. No API key, no rate limits, no auth. Public-domain agricultural data for data scientists, ML pipelines, agronomy notebooks, and ag-tech tooling. Sharded by (state, county, crop) so a single lookup downloads a small leaf instead of a multi-megabyte state shard.
A single (state, county, crop, year) lookup is two cached fetches: one index.json for discovery + freshness, then one tiny leaf for the actual data.
GET https://cdn.jsdelivr.net/gh/ProductOfAmerica/usda-county-yields@main/data/index.json
GET https://cdn.jsdelivr.net/gh/ProductOfAmerica/usda-county-yields@main/data/states/{fips}/counties/{county_code}/{crop_slug}.json
Pick the canonical series with one line: the producer pre-marks the series[] entry that matches class="ALL CLASSES" + prodn_practice="ALL PRODUCTION PRACTICES" + unit="BU / ACRE" with "canonical": true. Read values[year] off that series.
Example, "Story County, Iowa corn-grain yield 2024":
GET https://cdn.jsdelivr.net/gh/ProductOfAmerica/usda-county-yields@main/data/states/19/counties/169/corn.json
→ series[s for s if s.canonical]
→ values["2024"] → 215.5
Working lookup, three flavors:
# Python
import json, urllib.request
url = "https://cdn.jsdelivr.net/gh/ProductOfAmerica/usda-county-yields@main/data/states/19/counties/169/corn.json"
leaf = json.load(urllib.request.urlopen(url))
canonical = next(s for s in leaf["series"] if s.get("canonical"))
print(canonical["values"]["2024"]) # 215.5// JavaScript (Node 18+ or any modern browser)
const url = "https://cdn.jsdelivr.net/gh/ProductOfAmerica/usda-county-yields@main/data/states/19/counties/169/corn.json";
const leaf = await fetch(url).then(r => r.json());
const canonical = leaf.series.find(s => s.canonical);
console.log(canonical.values["2024"]); // 215.5# curl + jq
curl -s "https://cdn.jsdelivr.net/gh/ProductOfAmerica/usda-county-yields@main/data/states/19/counties/169/corn.json" \
| jq '.series[] | select(.canonical) | .values["2024"]'
# 215.5State and county codes are USDA / Census ANSI codes. Reference: https://www.census.gov/library/reference/code-lists/ansi.html.
For a state-wide scan of one crop (e.g., all counties for corn in Iowa), use the rollup so the whole answer comes in one fetch instead of ~99 cold leaf requests:
GET https://cdn.jsdelivr.net/gh/ProductOfAmerica/usda-county-yields@main/data/states/19/crops/corn.json
→ counties[county_code].series[s for s if s.canonical].values[year]
refreshed_at, source ETag, and source.publication_date live only in data/index.json. Leaves and rollups carry no timestamps so unchanged data does not cause weekly file rewrites.
A lookup spans index.json + per-county leaves + per-crop rollups, all served from @main. jsDelivr edges populate independently, so for up to ~12 h after a refresh an edge can return a fresh index.json alongside stale or missing leaves. Consumers needing point-in-time consistency should pin a commit SHA in the URL (@<sha> instead of @main); jsDelivr serves SHA-pinned URLs from immutable git history.
Point leaves are ~2 KB to ~22 KB (wheat is largest because NASS publishes WINTER + SPRING + ALL CLASSES variants across multiple production practices). State-wide rollups for one crop run ~50 KB to ~550 KB. index.json is ~30 KB. All well under jsDelivr's 20 MB per-file cap.
Cold-fetch latency from a non-edge region: ~500 ms to ~2 s. Warm-cache latency: <100 ms. The intended deployment pattern is a hot cache (e.g., Cloudflare 60-min TTL) in front of jsDelivr; this cache is the durable tier behind that.
The published tree:
data/
index.json ~30 KB discovery + refreshed_at + source
_audit/latest.json ~1 KB header_observed (maintainer audit)
states/{fips}/
meta.json ~3-8 KB county code -> name + crop list
counties/{county_code}/{crop_slug}.json ~2-22 KB POINT LEAF
crops/{crop_slug}.json ~50-550 KB STATE x CROP ROLLUP
data/index.json:
{
"schema_version": 2,
"product_name": "NASS county crop yields",
"refreshed_at": "2026-04-30T07:13:37Z",
"source": {
"url": "https://www.nass.usda.gov/datasets/qs.crops_20260430.txt.gz",
"last_modified": "Thu, 30 Apr 2026 07:13:37 GMT",
"etag": "...",
"publication_date": "2026-04-30",
"freshness_lag_days": 0
},
"states": {
"19": { "alpha": "IA", "name": "IOWA", "crops": ["corn", "soybeans", "wheat"], "county_count": 99 }
}
}data/states/{fips}/counties/{county_code}/{crop_slug}.json (point leaf, no timestamps):
{
"schema_version": 2,
"state": { "fips": "19", "alpha": "IA", "name": "IOWA" },
"county": { "code": "169", "name": "STORY" },
"commodity": { "slug": "corn", "desc": "CORN" },
"series": [
{
"class": "ALL CLASSES",
"prodn_practice": "ALL PRODUCTION PRACTICES",
"util_practice": "GRAIN",
"unit": "BU / ACRE",
"short_desc": "CORN, GRAIN - YIELD, MEASURED IN BU / ACRE",
"canonical": true,
"values": { "2024": 215.5, "2023": 211.4 },
"suppressed": { "1980": "D" },
"raw": {}
}
]
}- Year keys are strings, not numbers (JSON object key semantics).
valuescarries numeric yields.suppressedcarries NASS suppression codes (D,NA,S,X,Zper the NASS Quick Stats glossary).rawcarries any cell value that didn't parse as numeric or suppression code, with the original string preserved for forensic audit.- Multiple
seriesentries per commodity capture variants like corn-grain (BU/ACRE) vs corn-silage (TONS/ACRE). The producer pre-marks the canonical entry with"canonical": true; consumers should prefer that flag instead of replicating the filter. - Supported commodity slugs:
corn,soybeans,wheat. Notesoybeansis plural. header_observedis published once atdata/_audit/latest.json(not per state). NASS adding a new column logs here without aborting the refresh; renaming a depended-on column does abort.county.codeis always populated and zfilled to 3 digits. The legacy NASSansifield is dropped at the leaf level because it equalscodefor every populated row and is sometimes blank.- Machine-readable contract:
data/_schema/leaf.jsonis a JSON Schema 2020-12 document describing the point-leaf shape. Producers and consumers can both validate against it.
- SURVEY rows only (annual; CENSUS deferred until a consumer asks for it).
- CORN, SOYBEANS, WHEAT only. Other commodities ship when consumer demos require them.
- County-level YIELD only. State rollups, production, area-harvested, etc. are out of scope.
- Weekly, Monday at 09:17 UTC. Year-round safe vs NASS's ~03:13 ET publication time (covers EDT and EST).
- Healthchecks.io configured with 9-day grace window and alerts on two consecutive missed pings (single-Monday GitHub cron drops auto-recover the next week).
- Recovery from 60-day workflow auto-disable is one click on the Actions page (
Run workflow). - Worst-case staleness: ~7 to 8 days (one weekly refresh interval plus up to ~12 h CDN propagation). Steady-state is much fresher; this is the ceiling, not the typical case.
- State coverage is partial. A state appears under
data/states/only if NASS published county-level SURVEY yields for corn, soybeans, or wheat in the source file. As of this writing the cache holds 42 of 50 states. Absent FIPS:02(AK),09(CT),11(DC),15(HI),23(ME),25(MA),33(NH),44(RI),50(VT). Consumers should resolve fips and county codes fromdata/index.jsonrather than hard-coding. If CT later returns to NASS county output, expect planning-region codes per Federal Register 2022-12063 rather than the legacy county codes. - Suppressed values (
(D),(NA), etc.) are NOT silently null. They land in thesuppressedmap keyed by year. Consumers choose to surface, ignore, or fall back. - Soft-fail semantics: a missing
(state, county, crop)is an absence in the file, not an error. The cache makes no completeness guarantee; it reflects what NASS published in the most recent bulk file. - Eventual consistency: after a refresh, jsDelivr's
s-maxage=43200means up to ~12 h before edges return the new content. The multi-file fetch shape can briefly serve a freshindex.jsonalongside stale leaves on the same edge. Consumers needing bit-stable reads can pin a commit SHA in the URL (@<sha>instead of@main).
- Source:
qs.crops_YYYYMMDD.txt.gzfromhttps://www.nass.usda.gov/datasets/, refreshed business-daily by NASS. - Refresh: GitHub Actions cron runs
scripts/refresh.pyon a public repo (free). Tests run before refresh on every CI invocation. - Storage: this Git repo. Sharded JSON committed weekly via touched-only writes (unchanged leaves are not rewritten).
- CDN: jsDelivr-fronted GitHub raw, free, no auth.
- Monitoring: healthchecks.io free tier deadman switch (refresh ping). A separate CDN canary workflow runs Mondays at 22:00 UTC, fetches
data/index.jsonand a known point leaf from jsDelivr, asserts freshness within the 9-day grace window, and pings a second healthchecks URL on success.
Three validation gates and three inline guards before any publish:
- Gate 1: required columns present in NASS header (tolerant reader; extra columns OK).
- Gate 2: filtered row count within ±10% of last successful run (skipped on bootstrap).
- Gate 3: missing-canonical ratio within 5% (
CANONICAL_MISSING_TOLERANCE). Catches NASS structurally dropping the canonical variant for a crop. - Inline guard 1: slug collision check during commodity slugification.
- Inline guard 2: stale-file prune. Any leaf present in the prior tree but absent in this refresh is unlinked so the next git commit captures the deletion.
- Inline guard 3: leaf-shape assert at emit time. Every point leaf is structurally checked against
data/_schema/leaf.jsonbefore write. A producer regression aborts before any file lands on disk.
Failures abort before commit. The previous snapshot stays live; healthchecks alerts on the next ping miss.
# Run the refresh locally (downloads ~1 GB from NASS)
python scripts/refresh.py
# Run tests (no network)
python -m unittest tests.test_refreshCode: MIT (see LICENSE). Data published under data/ derives from USDA NASS public-domain sources.