fix(usgs_glm): fix ScienceBase User-Agent and URL field (Spike 4)#48

Merged
JacobSampson merged 1 commit into main from agent/frontend-dev/754016ea
Jun 7, 2026

@JacobSampson


Summary

  • ScienceBase returns 503 for requests that lack a User-Agent header. This PR adds a session that sends AprovanLabs-DataScience/1.0 (requests), which fixes the root cause of the Spike 4 failures.
  • Corrects the URL field lookup: ScienceBase uses the url field, not downloadUrl or uri. All three call sites are updated (manifest fetch, crosswalk CSV, NetCDF zip).
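The two fixes above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `build_session` and `file_url` are hypothetical names, and the shape of the item JSON (a `files` list whose entries carry `name` and `url` keys) is an assumption. Only the User-Agent string and the `url` field name come from the summary.

```python
from typing import Any, Mapping, Optional

# User-Agent string quoted in the PR summary.
USER_AGENT = "AprovanLabs-DataScience/1.0 (requests)"


def build_session():
    """Return a requests.Session that sends USER_AGENT on every request."""
    # Imported lazily so the lookup helper below can be used (and tested)
    # without the third-party requests package installed.
    import requests

    session = requests.Session()
    session.headers.update({"User-Agent": USER_AGENT})
    return session


def file_url(item: Mapping[str, Any], filename: str) -> Optional[str]:
    """Return the download URL for `filename` from a catalog item dict.

    Assumption: the item has a "files" list whose entries have "name"
    and "url" keys. Entries that only carry "downloadUrl" or "uri"
    are deliberately not matched, mirroring the fix described above.
    """
    for entry in item.get("files", []):
        if entry.get("name") == filename:
            return entry.get("url")
    return None


# Hypothetical item payload for illustration; the URL is a placeholder.
example_item = {
    "files": [
        {"name": "lake_id_crosswalk.csv", "url": "https://example.invalid/crosswalk"},
        {"name": "legacy.bin", "downloadUrl": "https://example.invalid/legacy"},
    ]
}
crosswalk_url = file_url(example_item, "lake_id_crosswalk.csv")
```

Sharing one session across the manifest, crosswalk, and zip requests keeps the header in a single place, so a new call site cannot forget it.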

Spike 4 findings (APR-181)

ScienceBase status: the API catalog endpoint is now reachable (HTTP 200 with a proper User-Agent). File downloads are currently returning 503; this is a separate, transient service issue, not the missing-header bug.

Dataset confirmed:

  • lake_id_crosswalk.csv has MNDOW_ID column → maps MN DOW IDs to USGS site_id (NHD format)
  • lake_temp_preds_GLM_NLDAS.zip (~3.9 GB) contains depth-resolved daily predictions 1979–2022
  • Coverage: 185K+ Midwest lakes including MN
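As a rough sketch of how the crosswalk could be used, the snippet below builds a lookup from MN DOW ID to USGS `site_id`. The column names `MNDOW_ID` and `site_id` come from the findings above; the function name `load_crosswalk`, the sample rows, and the ID formats in them are hypothetical, and the real file may contain additional columns.

```python
import csv
import io
from typing import Dict, TextIO


def load_crosswalk(handle: TextIO) -> Dict[str, str]:
    """Map MNDOW_ID values to site_id values from an open CSV handle.

    Rows with an empty MNDOW_ID or site_id are skipped, since lakes
    outside Minnesota would not be expected to have a DOW ID.
    """
    mapping: Dict[str, str] = {}
    for row in csv.DictReader(handle):
        dow_id = (row.get("MNDOW_ID") or "").strip()
        site_id = (row.get("site_id") or "").strip()
        if dow_id and site_id:
            mapping[dow_id] = site_id
    return mapping


# Made-up sample rows for illustration only; real IDs will differ.
SAMPLE_CSV = (
    "site_id,MNDOW_ID\n"
    "nhdhr_000001,mndow_00000100\n"
    "nhdhr_000002,\n"
)
crosswalk = load_crosswalk(io.StringIO(SAMPLE_CSV))
```

If a DOW ID appears on more than one row, this sketch keeps the last `site_id` seen; whether duplicates occur in the real file is not established here.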

Comparison — ScienceBase GLM vs Planetary Computer STAC:

| | ScienceBase GLM | STAC (Planetary Computer) |
|---|---|---|
| Coverage | 1979–2022 historical | 2013–present (recent) |
| Frequency | Daily modeled | ~16-day Landsat overpass |
| Depth | Full water column | Surface only |
| Cloud impact | None (model) | Cloud gaps |
| Latency | Historical only | Near-real-time |
| Size | ~4 GB zip | On-demand fetch |

Recommendation: Use ScienceBase GLM for historical trend charts (complete, no gaps, depth-resolved). Continue using STAC for recent/live surface temperature observations. These are complementary, not competing.

Test plan

  • Verify fetch_from_sciencebase() reaches ScienceBase catalog without 503
  • Confirm crosswalk CSV download succeeds when file downloads are restored
  • Confirm GLM NetCDF zip download resolves correct URL

🤖 Generated with Claude Code

ScienceBase blocks requests without a User-Agent and returns 503 --
this was the root cause of the Spike 4 failures on 2026-06-06.

Also corrects the URL field lookup: ScienceBase returns download URLs
in the 'url' field, not 'downloadUrl' or 'uri' as the code assumed.
All three request sites (manifest, crosswalk, NetCDF zip) are updated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@JacobSampson JacobSampson merged commit 1d0ecb6 into main Jun 7, 2026
4 checks passed