Force on-demand GPU runners (spot=false) by mmcky · Pull Request #936 · QuantEcon/lecture-python.myst

mmcky · 2026-06-26T07:16:18Z

What

Adds spot=false to the RunsOn g4dn.2xlarge GPU runner spec in all four GPU workflows so they run on on-demand instances instead of spot: cache.yml, ci.yml, collab.yml, publish.yml.

Why

AWS spot reclamation has been interrupting the GPU notebook builds mid-run. Each build runs on a single GPU for 15–25 minutes and can't cheaply checkpoint, so a reclamation near the end discards the whole build. The spot discount is therefore often a net loss, and scheduled cache/publish builds fail intermittently for reasons unrelated to the lectures: the failure surfaces as an AssertionError/shutdown signal that looks like a content bug. We just hit exactly this on a forced cache.yml run here (attempt 1 was reclaimed mid-build, produced no execution reports, and was auto re-queued).

This rolls out the org-wide decision in QuantEcon/meta#330 (on-demand for all GPU builds) to lecture-python.myst, and mirrors the reference change in QuantEcon/lecture-jax#327.

Change

| Workflow | Runner | Change |
| --- | --- | --- |
| `cache.yml` | g4dn.2xlarge | append `/spot=false` |
| `ci.yml` | g4dn.2xlarge | append `/spot=false` |
| `collab.yml` | g4dn.2xlarge | append `/spot=false` |
| `publish.yml` | g4dn.2xlarge | append `/spot=false` |

One-line change per file; no other workflow logic touched.
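As a rough illustration of what "append `/spot=false`" means for a slash-separated runner spec, here is a minimal Python sketch. The helper name `force_on_demand` and the example label (including the `family=` segment and the run id `123`) are hypothetical; the actual `runs-on` strings in these workflows are not shown in this PR description.

```python
def force_on_demand(label: str) -> str:
    """Return the runner spec with exactly one trailing spot=false segment.

    Assumes the spec is a '/'-separated list of key=value segments
    (an assumption made for this sketch).
    """
    # Drop empty segments and any existing spot=... setting.
    parts = [p for p in label.split("/") if p and not p.startswith("spot=")]
    # Append the on-demand setting described in the table above.
    parts.append("spot=false")
    return "/".join(parts)


# Hypothetical label, for illustration only.
example = "runs-on=123/family=g4dn.2xlarge"
print(force_on_demand(example))
```

The helper is idempotent, so running it against a spec that already ends in `/spot=false` leaves the spec unchanged.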

🤖 Generated with Claude Code

The weekly link checker (#933) flags 8 errors out of ~25k links, all false positives or harmless artifacts on non-content pages:

- IEEE Xplore returns "202 Accepted" (anti-bot) for a valid DOI cited in zreferences.html -> add 202 to --accept.
- genindex / search / prf-prf are auto-generated utility pages with no source notebook, so the theme's "Download Notebook" button points at a nonexistent _notebooks/<page>.ipynb and renders a second href="None" -> --exclude-path those three pages.
- A Journal of Derivatives DOI redirects into a login/paywall loop that exceeds max-redirects; the citation itself is valid -> --exclude it.

Configuration is kept inline in the workflow args (rather than a lychee.toml) because lychee runs against the gh-pages checkout, which does not contain repo-root config files.

Closes #933

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

lychee treats --exclude-path values as regular expressions, so the unescaped dots in genindex.html / search.html / prf-prf.html were regex wildcards and the patterns were unanchored. Escape the dot and anchor the end ('<name>\.html$') so each matches only the intended generated page.

Addresses Copilot review feedback on #934.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
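The difference between the two pattern styles can be sketched with Python's `re` module, which treats `.`, `\.` and `$` the same way for patterns this simple (lychee itself is a Rust tool, so this is an analogy rather than its actual matching code). The candidate paths below are hypothetical and chosen only to show the over-matching.

```python
import re

# As first written: '.' matches any character and nothing anchors the end.
LOOSE = re.compile(r"genindex.html")
# After the fix: literal dot, anchored at the end of the path.
STRICT = re.compile(r"genindex\.html$")


def is_excluded(pattern: re.Pattern, path: str) -> bool:
    """Unanchored search, i.e. the pattern may match anywhere in the path."""
    return pattern.search(path) is not None


# Hypothetical paths, for illustration only.
candidates = [
    "_build/html/genindex.html",      # the intended generated page
    "notes/genindex-html-tips.html",  # '.' as wildcard matches the '-'
    "genindex.html.bak",              # no end anchor, so the prefix matches
]

results = {p: (is_excluded(LOOSE, p), is_excluded(STRICT, p)) for p in candidates}
for path, (loose, strict) in results.items():
    print(f"{path}: loose={loose} strict={strict}")
```

Under these assumptions the loose pattern excludes all three paths, while the escaped and end-anchored pattern excludes only the intended page.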

AWS spot reclamation has been interrupting the g4dn.2xlarge GPU notebook builds mid-run, discarding the whole build. Add spot=false to the RunsOn runner spec in all four GPU workflows (cache, ci, collab, publish) so they run on on-demand instances. Rolls out the org-wide decision in QuantEcon/meta#330; mirrors QuantEcon/lecture-jax#327.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Copilot

Pull request overview

This PR updates GitHub Actions workflow configuration to reduce GPU build interruptions by forcing on-demand GPU runner allocation (disabling spot instances). It also adjusts the scheduled link checker workflow’s lychee arguments to reduce known false positives when checking the published gh-pages HTML output.

Changes:

- Append /spot=false to the runs-on runner spec for the g4dn.2xlarge GPU workflows (cache.yml, ci.yml, collab.yml, publish.yml) to ensure on-demand instances.
- Expand linkcheck.yml lychee CLI arguments to accept additional HTTP statuses and exclude a small set of known-noise pages/DOI.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| .github/workflows/cache.yml | Forces on-demand `g4dn.2xlarge` runner usage via `/spot=false`. |
| .github/workflows/ci.yml | Forces on-demand `g4dn.2xlarge` runner usage via `/spot=false`. |
| .github/workflows/collab.yml | Forces on-demand `g4dn.2xlarge` runner usage via `/spot=false`. |
| .github/workflows/publish.yml | Forces on-demand `g4dn.2xlarge` runner usage via `/spot=false`. |
| .github/workflows/linkcheck.yml | Refines lychee linkcheck args to reduce documented false positives. |

github-actions · 2026-06-26T07:41:43Z

📖 Netlify Preview Ready!

Preview URL: https://pr-936--sunny-cactus-210e3e.netlify.app

Commit: b8140a9

Build Info

Workflow: Build Project [using jupyter-book]

mmcky and others added 3 commits June 26, 2026 15:31

Copilot AI review requested due to automatic review settings June 26, 2026 07:16

Copilot started reviewing on behalf of mmcky June 26, 2026 07:16

Merge branch 'main' into disable-spot-gpu-runners

b8140a9

Copilot AI reviewed Jun 26, 2026

mmcky merged commit 0567c05 into main Jun 26, 2026
1 check passed

mmcky deleted the disable-spot-gpu-runners branch June 26, 2026 07:53

This was referenced Jun 26, 2026

Full-build failures from in-lecture pip packages (prettytable 3.18.0, arviz) — fix in-lecture #935

Closed

⬆️ Bump anaconda from 2025.12 to 2026.06 #923

Merged

mmcky merged 4 commits into main from disable-spot-gpu-runners
