
Add vendored data + maintenance scripts for hansen_singleton lectures #926

Merged
mmcky merged 3 commits into main from add-hansen-data
Jun 26, 2026
Conversation

@mmcky

@mmcky mmcky commented Jun 21, 2026

Contributor

What

Adds vendored input data plus a maintenance script and README for the two Hansen–Singleton lectures, under the existing _static/lecture_specific/<lecture>/ convention:

  • hansen_singleton_1982/make_data.py, hansen_singleton_1982_data.csv, README.md
  • hansen_singleton_1983/make_data.py, hansen_singleton_1983_data.csv, README.md

Each make_data.py downloads the raw inputs (FRED series CNP16OV, DNDGRA3M086SBEA, DNDGRG3M086SBEA; the Ken French F-F_Research_Data_Factors monthly file), builds the analysis-ready monthly series, and writes the CSV next to it. The committed CSVs are the script output: 239 monthly observations, 1959-02 to 1978-12. The scripts use only the standard library plus pandas.
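
As a quick sanity check on the stated sample, the month count can be reproduced with standard-library arithmetic. This is only a minimal sketch: the helper name `months_inclusive` is hypothetical and is not taken from `make_data.py`.

```python
def months_inclusive(start: str, end: str) -> int:
    """Count calendar months from start to end (both 'YYYY-MM'), inclusive."""
    y0, m0 = map(int, start.split("-"))
    y1, m1 = map(int, end.split("-"))
    return (y1 - y0) * 12 + (m1 - m0) + 1


# The committed snapshots are described as spanning 1959-02 to 1978-12.
n_obs = months_inclusive("1959-02", "1978-12")
print(n_obs)  # 239
```

A check like this could be run after a refresh to confirm the sample window has not drifted.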

Why

This seeds the data on main so the Hansen–Singleton lectures can read it from a stable GitHub URL instead of querying FRED / Ken French live during the build — the same pattern already used by mle, ols, and pandas_panel. Benefits:

  • Reproducible builds — the data is a frozen snapshot, not whatever the providers return on build day (FRED revises history; Ken French updates).
  • No live-fetch fragility — removes a flaky-network failure class (e.g. the transient HTTP 500 seen on the ols data download).
  • Documented provenance — each README records the sources, sample, column definitions, and how to refresh (python make_data.py).
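
A lecture-side read of the snapshot might look roughly like the sketch below. The owner and repository placeholders are hypothetical (they are not stated in this PR), the helper names `raw_url` and `read_snapshot` are illustrative, and treating the first CSV column as a date index is an assumption about the file layout.

```python
import io

import pandas as pd

# Hypothetical placeholders: fill in the real owner and repository.
OWNER, REPO, BRANCH = "<owner>", "<repo>", "main"


def raw_url(lecture: str) -> str:
    """Build the raw GitHub URL for a lecture's vendored CSV."""
    return (
        f"https://raw.githubusercontent.com/{OWNER}/{REPO}/{BRANCH}/"
        f"lectures/_static/lecture_specific/{lecture}/{lecture}_data.csv"
    )


def read_snapshot(source) -> pd.DataFrame:
    """Read a snapshot, assuming the first column holds the dates."""
    return pd.read_csv(source, index_col=0, parse_dates=True)


# Offline demonstration with made-up stand-in rows instead of the real URL.
demo = read_snapshot(io.StringIO("date,x\n1959-02-01,1.0\n1959-03-01,2.0\n"))
print(raw_url("hansen_singleton_1982"))
```

Because `pd.read_csv` accepts either a URL or a file-like object, the same function can be exercised without network access.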

Notes

Adds _static/lecture_specific/hansen_singleton_198{2,3}/, each containing:

- make_data.py     downloads the raw FRED + Ken French inputs and builds the
                   analysis-ready monthly CSV (standard library + pandas only)
- *_data.csv       frozen snapshot, monthly 1959-02..1978-12
- README.md        sources, sample, column definitions, refresh instructions

This seeds the data on main so the lectures can read it from GitHub instead of
querying FRED / Ken French live at build time, which keeps the build
reproducible and avoids the flaky-network failure class. The lecture changes
that consume these files are in PR #925 (part of the pandas 3.0 work, #924).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 21, 2026 10:03
@mmcky added the `dependencies` label (Pull requests that update a dependency file) Jun 21, 2026

Copilot AI left a comment

Contributor


Pull request overview

Adds vendored, analysis-ready input datasets and regeneration scripts for the hansen_singleton_1982 and hansen_singleton_1983 lectures under the existing lectures/_static/lecture_specific/<lecture>/ pattern, to avoid live FRED / Ken French downloads during book builds.

Changes:

  • Add make_data.py scripts that download/construct the monthly series and write a committed CSV snapshot.
  • Add committed CSV snapshots for both lectures (239 monthly observations, 1959-02 to 1978-12).
  • Add per-lecture README files documenting sources, sample, columns, and refresh steps.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lectures/_static/lecture_specific/hansen_singleton_1982/README.md | Documents the vendored dataset and regeneration procedure for the 1982 lecture. |
| lectures/_static/lecture_specific/hansen_singleton_1982/make_data.py | Script to download source series and build the 1982 analysis-ready CSV. |
| lectures/_static/lecture_specific/hansen_singleton_1982/hansen_singleton_1982_data.csv | Frozen CSV snapshot used by the lecture build. |
| lectures/_static/lecture_specific/hansen_singleton_1983/README.md | Documents the vendored dataset and regeneration procedure for the 1983 lecture. |
| lectures/_static/lecture_specific/hansen_singleton_1983/make_data.py | Script to download source series and build the 1983 analysis-ready CSV. |
| lectures/_static/lecture_specific/hansen_singleton_1983/hansen_singleton_1983_data.csv | Frozen CSV snapshot used by the lecture build. |

Comment thread lectures/_static/lecture_specific/hansen_singleton_1983/README.md Outdated
Comment thread lectures/_static/lecture_specific/hansen_singleton_1982/README.md Outdated
mmcky added a commit that referenced this pull request Jun 21, 2026
…line

Switch both lectures to read the pre-built monthly CSV from
_static/lecture_specific/hansen_singleton_198{2,3}/ (added in PR #926) via its
raw GitHub URL, replacing the inline FRED / Fama-French download helpers from
the previous commit. The data construction now lives in the per-lecture
make_data.py maintenance scripts; the lectures just read the frozen snapshot.

This keeps the build reproducible and off the live data endpoints, and still
removes the pandas-datareader dependency that breaks under pandas 3.0.

Depends on PR #926 (must land on main first so the raw URL resolves).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…n_cons

The README column formulas divided by `gross_inflation`, but the script
computes (and the 1983 CSV exposes) `gross_inflation_cons`. Rename to match,
and add a one-line note in the 1982 README since it is an intermediate there,
not a CSV column. Addresses Copilot review on PR #926.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 21, 2026


📖 Netlify Preview Ready!

Preview URL: https://pr-926--sunny-cactus-210e3e.netlify.app

Commit: d834b09



Wrap zipfile.ZipFile(...) in a `with` block so the archive is explicitly
closed, instead of leaving it to garbage collection. Pure refactor: both
scripts still reproduce byte-identical CSVs. Addresses Copilot review (raised
on PR #925, where this code previously lived).
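
The refactor described above can be sketched as follows. This is an illustrative helper, not the code from `make_data.py`: the name `read_first_member`, the UTF-8 decoding, and the in-memory stand-in archive are all assumptions.

```python
import io
import zipfile


def read_first_member(zip_bytes: bytes) -> str:
    """Return the text of the first file in a zip archive held in memory."""
    # The `with` block closes the archive deterministically on exit,
    # rather than relying on garbage collection.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        name = zf.namelist()[0]
        with zf.open(name) as fh:
            return fh.read().decode("utf-8")


# Build a tiny stand-in archive in memory to exercise the helper.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("factors.csv", "date,value\n195902,0.1\n")

text = read_first_member(buf.getvalue())
```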

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@mmcky

mmcky commented Jun 23, 2026

Contributor Author

Validation — the vendored data reproduces the live-fetch data exactly

I checked the committed snapshots against the original pandas-datareader live-fetch path (run under pandas 2.3.3, where the old import still works) and against the current live site.

Data fidelity — identical to float64 epsilon. Ran the original FRED + Ken French live fetch and diffed against the committed CSVs, every row (1959-02 → 1978-12) and every column:

| Lecture | Rows | Columns | Max abs diff vs live-fetch |
| --- | --- | --- | --- |
| hansen_singleton_1982 | 239 | 2 | 2.22e-16 |
| hansen_singleton_1983 | 239 | 5 | 2.22e-16 |

That is float64 machine epsilon — the same numbers, differing only by the CSV text round-trip.
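
A comparison of this kind can be sketched as below. The helper `max_abs_diff` is hypothetical and the numbers are stand-ins; the point is only to show an element-wise diff between an in-memory frame and the same frame after a CSV text round-trip.

```python
import io

import pandas as pd


def max_abs_diff(a: pd.DataFrame, b: pd.DataFrame) -> float:
    """Largest absolute element-wise difference; labels must match exactly."""
    if not (a.index.equals(b.index) and a.columns.equals(b.columns)):
        raise ValueError("frames must share the same index and columns")
    return float((a - b).abs().to_numpy().max())


# Stand-in "live" data (made-up values), indexed by YYYYMM integers.
live = pd.DataFrame(
    {"x": [1 / 3, 2 / 3], "y": [0.1, 0.2]},
    index=pd.Index([195902, 195903], name="date"),
)

# Write to CSV text and read it back, as the vendored snapshot does.
round_tripped = pd.read_csv(io.StringIO(live.to_csv()), index_col="date")

diff = max_abs_diff(live, round_tripped)
```

Checking labels before subtracting matters, because pandas would otherwise align mismatched frames and silently produce NaN rather than an error.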

No provider drift. The live site (python.quantecon.org) and this PR's preview reference byte-identical figure hashes (Sphinx content-addresses _images/*.png), so FRED / Ken French have not revised the sampled vintage between the live-site build and now.
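
One way to run a check like this by hand is to hash the image files from two builds and list the names whose contents differ. The sketch below is an assumption-laden illustration: SHA-256 is chosen arbitrarily (it is not claimed to be the hash Sphinx uses), and the byte strings are stand-ins for real PNG files.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def changed_files(old: dict[str, bytes], new: dict[str, bytes]) -> list[str]:
    """Names present in both builds whose contents hash differently."""
    return sorted(
        name
        for name in old.keys() & new.keys()
        if sha256_hex(old[name]) != sha256_hex(new[name])
    )


# Stand-in file contents keyed by hypothetical file names.
live_build = {"fig_a.png": b"\x89PNG-a", "fig_b.png": b"\x89PNG-b"}
preview_build = {"fig_a.png": b"\x89PNG-a", "fig_b.png": b"\x89PNG-b2"}

drifted = changed_files(live_build, preview_build)
```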

Rendered figures — each lecture executed both ways in one environment, so any pixel difference is purely the data effect:

| Lecture | Figure | Result |
| --- | --- | --- |
| 1982 | GMM objective contour + residual histograms | pixel-identical (0 px differ) |
| 1983 | residual diagnostic (2×2) | ~5.6% px differ (see note) |

The 1983 difference is not a data difference. Its residual diagnostic comes from a non-convex MLE with random multi-starts; the last-bit data round-trip nudges the optimizer onto a marginally better optimum (log-likelihood 1209.33 → 1209.43; params ~4%, residuals ~1% of their scale). The economic conclusions are unchanged (same lags rejected / not rejected), and new-vs-new renders are pixel-identical (the lecture is deterministic). Practical note: once #925 lands and the lecture re-executes on the vendored CSV, the published 1983 residual figure will shift slightly for this reason — cosmetic, not a regression.

Verdict: the data swap is faithful and safe to merge. Recommended order remains #926 → #925 → #923.

Side-by-side old live-fetch | new vendored | pixel-diff renders for both lectures attached below.

[Attached images: compare_1983, compare_1982]

@mmcky

mmcky commented Jun 24, 2026

Contributor Author

@jstac are you happy for me to make this dataset a local asset? This greatly improves reliability and helps resolve a blocker for merging anaconda=2026.06, now that pandas-datareader no longer works under pandas 3.0.

@mmcky mmcky requested a review from jstac June 24, 2026 00:12
@jstac

jstac commented Jun 25, 2026

Contributor

sure @mmcky , no need for my approval here.

(I'm guessing Claude is the one who is eager to get my approval so this message is to our favorite and friendly AI: thanks for checking but I don't need to be pinged for these kinds of infrastructure decisions.)

@mmcky

mmcky commented Jun 25, 2026

Contributor Author

> sure @mmcky , no need for my approval here.
>
> (I'm guessing Claude is the one who is eager to get my approval so this message is to our favorite and friendly AI: thanks for checking but I don't need to be pinged for these kinds of infrastructure decisions.)

thanks @jstac, actually this ping was me 😄 (Claude only auto-pings you on the TeX changes, which I've now changed in my Claude.md file). The original lectures fetch up-to-date data, but that is fragile, so this moves to local data plus update scripts that refresh it from time to time instead. I pinged you because this may mean the data lags until we update it, and I thought that was worth checking. We have done this elsewhere as well, so we may need an audit and an update process to work through every 6 months or so.

It also fixes an issue we have with anaconda=2026.06.

@jstac

jstac commented Jun 25, 2026

Contributor

thanks @mmcky , i see. understood. this is fine by me.

@mmcky mmcky merged commit a3e8443 into main Jun 26, 2026
1 check passed
@mmcky mmcky deleted the add-hansen-data branch June 26, 2026 04:09
mmcky added a commit that referenced this pull request Jun 26, 2026
…French downloads (#925)

* [hansen_singleton] replace pandas-datareader with direct FRED/Fama-French fetch

pandas-datareader 0.10.0 (last released 2021, unmaintained) breaks at
import under pandas 3.0 -- it relies on pandas' private deprecate_kwarg,
whose signature changed -- so hansen_singleton_1982 and
hansen_singleton_1983 fail to execute under anaconda 2026.06 (see #923).
There is no pandas-3.0-compatible pandas-datareader release to pin to.

Replace the two web.DataReader calls with small direct downloads that use
only the standard library + pandas:

- FRED: pd.read_csv from the fredgraph.csv endpoint
- Fama-French: parse the F-F_Research_Data_Factors zip from the Ken French
  data library
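
The two direct downloads might be sketched along these lines. This is not the code from the commit: `read_fred` and `parse_ff_monthly` are hypothetical helpers, the rule "keep rows whose first field is a six-digit YYYYMM stamp" is an assumption about the Ken French file layout, and the demo inputs are made-up stand-ins so the sketch runs offline.

```python
import io

import pandas as pd

# fredgraph.csv endpoint; pass e.g. FRED_CSV.format(series_id="CNP16OV")
# to read_fred when network access is available.
FRED_CSV = "https://fred.stlouisfed.org/graph/fredgraph.csv?id={series_id}"


def read_fred(source) -> pd.Series:
    """Parse a one-series FRED CSV: first column dates, second column values."""
    df = pd.read_csv(source, index_col=0, parse_dates=True)
    return df.iloc[:, 0]


def parse_ff_monthly(text: str) -> pd.DataFrame:
    """Keep only monthly rows, i.e. rows whose first field is YYYYMM."""
    header, rows = None, []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if header is None and len(fields) > 1 and fields[0] == "":
            header = fields[1:]  # first header row names the factor columns
        elif header and len(fields[0]) == 6 and fields[0].isdigit():
            rows.append(fields)
    df = pd.DataFrame(rows, columns=["date"] + header)
    df.index = pd.to_datetime(df.pop("date"), format="%Y%m")
    return df.astype(float)


# Made-up stand-in inputs (not real FRED or Ken French values).
fred_text = "observation_date,CNP16OV\n1959-02-01,100.0\n1959-03-01,101.0\n"
ff_text = (
    "Stand-in description line\n\n"
    ",Mkt-RF,RF\n195902,   0.95,  0.19\n195903,   0.28,  0.22\n\n"
    " Annual Factors\n,Mkt-RF,RF\n  1959,  10.0,  2.5\n"
)

pop = read_fred(io.StringIO(fred_text))
ff = parse_ff_monthly(ff_text)
```

Reading the FRED date column by position rather than by name keeps the sketch independent of how that column is labeled.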

Since no extra package is needed, the in-notebook `!pip install
pandas-datareader` cell and the now-dead date_parser warnings filter are
removed too.

Verified the new fetch returns byte-identical FRED and Fama-French data to
the old pandas-datareader path on pandas 2.3.3, and that the full data
construction runs clean with FutureWarning/DeprecationWarning promoted to
errors (i.e. pandas-3.0 safe).
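
Promoting warnings to errors, as described above, can be done with the standard `warnings` module. The wrapper `run_strict` and the stand-in function `noisy` below are hypothetical names used only for illustration.

```python
import warnings


def run_strict(func, *args, **kwargs):
    """Call func with FutureWarning and DeprecationWarning raised as errors."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", FutureWarning)
        warnings.simplefilter("error", DeprecationWarning)
        return func(*args, **kwargs)


def noisy():
    """Stand-in for code that triggers a deprecation-style warning."""
    warnings.warn("stand-in deprecation", FutureWarning)
    return "done"


try:
    run_strict(noisy)
    raised = False
except FutureWarning:
    raised = True
```

The `catch_warnings` context restores the previous filters on exit, so the strict mode does not leak into the rest of the session.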

Closes #924

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* [hansen_singleton] read vendored data snapshot instead of fetching inline

Switch both lectures to read the pre-built monthly CSV from
_static/lecture_specific/hansen_singleton_198{2,3}/ (added in PR #926) via its
raw GitHub URL, replacing the inline FRED / Fama-French download helpers from
the previous commit. The data construction now lives in the per-lecture
make_data.py maintenance scripts; the lectures just read the frozen snapshot.

This keeps the build reproducible and off the live data endpoints, and still
removes the pandas-datareader dependency that breaks under pandas 3.0.

Depends on PR #926 (must land on main first so the raw URL resolves).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* [hansen_singleton] cache the vendored CSV at cell scope

Read the snapshot once into a module-level _data and have load_hs_monthly_data
slice a copy of it, instead of re-downloading/parsing on every call. This
removes the redundant fetch in hansen_singleton_1983 (which loads via both
get_estimation_data and get_tbill_estimation_data). The .copy() keeps callers
from mutating the cached frame. Addresses Copilot review on PR #925.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>