Add vendored data + maintenance scripts for hansen_singleton lectures #926
Adds _static/lecture_specific/hansen_singleton_198{2,3}/, each containing:
- make_data.py downloads the raw FRED + Ken French inputs and builds the
analysis-ready monthly CSV (standard library + pandas only)
- *_data.csv frozen snapshot, monthly 1959-02..1978-12
- README.md sources, sample, column definitions, refresh instructions
This seeds the data on main so the lectures can read it from GitHub instead of
querying FRED / Ken French live at build time, which keeps the build
reproducible and avoids the flaky-network failure class. The lecture changes
that consume these files are in PR #925 (part of the pandas 3.0 work, #924).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
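For readers who have not opened the scripts, the FRED half of a `make_data.py`-style download could look roughly like the sketch below. It is not the committed script: the helper names (`fred_url`, `parse_fred_csv`, `fetch_fred`) are hypothetical, the URL pattern is an assumption about FRED's `fredgraph.csv` export, and the demo values are placeholders rather than real FRED observations.

```python
import io

import pandas as pd

# Assumed URL pattern for FRED's CSV export; check it before relying on it.
FRED_CSV_URL = "https://fred.stlouisfed.org/graph/fredgraph.csv?id={series_id}"


def fred_url(series_id: str) -> str:
    """Build the (assumed) CSV download URL for one FRED series."""
    return FRED_CSV_URL.format(series_id=series_id)


def parse_fred_csv(text: str) -> pd.Series:
    """Parse FRED-style CSV text (date column first, one value column)."""
    frame = pd.read_csv(
        io.StringIO(text),
        index_col=0,        # first column holds the observation dates
        parse_dates=True,
        na_values=".",      # treat a bare "." as a missing observation
    )
    return frame.iloc[:, 0].astype(float)


def fetch_fred(series_id: str) -> pd.Series:
    """Download one series; needs network access, so it is not called here."""
    frame = pd.read_csv(
        fred_url(series_id), index_col=0, parse_dates=True, na_values="."
    )
    return frame.iloc[:, 0].astype(float)


# Offline demo with placeholder values (not real FRED data).
demo = parse_fred_csv(
    "observation_date,DEMO\n1959-01-01,1.0\n1959-02-01,.\n1959-03-01,3.0\n"
)
```

Keeping the parsing step separate from the download makes it possible to exercise the script logic without touching the live endpoint.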
Pull request overview
Adds vendored, analysis-ready input datasets and regeneration scripts for the hansen_singleton_1982 and hansen_singleton_1983 lectures under the existing lectures/_static/lecture_specific/<lecture>/ pattern, to avoid live FRED / Ken French downloads during book builds.
Changes:
- Add `make_data.py` scripts that download/construct the monthly series and write a committed CSV snapshot.
- Add committed CSV snapshots for both lectures (239 monthly observations, 1959-02 to 1978-12).
- Add per-lecture README files documenting sources, sample, columns, and refresh steps.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| lectures/_static/lecture_specific/hansen_singleton_1982/README.md | Documents the vendored dataset and regeneration procedure for the 1982 lecture. |
| lectures/_static/lecture_specific/hansen_singleton_1982/make_data.py | Script to download source series and build the 1982 analysis-ready CSV. |
| lectures/_static/lecture_specific/hansen_singleton_1982/hansen_singleton_1982_data.csv | Frozen CSV snapshot used by the lecture build. |
| lectures/_static/lecture_specific/hansen_singleton_1983/README.md | Documents the vendored dataset and regeneration procedure for the 1983 lecture. |
| lectures/_static/lecture_specific/hansen_singleton_1983/make_data.py | Script to download source series and build the 1983 analysis-ready CSV. |
| lectures/_static/lecture_specific/hansen_singleton_1983/hansen_singleton_1983_data.csv | Frozen CSV snapshot used by the lecture build. |
…line
Switch both lectures to read the pre-built monthly CSV from
_static/lecture_specific/hansen_singleton_198{2,3}/ (added in PR #926) via its
raw GitHub URL, replacing the inline FRED / Fama-French download helpers from
the previous commit. The data construction now lives in the per-lecture
make_data.py maintenance scripts; the lectures just read the frozen snapshot.
This keeps the build reproducible and off the live data endpoints, and still
removes the pandas-datareader dependency that breaks under pandas 3.0.
Depends on PR #926 (must land on main first so the raw URL resolves).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…n_cons
The README column formulas divided by `gross_inflation`, but the script computes (and the 1983 CSV exposes) `gross_inflation_cons`. Rename to match, and add a one-line note in the 1982 README since it is an intermediate there, not a CSV column.
Addresses Copilot review on PR #926.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Wrap zipfile.ZipFile(...) in a `with` block so the archive is explicitly closed, instead of leaving it to garbage collection. Pure refactor: both scripts still reproduce byte-identical CSVs. Addresses Copilot review (raised on PR #925, where this code previously lived). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
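As an illustration of the refactor described above, here is a minimal standard-library sketch of reading one member from a zip archive inside `with` blocks. The function name `read_zip_member`, the demo archive, and the `latin-1` decoding are assumptions made for this example; the actual `make_data.py` code may differ.

```python
import io
import zipfile


def read_zip_member(zip_bytes: bytes, suffix: str = ".csv") -> str:
    """Return the text of the first archive member whose name ends with `suffix`."""
    # The outer `with` closes the archive on exit, even if reading raises.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        name = next(n for n in archive.namelist() if n.lower().endswith(suffix))
        # The inner `with` closes the member's file handle as well.
        with archive.open(name) as member:
            return member.read().decode("latin-1")  # encoding is an assumption


def make_demo_zip() -> bytes:
    """Build a tiny in-memory archive with placeholder content (not real data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo_factors.csv", "date,value\n195902,0.1\n")
    return buffer.getvalue()


text = read_zip_member(make_demo_zip())
```

The behaviour is the same as opening the archive without a context manager; the difference is only that the close happens at a known point instead of whenever the object is garbage collected.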
Validation: the vendored data reproduces the live-fetch data exactly

I checked the committed snapshots against the original

Data fidelity: identical to float64 epsilon. Ran the original FRED + Ken French live fetch and diffed against the committed CSVs, every row (1959-02 → 1978-12) and every column:

That is float64 machine epsilon: the same numbers, differing only by the CSV text round-trip. No provider drift. The live site (

Rendered figures: each lecture executed both ways in one environment, so any pixel difference is purely the data effect:

The 1983 difference is not a data difference. Its residual diagnostic comes from a non-convex MLE with random multi-starts; the last-bit data round-trip nudges the optimizer onto a marginally better optimum (log-likelihood 1209.33 → 1209.43; params ~4%, residuals ~1% of their scale). The economic conclusions are unchanged (same lags rejected / not rejected), and new-vs-new renders are pixel-identical (the lecture is deterministic).

Practical note: once #925 lands and the lecture re-executes on the vendored CSV, the published 1983 residual figure will shift slightly for this reason. This is cosmetic, not a regression.

Verdict: the data swap is faithful; safe to merge. Recommended order remains #926 → #925 → #923.

Side-by-side
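A fidelity check of the kind described above can be sketched as follows. This is not the script that produced the numbers in this comment: the helper `max_abs_diff` is hypothetical, and the frame is synthetic random data with the same 239-row length, used only to show how a CSV text round-trip is compared against the in-memory values.

```python
import io

import numpy as np
import pandas as pd


def max_abs_diff(a: pd.DataFrame, b: pd.DataFrame) -> float:
    """Largest absolute element-wise difference; raises if the labels differ."""
    if not (a.index.equals(b.index) and a.columns.equals(b.columns)):
        raise ValueError("frames are not aligned")
    return float((a - b).abs().to_numpy().max())


# Synthetic stand-in for the live-fetch frame (239 rows, like the snapshot).
rng = np.random.default_rng(0)
live = pd.DataFrame(
    {"x": 1.0 + rng.normal(scale=0.01, size=239), "y": rng.normal(size=239)}
)

csv_text = live.to_csv(index=False)

# Default float parsing: fast, and may differ from the source in the last bits.
snapshot = pd.read_csv(io.StringIO(csv_text))
default_diff = max_abs_diff(live, snapshot)

# Round-trip parsing: slower, intended to recover the written values exactly.
snapshot_rt = pd.read_csv(io.StringIO(csv_text), float_precision="round_trip")
round_trip_diff = max_abs_diff(live, snapshot_rt)
```

If last-bit differences matter for a downstream optimizer, as in the 1983 case, reading the snapshot with `float_precision="round_trip"` is one option worth considering.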
@jstac are you happy for me to make this dataset a local asset? This greatly improves reliability and helps resolve a blocker for merging anaconda=2026.06 with the now-removed data reader from pandas.
sure @mmcky , no need for my approval here. (I'm guessing Claude is the one who is eager to get my approval so this message is to our favorite and friendly AI: thanks for checking but I don't need to be pinged for these kinds of infrastructure decisions.)
thanks @jstac -- actually this ping was me 😄 (claude only auto-pings you on the TeX changes, which I've now changed in my Claude.md file). The original lectures fetch up-to-date data but are fragile -- so this moves to local data plus update scripts that refresh it from time to time instead. I pinged you as this may mean the data lags -- until we update it -- and I thought that was worth checking. We have done this elsewhere as well -- so we may need an audit and an update process to work through every 6 months or so. It also fixes an issue we have with anaconda=2026.06.
thanks @mmcky , i see. understood. this is fine by me.
…French downloads (#925)

* [hansen_singleton] replace pandas-datareader with direct FRED/Fama-French fetch

  pandas-datareader 0.10.0 (last released 2021, unmaintained) breaks at import under pandas 3.0 -- it relies on pandas' private deprecate_kwarg, whose signature changed -- so hansen_singleton_1982 and hansen_singleton_1983 fail to execute under anaconda 2026.06 (see #923). There is no pandas-3.0-compatible pandas-datareader release to pin to.

  Replace the two web.DataReader calls with small direct downloads that use only the standard library + pandas:
  - FRED: pd.read_csv from the fredgraph.csv endpoint
  - Fama-French: parse the F-F_Research_Data_Factors zip from the Ken French data library

  Since no extra package is needed, the in-notebook `!pip install pandas-datareader` cell and the now-dead date_parser warnings filter are removed too.

  Verified the new fetch returns byte-identical FRED and Fama-French data to the old pandas-datareader path on pandas 2.3.3, and that the full data construction runs clean with FutureWarning/DeprecationWarning promoted to errors (i.e. pandas-3.0 safe).

  Closes #924

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* [hansen_singleton] read vendored data snapshot instead of fetching inline

  Switch both lectures to read the pre-built monthly CSV from _static/lecture_specific/hansen_singleton_198{2,3}/ (added in PR #926) via its raw GitHub URL, replacing the inline FRED / Fama-French download helpers from the previous commit. The data construction now lives in the per-lecture make_data.py maintenance scripts; the lectures just read the frozen snapshot.

  This keeps the build reproducible and off the live data endpoints, and still removes the pandas-datareader dependency that breaks under pandas 3.0.

  Depends on PR #926 (must land on main first so the raw URL resolves).

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* [hansen_singleton] cache the vendored CSV at cell scope

  Read the snapshot once into a module-level _data and have load_hs_monthly_data slice a copy of it, instead of re-downloading/parsing on every call. This removes the redundant fetch in hansen_singleton_1983 (which loads via both get_estimation_data and get_tbill_estimation_data). The .copy() keeps callers from mutating the cached frame.

  Addresses Copilot review on PR #925.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>


What
Adds vendored input data plus a maintenance script and README for the two Hansen–Singleton lectures, under the existing `_static/lecture_specific/<lecture>/` convention:
- `hansen_singleton_1982/`: `make_data.py`, `hansen_singleton_1982_data.csv`, `README.md`
- `hansen_singleton_1983/`: `make_data.py`, `hansen_singleton_1983_data.csv`, `README.md`

Each `make_data.py` downloads the raw inputs (FRED series `CNP16OV`, `DNDGRA3M086SBEA`, `DNDGRG3M086SBEA`; the Ken French `F-F_Research_Data_Factors` monthly file), builds the analysis-ready monthly series, and writes the CSV next to it. The committed CSVs are the script output: 239 monthly observations, 1959-02 to 1978-12. The scripts use only the standard library plus pandas.

Why
This seeds the data on `main` so the Hansen–Singleton lectures can read it from a stable GitHub URL instead of querying FRED / Ken French live during the build, the same pattern already used by `mle`, `ols`, and `pandas_panel`. Benefits:
- …`ols` data download).
- …`python make_data.py`).

Notes
- …`hansen_singleton_198{2,3}` to read these files (and removes the `pandas-datareader` dependency that breaks under pandas 3.0) is in PR #925 ([hansen_singleton] Replace pandas-datareader with direct FRED / Fama-French downloads), part of the pandas 3.0 work tracked in #924 (pandas 3.0 (anaconda 2026.06) breaks hansen_singleton lectures via pandas-datareader). Merge this PR first, then #925.