[hansen_singleton] Replace pandas-datareader with direct FRED / Fama-French downloads#925

Merged: mmcky merged 3 commits into main from fix-hansen-pandas-datareader on Jun 26, 2026

@mmcky mmcky commented Jun 21, 2026

Closes #924.

Stacked on #926 — merge that first. This PR reads a vendored data snapshot that #926 adds to main. Until #926 lands, this PR's CI will fail (the …/main/… data URL 404s); it goes green once #926 is merged.

Problem

hansen_singleton_1982 and hansen_singleton_1983 fail to execute under anaconda 2026.06 / pandas 3.0 (surfaced by the forced full execution in #923). Both !pip install pandas-datareader and import it, but pandas-datareader 0.10.0 (unmaintained since 2021) relies on the private pandas API pandas.util._decorators.deprecate_kwarg, whose signature changed in pandas 3.0, so it dies at import:

TypeError: deprecate_kwarg() missing 1 required positional argument: 'new_arg_name'
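
As a rough illustration of why this fails at import rather than at call time, here is a toy sketch. The `deprecate_kwarg` below is a stand-in written for this example, not the pandas function, and the extra leading `klass` parameter is an assumption made purely to reproduce the shape of the error. The point is only that a decorator factory runs while the module body is being executed, so a call site written against an older signature raises as soon as the module is imported.

```python
# Toy stand-in for a decorator factory whose signature gained a parameter.
# The parameter names are assumptions for illustration, not the real pandas API.
def deprecate_kwarg(klass, old_arg_name, new_arg_name):
    def decorator(func):
        return func
    return decorator


try:
    # A call site written against an older two-argument signature.
    # The factory is invoked while this block runs (i.e. at import time
    # in a real module), before `reader` is ever called.
    @deprecate_kwarg("old_name", "new_name")
    def reader(new_name=None):
        return new_name

    error = None
except TypeError as exc:
    error = str(exc)

print(error)  # the message names the missing parameter
```

Because the decorated function is never reached, no amount of care at the call site of `reader` avoids the failure; the dependency itself has to change.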

Approach

Following discussion, instead of fetching from the data providers at build time, the data is vendored:

  • #926 (Add vendored data + maintenance scripts for hansen_singleton lectures) adds _static/lecture_specific/hansen_singleton_198{2,3}/, each containing a make_data.py maintenance script (builds the dataset from FRED + Ken French), the frozen *_data.csv snapshot, and a README.md.
  • This PR removes the pandas-datareader dependency (and the inline fetch helpers) and collapses each lecture's hidden data cell to a single pd.read_csv(<raw GitHub URL>) of that snapshot, selecting the columns it needs.

This matches the existing vendored-data convention used by mle, ols, and pandas_panel, keeps the build reproducible, and removes the live-fetch fragility (the flaky-network class that also bit ols).
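
A minimal sketch of what the collapsed data cell might look like is shown below. The raw GitHub URL is not reproduced here, so an in-memory CSV stands in for it. The `date` and `other_series` column names and all values are hypothetical; only `gross_inflation_cons` is a column name mentioned in this thread.

```python
import io

import pandas as pd

# Stand-in for the vendored snapshot; in the lecture this would be the
# raw GitHub URL of the *_data.csv file. Values are made-up toy numbers.
snapshot = io.StringIO(
    "date,gross_inflation_cons,other_series,unused_series\n"
    "1959-02-01,1.0,2.0,3.0\n"
    "1959-03-01,1.1,2.1,3.1\n"
)

# Read once, keeping only the columns this lecture needs.
df = pd.read_csv(
    snapshot,
    usecols=["date", "gross_inflation_cons", "other_series"],
    index_col="date",
    parse_dates=["date"],
)

print(df)
```

Selecting columns with `usecols` keeps a single shared snapshot usable by lectures that need different subsets of it.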

Verification

| Check | Result |
| --- | --- |
| make_data.py output vs old pandas-datareader path (pandas 2.3.3) | byte-identical FRED + Fama-French data |
| Lectures' code cells reading the vendored CSV, FutureWarning/DeprecationWarning → error | run clean (pandas-3.0 safe) |
| Resulting frames | 239 rows (1959-02 → 1978-12), expected columns, moments unchanged |
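
Two of these checks can be sketched in a few lines. This is an illustration of the technique, not the script that was actually run: `run_strict` and `noisy` are hypothetical helpers. The first part confirms that a monthly index from 1959-02 to 1978-12 has 239 entries; the second shows how promoting `FutureWarning` and `DeprecationWarning` to errors turns a deprecated code path into a hard failure.

```python
import warnings

import pandas as pd

# Monthly (month-start) index over the sample period quoted above.
expected_index = pd.date_range("1959-02-01", "1978-12-01", freq="MS")
n_rows = len(expected_index)  # 11 months in 1959 + 19 full years * 12


def run_strict(fn):
    """Run fn with Future/Deprecation warnings promoted to errors."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", FutureWarning)
        warnings.simplefilter("error", DeprecationWarning)
        return fn()


def noisy():
    # Hypothetical stand-in for code that hits a deprecated path.
    warnings.warn("toy deprecation", FutureWarning)
    return 1


try:
    run_strict(noisy)
    raised = False
except FutureWarning:
    raised = True

print(n_rows, raised)
```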

Net effect on the lectures: −236 / +34 lines — the data machinery moves out to the maintenance scripts.

Note

The two ar1_* lectures that also fail under a forced run are a separate, pre-existing arviz issue, out of scope here.

…ench fetch

pandas-datareader 0.10.0 (last released 2021, unmaintained) breaks at
import under pandas 3.0 -- it relies on pandas' private deprecate_kwarg,
whose signature changed -- so hansen_singleton_1982 and
hansen_singleton_1983 fail to execute under anaconda 2026.06 (see #923).
There is no pandas-3.0-compatible pandas-datareader release to pin to.

Replace the two web.DataReader calls with small direct downloads that use
only the standard library + pandas:

- FRED: pd.read_csv from the fredgraph.csv endpoint
- Fama-French: parse the F-F_Research_Data_Factors zip from the Ken French
  data library

Since no extra package is needed, the in-notebook `!pip install
pandas-datareader` cell and the now-dead date_parser warnings filter are
removed too.

Verified the new fetch returns byte-identical FRED and Fama-French data to
the old pandas-datareader path on pandas 2.3.3, and that the full data
construction runs clean with FutureWarning/DeprecationWarning promoted to
errors (i.e. pandas-3.0 safe).

Closes #924

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 21, 2026 09:46

Copilot AI left a comment

Pull request overview

This PR updates the Hansen–Singleton 1982/1983 lecture notebooks to remove the runtime dependency on pandas-datareader (which is incompatible with pandas 3.0), replacing it with direct downloads from FRED (CSV endpoint) and the Ken French data library (zip + CSV parsing) using only the standard library and pandas.

Changes:

  • Removed the in-notebook !pip install pandas-datareader and the pandas_datareader import usage.
  • Added small in-notebook helpers to download/parse FRED series and monthly Fama–French factors directly.
  • Updated lecture text to reflect the new data sources (FRED + Ken French) and the direct-download approach.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lectures/hansen_singleton_1982.md | Replaces pandas-datareader-based FRED/Fama–French fetching with direct downloads and parsing. |
| lectures/hansen_singleton_1983.md | Same migration as 1982 lecture, keeping the constructed estimation dataset consistent while avoiding pandas 3.0 breakage. |

Comment thread lectures/hansen_singleton_1983.md Outdated
Comment thread lectures/hansen_singleton_1982.md Outdated
…line

Switch both lectures to read the pre-built monthly CSV from
_static/lecture_specific/hansen_singleton_198{2,3}/ (added in PR #926) via its
raw GitHub URL, replacing the inline FRED / Fama-French download helpers from
the previous commit. The data construction now lives in the per-lecture
make_data.py maintenance scripts; the lectures just read the frozen snapshot.

This keeps the build reproducible and off the live data endpoints, and still
removes the pandas-datareader dependency that breaks under pandas 3.0.

Depends on PR #926 (must land on main first so the raw URL resolves).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mmcky commented Jun 21, 2026

Updated to the vendored-data pattern discussed: the data + maintenance script + README now live in #926, and this PR just reads the snapshot from GitHub. Merge #926 first; this PR's CI will be red until then (the /main/ data URL 404s until #926 lands), after which I'll re-trigger and confirm green.

github-actions Bot commented Jun 21, 2026

📖 Netlify Preview Ready!

Preview URL: https://pr-925--sunny-cactus-210e3e.netlify.app

Commit: 135e0dd

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread lectures/hansen_singleton_1983.md Outdated
Comment thread lectures/hansen_singleton_1982.md Outdated
Read the snapshot once into a module-level _data and have load_hs_monthly_data
slice a copy of it, instead of re-downloading/parsing on every call. This
removes the redundant fetch in hansen_singleton_1983 (which loads via both
get_estimation_data and get_tbill_estimation_data). The .copy() keeps callers
from mutating the cached frame. Addresses Copilot review on PR #925.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mmcky added a commit that referenced this pull request Jun 21, 2026
Wrap zipfile.ZipFile(...) in a `with` block so the archive is explicitly
closed, instead of leaving it to garbage collection. Pure refactor: both
scripts still reproduce byte-identical CSVs. Addresses Copilot review (raised
on PR #925, where this code previously lived).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
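
For readers unfamiliar with the pattern, a small sketch of reading a CSV out of a zip archive with context managers follows. It builds a toy archive in memory rather than downloading anything; the member name, column names, and values are invented for illustration and do not reflect the layout of the real Fama-French file.

```python
import io
import zipfile

import pandas as pd

# Build a toy archive in memory (stand-in for the downloaded bytes).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("factors.csv", "date,factor_a,factor_b\n195902,1.0,2.0\n195903,1.1,2.1\n")
payload = buf.getvalue()

# Open the archive and its member in `with` blocks so both are closed
# deterministically instead of waiting for garbage collection.
with zipfile.ZipFile(io.BytesIO(payload)) as zf:
    member = zf.namelist()[0]
    with zf.open(member) as fh:
        factors = pd.read_csv(fh, index_col="date")

print(factors)
```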
mmcky commented Jun 23, 2026

Heads-up from the data validation on #926 (full details there): the vendored series are identical to the live-fetch series to 2.2e-16, and the hansen_singleton_1982 figures render pixel-identical. One cosmetic effect to expect — when this PR re-executes hansen_singleton_1983 on the vendored CSV, its residual-diagnostic figure will shift slightly. That is the non-convex MLE multi-start landing on a marginally better optimum under the last-bit data round-trip (log-likelihood 1209.33 → 1209.43; same lags rejected / not rejected). Flagging so the figure diff on re-execution isn't a surprise; it is not a regression.
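
The 2.2e-16 figure matches float64 machine epsilon, which suggests (this is an interpretation, not something stated in the thread) that the series differ by at most about one unit in the last place. A toy check of that reading, using made-up values perturbed by exactly one representable step:

```python
import numpy as np

eps = np.finfo(np.float64).eps  # about 2.22e-16

# Toy "live" values and a copy nudged to the next representable float,
# mimicking a last-bit difference from a text round-trip.
live = np.array([1.0021, 0.9987, 1.0103])
vendored = np.nextafter(live, np.inf)

max_rel = np.max(np.abs(vendored - live) / np.abs(live))

print(max_rel <= eps)
```

Differences of this size are invisible in summary statistics but can still tip a multi-start optimizer on a non-convex objective toward a different local optimum, which is consistent with the small log-likelihood change reported above.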

mmcky added a commit that referenced this pull request Jun 26, 2026
…#926)

* Add vendored data + maintenance scripts for hansen_singleton lectures

Adds _static/lecture_specific/hansen_singleton_198{2,3}/, each containing:

- make_data.py     downloads the raw FRED + Ken French inputs and builds the
                   analysis-ready monthly CSV (standard library + pandas only)
- *_data.csv       frozen snapshot, monthly 1959-02..1978-12
- README.md        sources, sample, column definitions, refresh instructions

This seeds the data on main so the lectures can read it from GitHub instead of
querying FRED / Ken French live at build time, which keeps the build
reproducible and avoids the flaky-network failure class. The lecture changes
that consume these files are in PR #925 (part of the pandas 3.0 work, #924).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* [hansen_singleton data] fix README: gross_inflation -> gross_inflation_cons

The README column formulas divided by `gross_inflation`, but the script
computes (and the 1983 CSV exposes) `gross_inflation_cons`. Rename to match,
and add a one-line note in the 1982 README since it is an intermediate there,
not a CSV column. Addresses Copilot review on PR #926.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* [hansen_singleton data] open the Fama-French zip with a context manager

Wrap zipfile.ZipFile(...) in a `with` block so the archive is explicitly
closed, instead of leaving it to garbage collection. Pure refactor: both
scripts still reproduce byte-identical CSVs. Addresses Copilot review (raised
on PR #925, where this code previously lived).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
@mmcky mmcky merged commit 4ab3365 into main Jun 26, 2026
1 of 2 checks passed
@mmcky mmcky deleted the fix-hansen-pandas-datareader branch June 26, 2026 05:26
Development

Successfully merging this pull request may close these issues.

pandas 3.0 (anaconda 2026.06) breaks hansen_singleton lectures via pandas-datareader

2 participants