feat(docs): Add markdown-exec for executable code blocks #309

Open
chekos wants to merge 2 commits into main from feat/markdown-exec

@chekos chekos commented Mar 13, 2026

Summary

  • Adds markdown-exec plugin so Python code blocks in docs execute at build time, replacing manually-written output blocks
  • Converts 8 guide/quickstart pages (~66 code blocks) to use exec="on" source="tabbed-left" with per-page sessions
  • Adds --extra spatial to .readthedocs.yaml so geometry=True blocks can run during docs build
  • Requires CENSUS_API_KEY environment variable in ReadTheDocs admin (already configured)

Pages left unconverted due to pre-existing API bugs (get_pums 400s, load_variables wrong URL pattern, get_estimates 404s): variables, pums-microdata, population-estimates, survey-design.

Test plan

  • uv run mkdocs build --strict passes with no warnings
  • ReadTheDocs build succeeds with CENSUS_API_KEY env var set
  • Visually verify converted pages render output correctly
  • Confirm vegalite charts and Material annotations still render

🤖 Generated with Claude Code

Replace manually-written output blocks with code that runs at build time
via markdown-exec, so docs always reflect the current API behavior.

Converted pages: quickstart, acs-data, decennial-data, geography,
margins-of-error, migration-flows, multi-year, spatial.

Pages left unconverted due to pre-existing API bugs (get_pums 400s,
load_variables wrong URL pattern, get_estimates 404s): variables,
pums-microdata, population-estimates, survey-design.

Also adds --extra spatial to .readthedocs.yaml so geometry=True
blocks can execute during the docs build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

claude bot commented Mar 13, 2026

Claude finished @chekos's task in 2m 37s


PR Review: feat(docs): Add markdown-exec for executable code blocks

  • Gather context and read changed files
  • Review configuration changes (mkdocs.yml, pyproject.toml, .readthedocs.yaml)
  • Review converted documentation pages
  • Summarize findings

Overall this is a well-structured PR that dramatically reduces manually-maintained output blocks. The approach (hidden exec blocks alongside annotated display blocks in quickstart.md; source="tabbed-left" pattern elsewhere) is clean and idiomatic for markdown-exec. The configuration additions look correct. A few issues need attention before merging.

Issues

🔴 Critical: known-broken APIs used in exec blocks

The PR description explicitly calls out get_pums (400s) and get_estimates (404s) as broken, but both appear in exec blocks that will run during the docs build:

  • quickstart.md lines 259–270: get_pums() in a hidden exec block. Even with # markdown-exec: hide, the code still executes. If it raises an exception, the build will either fail (--strict mode) or show a traceback in the output tab.
  • docs/guides/spatial.md lines 62–74: get_estimates() in a source="tabbed-left" exec block. This is inside a guide that was ostensibly fully converted, but it calls one of the functions explicitly excluded from conversion.

If the local mkdocs build --strict passes, it's likely because CENSUS_API_KEY is not set locally, causing all API calls to fail with the same error, which markdown-exec may handle non-fatally. On ReadTheDocs with the key configured, these blocks will hit the broken API endpoints.

🟡 Bug: missing print() in migration-flows.md

docs/guides/migration-flows.md line 68: ca_flows.head() without print() will produce no visible output. markdown-exec captures stdout; a bare expression doesn't write to stdout in standard Python. The output tab will be blank.

# Current (produces no output):
ca_flows.head()

# Should be:
print(ca_flows.head())
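
The difference can be reproduced without markdown-exec or the Census API. The sketch below is not markdown-exec's implementation; it only assumes, as stated above, that output is collected from stdout. The helper name captured_stdout is hypothetical, and a plain list stands in for the DataFrame.

```python
import contextlib
import io


def captured_stdout(source: str) -> str:
    """Run `source` as a script and return whatever it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})  # script mode: expression values are discarded
    return buffer.getvalue()


# A list stands in for ca_flows so the sketch needs no network access.
bare = captured_stdout("rows = [1, 2, 3]\nrows[:2]")
printed = captured_stdout("rows = [1, 2, 3]\nprint(rows[:2])")

print(repr(bare))     # ''
print(repr(printed))  # '[1, 2]\n'
```

Only an interactive REPL echoes the value of a bare expression; code run as a script does not, which is why the output tab would stay empty.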


🟡 Data inconsistency: Washington state 2020 Census population

Two "2020 Decennial Census Population by State" Vega-Lite charts have different values for Washington (FIPS 53):

  • docs/guides/decennial-data.md: 7705281 ✓ (correct 2020 Census figure)
  • docs/guides/spatial.md line 159: 7614893 ✗ (wrong figure)

The spatial.md chart should use 7705281 to match the actual 2020 Census count.
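
A small consistency check could catch this kind of drift between hand-written chart data. The following is a minimal sketch: the dictionaries and the find_mismatches helper are hypothetical, and only the two Washington values come from the files cited above.

```python
# Hypothetical extracts of the inline chart data, keyed by state FIPS code.
decennial_chart = {"53": 7705281}  # value in docs/guides/decennial-data.md
spatial_chart = {"53": 7614893}    # value in docs/guides/spatial.md


def find_mismatches(first: dict, second: dict) -> dict:
    """Return {fips: (first_value, second_value)} for keys whose values differ."""
    shared = first.keys() & second.keys()
    return {k: (first[k], second[k]) for k in sorted(shared) if first[k] != second[k]}


conflicts = find_mismatches(decennial_chart, spatial_chart)
print(conflicts)  # {'53': (7705281, 7614893)}
```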

🟡 Performance: nationwide ZCTA query in geography.md

docs/guides/geography.md line 129: get_acs("zcta", variables="B01003_001", year=2022) fetches all ~33,000 ZCTAs nationwide. No cache_table=True. This will be one of the slowest calls in the docs build, and will be repeated on every rebuild. Consider adding cache_table=True here and in other large exec blocks.

🔵 Fragility: session variable reuse in spatial.md

In docs/guides/spatial.md, four tabbed exec blocks all in session="spatial" each redefine gdf with different data (ACS county, decennial state, estimates state, flows county). Then at line 195, print(gdf.crs) references the last value of gdf in the session—which will be the flows GeoDataFrame (last tab). This works incidentally because all PyPUMS geometry is NAD83, but it's fragile. If the tab order changes or if an earlier block fails and gdf isn't updated, the behavior becomes unpredictable.

Consider naming each result (gdf_acs, gdf_dec, etc.) or using separate sessions per tab.
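
The fragility can be illustrated with a plain dictionary standing in for the shared session namespace. This is a sketch under that assumption, not a model of markdown-exec internals; run_session and the string values are hypothetical, and the third tab simulates a failed API call.

```python
def run_session(blocks):
    """Execute code blocks in order against one shared namespace."""
    namespace = {}
    for source in blocks:
        try:
            exec(source, namespace)
        except RuntimeError:
            continue  # a failed block leaves earlier variables untouched
    return namespace


tabs = [
    "gdf = 'acs county'",
    "gdf = 'decennial state'",
    "raise RuntimeError('simulated API failure')\ngdf = 'estimates state'",
]

# The third tab fails, so a later block silently sees the second tab's gdf.
stale = run_session(tabs)["gdf"]
print(stale)  # decennial state

# Distinct names make each later reference explicit and order-independent.
named = run_session(["gdf_acs = 'acs county'", "gdf_dec = 'decennial state'"])
print(named["gdf_acs"], "|", named["gdf_dec"])  # acs county | decennial state
```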

🔵 Note on # markdown-exec: hide pattern

The pattern in quickstart.md and decennial-data.md of showing an annotated display block followed by a hidden exec block that runs the same code is valid and intentional. This works correctly assuming the underlying APIs return results.

Configuration: all good ✅

  • mkdocs.yml: markdown-exec placed correctly before mkdocstrings; pymdownx.tabbed with alternate_style: true required for source="tabbed-left" is already configured.
  • pyproject.toml: markdown-exec[ansi] is the right extra (includes ANSI color support for rich terminal output).
  • .readthedocs.yaml: --extra spatial correctly added so geometry=True blocks can run.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

claude bot commented Mar 13, 2026

Claude finished @chekos's task in 2m 20s


PR Review: feat(docs): Add markdown-exec for executable code blocks

  • Gather context and read changed files
  • Review configuration changes (mkdocs.yml, pyproject.toml, .readthedocs.yaml)
  • Review converted documentation pages
  • Summarize findings

Overall this is a well-executed PR. Replacing ~370 lines of manually-maintained output blocks with live execution is a clear quality-of-life win. The session-based approach, consistent use of source="tabbed-left", and the hidden-block pattern for annotated code are all sound choices.

Configuration — looks good

  • mkdocs.yml: - markdown-exec cleanly inserted into the plugins list (line 69).
  • pyproject.toml: markdown-exec[ansi] — using the [ansi] extra is the right call to handle colored terminal output.
  • .readthedocs.yaml: Adding --extra spatial is necessary for the geometry=True blocks; correct.
  • uv.lock: Updated as required by CLAUDE.md; 31 lines added for the new dependency.

Issues found

Bug: missing print() in migration-flows.md

docs/guides/migration-flows.md line 68 calls ca_flows.head() without print(). markdown-exec captures stdout, not the REPL return value, so this will silently produce no output:

from pypums import get_flows

ca_flows = get_flows(
    "county",
    state="CA",
    year=2019,
)

ca_flows.head()   # ← no output will appear

All other guide pages use print(...) consistently. This line should be print(ca_flows.head()) or print(ca_flows.head().to_string()).



Risk: get_estimates and get_flows tabs in spatial.md execute at build time

docs/guides/spatial.md lines 62–88 include exec="on" blocks for get_estimates() and get_flows():

gdf = pypums.get_estimates(
    geography="state",
    product="population",
    vintage=2023,
    geometry=True,
)

The PR description explicitly lists get_estimates (404s) among the functions with pre-existing API bugs. get_flows is not on that list, but it also makes a live Census API call at build time. If either call fails during mkdocs build, the build can break on ReadTheDocs. The local mkdocs build --strict passing today does not guarantee it will pass on ReadTheDocs (different API key, network conditions, Census API availability).

Recommendation: Either remove exec="on" from these two tabs and restore static output, or add exec="on" raise_on_error="false" if markdown-exec supports that, or document which tabs are known-broken.
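
Another option, independent of any markdown-exec setting, is to guard the call inside the block itself. The sketch below assumes nothing about pypums or markdown-exec: fetch_or_fallback and broken_fetch are hypothetical names, and broken_fetch simulates the failing endpoint.

```python
def fetch_or_fallback(fetch, fallback_text):
    """Call `fetch`; on any error, print a short notice instead of raising."""
    try:
        return fetch()
    except Exception as exc:  # broad on purpose: any failure should not abort the build
        print(f"{fallback_text} ({type(exc).__name__}: {exc})")
        return None


def broken_fetch():
    # Stand-in for an API call that currently returns an HTTP error.
    raise ConnectionError("404 Not Found")


result = fetch_or_fallback(broken_fetch, "Live output unavailable")
# prints: Live output unavailable (ConnectionError: 404 Not Found)

if result is not None:
    print(result)
```

The trade-off is that a guarded block hides real regressions, so it suits known-flaky endpoints better than the docs as a whole.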


Inconsistent pattern in decennial-data.md

Most blocks in this file use exec="on" source="tabbed-left" session="decennial", but the "Common patterns" section switches to exec="on" session="decennial" (no source="tabbed-left") with a preceding non-executing display block:

# Display block (not exec, has comments for readability)
race = pypums.get_decennial(...)   # (shown to user)

# Hidden execution block (produces output, no source tab)
```python exec="on" session="decennial"
# markdown-exec: hide
race = pypums.get_decennial(...)   # (runs but hidden)
```

This is functional but breaks the visual consistency established everywhere else in the file. Users will see a different tab layout for those sections. Consider applying source="tabbed-left" uniformly and using a comment in the exec block to explain why the code is duplicated, or simply convert these to exec="on" source="tabbed-left" directly.


Minor: income DataFrame fetched but unused in margins-of-error.md

Lines 118–135 of docs/guides/margins-of-error.md make an API call to fetch income with output="wide", but the variable is never used — the significance() call immediately below uses hardcoded values:

income = get_acs(
    "place",
    variables="B19013_001",
    state="CA",
    year=2022,
    output="wide",
)

# income is never referenced below
is_different = significance(85000, 78000, 4000, 5000, clevel=0.90)

This adds unnecessary build time and API calls. The income fetch was probably carried over from a previous version that used real data for the significance example. Either wire income into the calculation (extracting real estimates and MOEs from the DataFrame) or remove the fetch and just use the hardcoded example.
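
For reference, wiring fetched rows into the test could look like the sketch below. It uses the standard approach for comparing two ACS estimates (convert each 90% margin of error to a standard error by dividing by 1.645, then compare the z-score with the critical value). The function is_significant and the row dictionaries are hypothetical and may not match how pypums implements significance(); the numbers are the hardcoded ones from the example above.

```python
import math

# Two-sided critical values for common confidence levels.
Z_CRITICAL = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}


def is_significant(est1, est2, moe1, moe2, clevel=0.90):
    """Return True if two estimates differ significantly at `clevel`.

    Assumes both margins of error are published at the 90% level.
    """
    se1 = moe1 / 1.645
    se2 = moe2 / 1.645
    z = abs(est1 - est2) / math.hypot(se1, se2)
    return z > Z_CRITICAL[clevel]


# Hypothetical rows standing in for two places from the fetched table.
rows = [
    {"NAME": "Place A", "estimate": 85000, "moe": 4000},
    {"NAME": "Place B", "estimate": 78000, "moe": 5000},
]
a, b = rows

different_90 = is_significant(a["estimate"], b["estimate"], a["moe"], b["moe"])
different_95 = is_significant(a["estimate"], b["estimate"], a["moe"], b["moe"], clevel=0.95)
print(different_90, different_95)  # True False
```

With these inputs the z-score is about 1.80, which clears the 90% threshold but not the 95% one.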



Summary

| Severity | Issue | File |
|----------|-------|------|
| Bug | ca_flows.head() missing print() → no output rendered | migration-flows.md:68 |
| Risk | get_estimates/get_flows tabs use exec="on" despite known API bugs | spatial.md:62-88 |
| Style | "Common patterns" section uses different pattern than rest of file | decennial-data.md:411-495 |
| Minor | income DataFrame fetched but never used in significance example | margins-of-error.md:118-135 |

The print() fix is a clear bug. The get_estimates/get_flows risk is worth addressing before merging to avoid a broken ReadTheDocs build. The rest are polish items.
