
Add full reproduction and submission gates #6

Merged
Tuminha merged 3 commits into main from checks/full-reproduction-and-submission-gates
Jun 2, 2026

@Tuminha Tuminha commented Jun 2, 2026

Owner

Summary

  • add script-backed make reproduce, make temporal, make verify-submission, and make reproduce-full workflows
  • add lightweight GitHub Actions submission-readiness CI
  • regenerate canonical result artifacts from the full NHANES reproduction and update README, model card, manuscript, and line-numbered manuscript
  • track publication sensitivity tables with weighted prevalence and subgroup performance

Verification

  • make setup-lock in /tmp/nhanes-publication-repro
  • make reproduce-full in /tmp/nhanes-publication-repro
  • regenerated result artifacts match committed artifacts after ignoring volatile timestamp fields
  • make test
  • make consistency
  • make verify-submission
  • ./venv/bin/python -m compileall -q src scripts tests
  • git diff --check
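
The artifact comparison in the verification list can be sketched as follows. This is a minimal illustration, not the check the PR actually ran: the volatile key names (`timestamp`, `generated_at`) and both function names are assumptions, since the PR does not say which fields were ignored or how.

```python
import json

# Hypothetical key names: the PR only says "volatile timestamp fields".
VOLATILE_KEYS = frozenset({"timestamp", "generated_at"})


def strip_volatile(value, volatile=VOLATILE_KEYS):
    """Return a copy of a parsed JSON value with volatile keys removed at any depth."""
    if isinstance(value, dict):
        return {
            key: strip_volatile(item, volatile)
            for key, item in value.items()
            if key not in volatile
        }
    if isinstance(value, list):
        return [strip_volatile(item, volatile) for item in value]
    return value


def artifacts_match(committed_text, regenerated_text):
    """Compare two JSON documents after dropping the volatile fields."""
    committed = strip_volatile(json.loads(committed_text))
    regenerated = strip_volatile(json.loads(regenerated_text))
    return committed == regenerated
```

Comparing parsed values rather than raw text also makes the check insensitive to key order and whitespace, which tend to vary between runs for reasons unrelated to the results.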

Notes

  • Full NHANES data and logs remain ignored and local.
  • pandoc is not installed locally, so the PDF render check is skipped by make verify-submission.
  • No fork was used.
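
One way a check can skip the PDF render when pandoc is missing is to probe the `PATH` first. The sketch below uses the standard-library `shutil.which`. The function name and return shape are hypothetical and are not taken from `scripts/verify_submission.py`.

```python
import shutil


def pdf_render_check(which=shutil.which):
    """Report whether the PDF render step can run.

    Returns ("skipped", reason) when pandoc is absent and ("ready", path)
    otherwise. The `which` parameter exists only so the lookup can be
    replaced in tests.
    """
    path = which("pandoc")
    if path is None:
        return ("skipped", "pandoc not found on PATH")
    return ("ready", path)
```

Reporting a skip rather than a failure keeps the gate usable on machines without pandoc, at the cost that the render itself goes untested there.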

Summary by CodeRabbit

  • New Features

    • Added streamlined commands for quick submission checks and full end-to-end reproduction.
    • Added automated validation for temporal results, publication sensitivity tables, and supporting output artifacts.
  • Documentation

    • Refreshed the README, model card, and article draft with updated cohort counts, performance metrics, operating thresholds, and reproducibility steps.
    • Added publication sensitivity tables summarizing prevalence and subgroup performance.
  • Chores

    • Added automated submission-readiness checks in continuous integration.


coderabbitai Bot commented Jun 2, 2026


Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5b4fcb24-8d5c-4f86-90e0-0677c8794d84

📥 Commits

Reviewing files that changed from the base of the PR and between 36617cb and 5d68997.

📒 Files selected for processing (30)
  • .github/workflows/submission-readiness.yml
  • MODEL_CARD.md
  • Makefile
  • README.md
  • docs/publication/ARTICLE_DRAFT.md
  • reports/ARTICLE_DRAFT_line_numbered.txt
  • results/decision_curve_external.json
  • results/external_0910_metrics.json
  • results/external_summary.json
  • results/missingness_shift.json
  • results/prevalence_check.json
  • results/publication_sensitivity_tables.json
  • results/publication_sensitivity_tables.md
  • results/v13_featuredrop.json
  • results/v13_nan_ablation.json
  • results/v13_operating_points.json
  • results/v13_primary_norc_summary.json
  • results/v13_secondary_full_summary.json
  • results/v13_shap_summary.json
  • scripts/02_process_nhanes_data.py
  • scripts/04_publication_analyses.py
  • scripts/check_publication_consistency.py
  • scripts/download_nhanes.py
  • scripts/reproduce_v13_primary.py
  • scripts/run_external_validation.sh
  • scripts/run_temporal_validation.py
  • scripts/run_v13_primary.sh
  • scripts/verify_submission.py
  • src/reproduction.py
  • tests/test_reproduction_contract.py

Walkthrough

This PR establishes a reproducible internal benchmark and a submission-readiness framework for the NHANES periodontitis prediction project. It introduces a new reproduction module with calibrated ensemble modeling, temporal validation via Python scripts, and submission-verification checks. It also updates the Makefile to standardize Python invocation and refreshes the documentation and result artifacts with regenerated performance metrics that reflect new cohort sizes and model variants.

Changes

Reproduction and Validation Framework

| Layer | File(s) | Summary |
|---|---|---|
| Reproduction core module and utilities | src/reproduction.py | New module providing feature engineering, modeling-frame construction, calibrated ensemble fitting, cross-validated prediction generation, and metric summarization (ROC-AUC, PR-AUC, Brier, operating points) for reproducible benchmark generation. |
| Internal benchmark reproduction script | scripts/reproduce_v13_primary.py, tests/test_reproduction_contract.py | Script loads processed NHANES, builds modeling frame, and runs cross-validated predictions across multiple feature sets to generate internal benchmark artifacts; test contract validates feature counts and engineered features. |
| Temporal validation and wrapper updates | scripts/run_temporal_validation.py, scripts/run_external_validation.sh, scripts/run_v13_primary.sh | New script performs same-source temporal validation with bootstrap CIs; bash wrappers refactored to invoke Python scripts instead of notebooks. |
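
To illustrate what a bootstrap confidence interval for ROC-AUC involves, here is a dependency-free sketch of a percentile bootstrap. It is an assumption-laden illustration, not the code in `scripts/run_temporal_validation.py`: the function names, the percentile method, and the defaults for `n_boot`, `alpha`, and `seed` are all hypothetical.

```python
import random


def roc_auc(y_true, y_score):
    """ROC-AUC as the probability that a positive outranks a negative.

    Ties count as half a win. This is O(n_pos * n_neg), which is fine
    for a sketch but slow for large cohorts.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Return (point_estimate, lower, upper) using a percentile bootstrap."""
    point = roc_auc(y_true, y_score)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        labels = [y_true[i] for i in idx]
        if len(set(labels)) < 2:
            continue  # a resample with a single class has no defined AUC
        stats.append(roc_auc(labels, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper
```

Fixing the seed makes the interval reproducible across runs, which matters when regenerated artifacts are compared against committed ones.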

Submission-Readiness Verification and CI/CD

| Layer | File(s) | Summary |
|---|---|---|
| Verification script and checks | scripts/verify_submission.py | New script performs JSON/YAML parsing, publication wording validation, temporal metric shape checking, NHANES URL reachability, and pandoc availability detection. |
| CI workflow and consistency checker updates | .github/workflows/submission-readiness.yml, scripts/check_publication_consistency.py | GitHub Actions workflow runs lightweight checks on push/PR; consistency checker refactored to dynamically extract canonical metrics from result artifacts and validate presence in publication docs. |
| Makefile verification targets | Makefile | New targets: verify-submission chains make test + consistency + submission scripts; reproduce-full orchestrates end-to-end pipeline with timestamped logging. |
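
The consistency check described above (read canonical metrics from a result artifact, then confirm they appear in the publication docs) might look roughly like this. The sketch is hypothetical: the function name, the flat JSON layout, and the three-decimal formatting are assumptions, and the real `scripts/check_publication_consistency.py` may work differently.

```python
import json


def missing_metrics(artifact_text, doc_text, keys, decimals=3):
    """Return the metric keys whose formatted value is absent from doc_text.

    Assumes a flat JSON object of numeric metrics (hypothetical layout).
    """
    metrics = json.loads(artifact_text)
    missing = []
    for key in keys:
        formatted = f"{metrics[key]:.{decimals}f}"
        if formatted not in doc_text:
            missing.append(key)
    return missing


# Made-up values for illustration only; these are not results from the PR.
artifact = '{"roc_auc": 0.71234, "pr_auc": 0.5}'
doc = "Internal ROC-AUC was 0.712."
print(missing_metrics(artifact, doc, ["roc_auc", "pr_auc"]))
```

Deriving the expected strings from the artifacts, rather than hardcoding them in the checker, means that regenerating results updates the expectations automatically and leaves only the documents to bring in line.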

Workflow Standardization and Pipeline Updates

| Layer | File(s) | Summary |
|---|---|---|
| Makefile SHELL and PYTHON standardization | Makefile | New variables SHELL := /bin/bash and PYTHON ?= ./venv/bin/python; updated targets to reference $(PYTHON) instead of hardcoded paths; added new targets to .PHONY. |
| Updated build targets | Makefile | Targets test, consistency, download, process, train, reproduce, temporal, and manuscript now invoke scripts via $(PYTHON). |
| Data processing and framework integration | scripts/02_process_nhanes_data.py, scripts/04_publication_analyses.py, scripts/download_nhanes.py | Data processing imports and applies build_modeling_frame, expands variable mappings (height, mobile teeth), saves modeling frame alongside combined data; error message and docstring updates. |

Documentation and Result Artifacts Update

| Layer | File(s) | Summary |
|---|---|---|
| Model card and README updates | MODEL_CARD.md, README.md | Revised development/temporal cohort sizes, new internal/temporal AUC-ROC/PR-AUC values, updated operating-point tables, reproducibility commands (make verify-submission, make reproduce-full). |
| Article draft regeneration | docs/publication/ARTICLE_DRAFT.md, reports/ARTICLE_DRAFT_line_numbered.txt | Updated cohort sizes, discrimination metrics, operating-point sensitivities/specificities, results narratives with survey-weighted prevalence and subgroup analysis. |
| Internal and temporal validation artifacts | results/v13_*.json, results/external_*.json, results/decision_curve_external.json, results/missingness_shift.json, results/prevalence_check.json | Regenerated metric files with updated performance statistics, operating points, feature importance, and diagnostic summaries. |
| Publication sensitivity tables | results/publication_sensitivity_tables.json, results/publication_sensitivity_tables.md | New artifacts containing prevalence by cycle, subgroup performance metrics, and missingness summaries for publication support. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A framework born from NHANES data deep,
With calibrated ensembles that validate and leap,
Bootstrap curves dance through temporal skies,
While submission checks ensure nothing denies—
Reproducible benchmarks for all to deploy! 🎯


@Tuminha Tuminha merged commit 43cbe3c into main Jun 2, 2026
2 checks passed
@Tuminha Tuminha deleted the checks/full-reproduction-and-submission-gates branch June 2, 2026 08:32