AINPI

Experimental explorer for the CMS National Provider Directory (NPD) public use files.

Work in progress. AINPI is research/educational. Data may be incomplete, stale, or incorrect. Every number should be verified against primary sources before any business or clinical decision. See the /insights page for a full provenance analysis.

What it does

CMS released the National Provider Directory as FHIR R4 NDJSON public use files from directory.cms.gov: 21.7M records across 6 resource types (Practitioner, PractitionerRole, Organization, OrganizationAffiliation, Location, Endpoint) in the 2026-05-08 (May) release. AINPI:

Ingests the full dataset into Google BigQuery, then runs ~30 pre-registered findings (H1–H39) against both the directory itself and federal claims/payment data (Medicaid, Medicare Part B + Part D, NPPES-deactivated × billing, Open Payments, DMEPOS, nursing-home ownership disclosures)
Serves an interactive US choropleth at / with a 3-style theme switcher, plus per-state CMO-facing pages built for the state Medicaid Director-letter response window
Cross-audits federal exclusion lists (OIG LEIE + SAM.gov) against the directory, against MMIS-flagged providers, and against federal claims data — closing 3 of the 4 federal database checks the 2026-04-23 CMS State Medicaid Director letter requires (NPPES + LEIE + SAM; SSA-DMF stays restricted)
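
As a rough illustration of the NDJSON format the release uses (one FHIR resource per line), the sketch below tallies resources by `resourceType`. It is not AINPI's ingest code. The function name and the three sample records are made up for the example, and a real file would first need to be decompressed.

```python
import io
import json
from collections import Counter
from typing import Iterable


def count_resource_types(lines: Iterable[str]) -> Counter:
    """Count FHIR resources by resourceType in an NDJSON stream."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:  # tolerate blank lines between records
            continue
        resource = json.loads(line)
        counts[resource.get("resourceType", "<missing>")] += 1
    return counts


# Hypothetical sample standing in for a decompressed NPD file.
sample = io.StringIO(
    '{"resourceType": "Practitioner", "id": "p1"}\n'
    '{"resourceType": "Practitioner", "id": "p2"}\n'
    '{"resourceType": "Endpoint", "id": "e1"}\n'
)
counts = count_resource_types(sample)
print(dict(counts))
```

Because the function only iterates over lines, the same code works on an open file handle without loading the whole file into memory.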

What's new

2026-05-18 · PECOS-as-authoritative workstream (H37–H39). CMS designated PECOS as authoritative for Medicare enrollment. State Medicaid systems must demonstrate alignment under the 2026 verification rules. AINPI pre-registered three findings: PECOS PROVIDER_TYPE vs NPPES NUCC taxonomy disagreement (H37), the behavioral-health subset (H38, highest recoupment risk), and multi-state-enrollment NPIs with conflicting addresses (H39). See /pecos.
2026-05-17 · Map-first homepage. / is now an interactive US choropleth with 3 selectable styles (Light cards / Dark dashboard / Minimal map). Click a state for an inline side panel with the 5 claims-side findings, CSV download, and primary-source NPI verification.
2026-05-15 · For state Medicaid CMOs. New /for-state-medicaid/<state> per-state pages built for the state Medicaid CMO listserv audience. Count-and-action lede, no H-numbers, citation-ready for SMD-letter Elements 2 + 4.
2026-05-14 · Claims-side cross-audit shipped for all 50 states + DC + PR. H29–H36 cover Medicaid spending, Medicare Part B / Part D, NPPES-deactivated × billing, Open Payments, DMEPOS, nursing-home ownership (Stage B via the CMS PPEF cross-walk), and NDH completeness. NDH completeness against material Medicare Part B billers is 99.99984%.
2026-05-08 · NDH May release ingested. First release-to-release deltas published: Endpoint −73%, Location −61%, OrganizationAffiliation +147% vs April. AINPI caught and patched two source-side schema breaks.
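
The release-to-release deltas above are plain percent changes in per-resource record counts. A minimal sketch of that arithmetic follows. The counts are invented round numbers chosen only to reproduce the published percentages; they are not the real April or May counts.

```python
def percent_change(previous: int, current: int) -> float:
    """Percent change from the previous release count to the current one."""
    if previous <= 0:
        raise ValueError("previous count must be positive")
    return (current - previous) / previous * 100.0


# Hypothetical counts, NOT the actual NPD record counts.
april = {"Endpoint": 1000, "Location": 1000, "OrganizationAffiliation": 1000}
may = {"Endpoint": 270, "Location": 390, "OrganizationAffiliation": 2470}

deltas = {name: round(percent_change(april[name], may[name])) for name in april}
for name, delta in deltas.items():
    print(f"{name}: {delta:+d}%")
```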

Pages

Path	What it is
`/`	Map-first homepage. Click any state for inline findings. 3-style theme switcher.
`/for-state-medicaid`	Index of per-state CMO-facing pages. Forwardable for the SMD-letter response window.
`/findings`	Pre-registered findings (H1–H39). Each finding states its null hypothesis and denominator before any numbers are published.
`/methodology`	Versioned audit methodology — DAMA DMBOK mapping, L0–L7 scoring, reproducibility commands
`/pecos`	PECOS-as-authoritative-source brief — implications of the 2026 verification rules
`/smd-revalidation`	Citation-ready language for the 2026-05-23 SMD-letter response (Elements 1–5)
`/data-quality`	D3 dashboard: choropleth, sankey, knowledge graph, drill-down, validation
`/insights`	Provenance + variance analysis (NPD vs published org numbers)
`/provider-search`	Real-time search against live payer FHIR directories
`/magic-scanner`	AI-augmented provider discovery
`/npd`	Public search by NPI, name, organization, state, city

Public URL contract

Static JSON, CDN-cached, safe to depend on across releases:

/api/v1/stats.json — site-wide counters, methodology version, commit SHA
/api/v1/findings/<slug>.json — one per finding (types)

Breaking changes bump the path (/api/v2/) rather than changing the response shape in place.
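
A client can lean on this contract by pinning the version in the path. The sketch below builds the two documented URLs and fetches them with the standard library. The base URL and the slug in the usage comment are placeholders, not real values, and the response fields are not assumed here beyond "it is JSON".

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Placeholder host: substitute the site's real origin.
BASE_URL = "https://example.invalid"


def stats_url(base_url: str, version: int = 1) -> str:
    """URL of the site-wide stats document for a given contract version."""
    return f"{base_url.rstrip('/')}/api/v{version}/stats.json"


def finding_url(base_url: str, slug: str, version: int = 1) -> str:
    """URL of one finding document; the slug is percent-encoded defensively."""
    return f"{base_url.rstrip('/')}/api/v{version}/findings/{quote(slug, safe='')}.json"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """GET a URL and decode the body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Usage (needs network access and a real base URL; the slug is hypothetical):
#   stats = fetch_json(stats_url(BASE_URL))
#   finding = fetch_json(finding_url(BASE_URL, "some-finding-slug"))
```

Keeping the version as a parameter means a move to /api/v2/ is a one-argument change on the client side.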

Roadmap

Public roadmap lives in GitHub Issues tagged roadmap, grouped into three milestones:

Milestone	Scope
`v1.1`	Data refinements on existing findings — phonetic name match, dual-board atlas, per-state drill-downs, methodology v1.0 prose
`v1.2`	New findings — H16 address geocoding, H17 USPS drift, H19/H20 state scale, weekly endpoint-crawl host
`v2.0`	Expansion beyond NPD's current 6 resources — blocked until CMS ships InsurancePlan / HealthcareService / Network / Verification

Contributions welcome on any issue. File a new one using the issue templates.

Sibling repositories

Repo	Scope
`FHIR-IQ/ainpi-probe`	FHIR endpoint liveness crawler (L0–L7). Runs separately from the site so operators can audit the code that hits their endpoints.
`FHIR-IQ/ainpi-examples`	Python + DuckDB usage examples for the `/api/v1/*` contract.

What's in this repo

frontend/          Next.js 14 app — routes, API, charts, tests
pipeline/          DuckDB-over-Parquet validation pipeline (shard, edges, NPI Luhn, temporal)
docs/methodology/  Versioned methodology doc, rendered at /methodology
.github/           CI, CodeQL, dependabot, issue + PR templates, release workflow
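
For readers unfamiliar with the "NPI Luhn" check mentioned for pipeline/, here is a minimal sketch of the standard NPI check-digit rule: prefix the 10-digit NPI with 80840 and run the Luhn algorithm over the 15 digits. This illustrates the public rule only; the function name is invented and the pipeline's actual implementation may differ.

```python
def is_valid_npi(npi: str) -> bool:
    """Return True if npi is 10 ASCII digits with a correct Luhn check digit."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    # NPIs are checked with the card-issuer prefix 80840 prepended.
    digits = [int(ch) for ch in "80840" + npi]
    total = 0
    # Walk right to left, doubling every second digit (Luhn).
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_npi("1234567893"))  # commonly cited example NPI
print(is_valid_npi("1234567890"))  # same number with a wrong check digit
```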

Architecture

       ┌────────────────────────────────┐
       │ directory.cms.gov              │
       │ 6 NDJSON.zst files, 2.8 GB     │
       └──────────────┬─────────────────┘
                      │ scripts/ingest-cms-npd.ts
                      ▼
       ┌────────────────────────────────┐
       │ BigQuery (cms_npd dataset)     │
       │ resource:JSON + _* flat fields │
       │ 27.2M rows + 5 analytics views │
       └──────┬─────────────────────┬───┘
              │                     │
  live query  │                     │ scripts/sync-bq-to-supabase.ts
              │                     │  (nightly aggregation)
              ▼                     ▼
       ┌──────────────┐     ┌──────────────────┐
       │ Next.js API  │     │ Supabase Postgres│
       │ routes       │◄────┤ Prisma ORM       │
       │ on Vercel    │     │ pre-agg metrics  │
       └──────┬───────┘     │ user auth        │
              │             └──────────────────┘
              ▼
       ┌────────────────────────────────┐
       │ React + D3 dashboard           │
       │ FilterContext cross-filtering  │
       └────────────────────────────────┘

Why this split: BigQuery costs <$1/mo to hold 40 GB of FHIR JSON and gives free-tier-friendly analytics. Supabase is where the app's hot-path queries and auth data live. Pre-aggregations are synced nightly so the dashboard doesn't hit BigQuery on every page load.
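
The storage figure can be sanity-checked with simple arithmetic. The sketch below assumes a list price of roughly $0.02 per GB per month for active BigQuery storage. That rate is an assumption for illustration and should be checked against current Google Cloud pricing; any free storage allowance would lower the result further.

```python
# Assumed rate, not taken from this README: verify against current pricing.
ASSUMED_PRICE_PER_GB_MONTH = 0.02


def monthly_storage_cost(
    stored_gb: float,
    price_per_gb_month: float = ASSUMED_PRICE_PER_GB_MONTH,
    free_gb: float = 0.0,
) -> float:
    """Estimated monthly storage cost in USD after any free allowance."""
    billable_gb = max(stored_gb - free_gb, 0.0)
    return billable_gb * price_per_gb_month


cost = monthly_storage_cost(40)
print(f"~${cost:.2f}/month for 40 GB")
```

Query costs are separate from storage and depend on how much data each query scans, which is one reason the dashboard reads pre-aggregated metrics instead.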

Quickstart

cd frontend
npm install
cp .env.example .env.local   # fill in Supabase + GCP values
npm run db:push              # push Prisma schema to Supabase
npm run dev                  # http://localhost:3000

To reload the NPD warehouse (only needed when CMS publishes a new release):

npm run bq:setup     # Create dataset + tables + views (idempotent)
npm run bq:ingest    # Download from directory.cms.gov, stream into BigQuery
npm run bq:sync      # Aggregate BigQuery → Supabase metrics

Testing

npm run test         # Vitest — 62 unit tests
npm run test:e2e     # Playwright — 15 E2E specs

Covers FHIR reference extraction, API parameter parsing, validation contract, filter context hierarchy, NPI/URL regex, BigQuery schema, dashboard dropdown interactions, and search.
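
To make "FHIR reference extraction" concrete, the sketch below splits a FHIR literal reference such as `Practitioner/123` into its resource type and id. It is a Python illustration, not the Vitest code in frontend/. The regex and function name are invented for the example, and it only handles relative and absolute literal references, with an optional `_history` suffix.

```python
import re
from typing import Optional, Tuple

# Resource type (capitalised name) followed by a FHIR id (1-64 chars of
# letters, digits, '-' or '.'), optionally followed by /_history/<version>.
_REFERENCE = re.compile(
    r"(?:^|/)(?P<type>[A-Z][A-Za-z]+)/(?P<id>[A-Za-z0-9\-.]{1,64})"
    r"(?:/_history/[A-Za-z0-9\-.]{1,64})?$"
)


def parse_reference(reference: str) -> Optional[Tuple[str, str]]:
    """Return (resource_type, id) for a literal reference, else None."""
    match = _REFERENCE.search(reference)
    if match is None:
        return None
    return match.group("type"), match.group("id")


print(parse_reference("Practitioner/123"))
print(parse_reference("https://example.invalid/fhir/Organization/org-9"))
print(parse_reference("urn:uuid:not-a-literal-reference"))
```

Returning None for unparseable input lets a caller count unresolved references separately from broken ones.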

Documentation

CLAUDE.md — Architecture + developer reference
DATABASE_SETUP.md — Supabase + Prisma + BigQuery setup walkthrough

Key references

CMS National Provider Directory
HTE Data Release Specifications
NDH FHIR IG STU1 v1.0.0 (published) · STU2 CI build (tracked for upcoming changes; not authoritative)
