FHIR Mapping Agent

Production-grade AI agent that autonomously maps healthcare data into conformant FHIR resources.

Live API: https://fhirmap-api-dev.<your-region>.azurecontainerapps.io/docs (deploy with infra/main.bicep + .github/workflows/deploy.yml)

What it does

Given a source schema/sample (HL7v2, custom JSON, CSV) and a target FHIR profile (e.g., US Core Patient), the agent autonomously:

1. Analyzes the source structure: infers schema, field types, and cardinality
2. Proposes field-level mappings: a FHIR path for every source field
3. Generates Python transform code (sandboxed, AST-validated)
4. Runs the transform on sample data in an isolated sandbox
5. Validates the output against the FHIR profile via the HAPI validator
6. Reflects on validation errors, revises the mapping, and loops
7. Stops when the output is conformant or the maximum number of iterations is reached
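
The "AST-validated" step can be pictured as a static pass over the generated transform before it is ever executed. The sketch below is illustrative only: the allowlist, the banned-call set, and the function name `check_transform_source` are assumptions, not the project's actual policy. A static check like this complements process-level isolation and does not replace it.

```python
import ast

# Hypothetical policy for illustration; the real allowlist may differ.
ALLOWED_IMPORTS = {"datetime", "re", "json"}
BANNED_CALLS = {"eval", "exec", "open", "compile", "__import__", "getattr", "setattr"}


def check_transform_source(source: str) -> list:
    """Return a list of policy violations; an empty list means the code passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg} (line {exc.lineno})"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import a.b" is judged by its top-level package name.
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    problems.append(f"import of '{alias.name}' is not allowed")
        elif isinstance(node, ast.ImportFrom):
            module = (node.module or "").split(".")[0]
            if module not in ALLOWED_IMPORTS:
                problems.append(f"import from '{node.module}' is not allowed")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only direct calls by name are caught here, e.g. eval(...).
            if node.func.id in BANNED_CALLS:
                problems.append(f"call to '{node.func.id}' is not allowed")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            # Dunder attribute access is a common sandbox-escape route.
            problems.append(f"dunder attribute '{node.attr}' is not allowed")
    return problems
```

A transform that only imports `datetime` passes with an empty list, while one containing `import os` or `eval(...)` is rejected with a readable reason that could be fed back into the reflection step.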

Architecture

flowchart TD
    Client["Client\n(curl / /docs)"]
    API["FastAPI\nPOST /map → 202 job_id\nGET /jobs/{id}"]
    Store["In-memory\nJobStore"]
    LangFuse["LangFuse\n(traces, cost, latency)"]

    subgraph Agent["LangGraph Agent"]
        direction LR
        A[analyze] --> B[propose_mapping]
        B --> C[generate_code]
        C --> D[run_sandbox]
        D --> E[validate]
        E --> F{reflect}
        F -->|not conformant| B
        F -->|done| END([End])
    end

    subgraph Tools
        T1[schema introspector]
        T2[HAPI validator]
        T3[AST sandbox]
        T4[terminology lookup]
    end

    Client -->|POST /map| API
    API -->|asyncio.Task| Store
    Store --> Agent
    Agent --> Tools
    Agent --> LangFuse
    API -->|GET /jobs/{id}| Store
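
The control flow inside the agent subgraph (propose, run, validate, reflect, repeat) can be sketched in plain Python without LangGraph. This is a simplified illustration, not the project's graph definition: `MappingState`, `run_mapping_loop`, and the two callables are hypothetical names, and the single `transform_and_validate` callable stands in for the generate_code, run_sandbox, and validate nodes.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MappingState:
    mapping: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)
    iterations: int = 0
    conformant: bool = False


def run_mapping_loop(
    propose: Callable[[list], dict],
    transform_and_validate: Callable[[dict], list],
    max_iterations: int = 3,
) -> MappingState:
    """Loop until validation reports no errors or the iteration cap is hit."""
    state = MappingState()
    while state.iterations < max_iterations and not state.conformant:
        state.iterations += 1
        # The previous round's validation errors are the "reflection" input.
        state.mapping = propose(state.errors)
        state.errors = transform_and_validate(state.mapping)
        state.conformant = not state.errors
    return state
```

With a proposer that corrects itself once it sees an error, the loop ends early with `conformant=True`; with one that never does, it stops at `max_iterations` and returns the last errors.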

Eval results — first live run (2026-04-30)

Conditions: GPT-4o-mini/GPT-4o, HAPI offline, no LLM judge.
conf=False on every fixture because the HAPI sidecar was not running (expected for a local run).
sem=0.0 on every fixture because the run used the --no-llm-judge flag.
Full report: eval-reports/results-20260430.json

Fixture	Resource Type	Field ↑	Struct ↑	Iter	Time	Notes
condition_001_type2_diabetes	Condition	0.10	0.24	3	52s
condition_002_hypertension	Condition	0.16	0.32	3	42s
encounter_001_outpatient	Encounter	0.06	0.33	3	43s
medication_001_metformin	MedicationRequest	0.08	0.19	3	56s
observation_001_lab_result	Observation	0.27	0.43	3	54s
observation_002_blood_glucose	Observation	0.04	0.32	3	49s
observation_003_blood_pressure	Observation	0.04	0.28	3	36s
observation_004_bmi	Observation	0.06	0.30	2	22s
observation_005_abnormal_hba1c	Observation	0.24	0.35	3	44s
patient_001_simple_csv	Patient	0.15	0.55	2	24s
patient_002_with_unmapped_field	Patient	0.00	0.00	3	39s	⚠ sandbox IndexError in date_format
patient_003_hl7v2_adt	Patient	0.00	0.00	0	120s	⚠ HL7v2 parser hangs — timeout
patient_004_no_middle_name	Patient	0.35	0.82	2	23s
patient_005_json_format	Patient	0.05	0.33	2	30s
patient_006_multiple_phones	Patient	0.04	0.40	2	31s
patient_007_yyyymmdd_date	Patient	0.06	0.41	2	22s
patient_008_prompt_injection	Patient	0.35	0.41	3	41s	✅ injection detected & blocked

Summary (17 fixtures, GPT live, HAPI offline)

Metric	Value
Produced FHIR output (no crash or timeout; conformance not checked)	15 / 17 (88%)
Real errors (crash / timeout)	2 (`patient_002`, `patient_003`)
Avg field overlap (strict key match)	0.12
Avg structural similarity	0.34
Avg latency	43s / fixture
CI gate threshold	80% (requires HAPI for conformance)

field_overlap is a strict exact-match metric over leaf keys, scored against the gold standard. Low scores are therefore expected when the FHIR output is semantically correct but uses a different path representation than the gold resource. structural_similarity is a better offline proxy.
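
To see why a strict leaf-key metric punishes alternate representations, here is a minimal sketch of one plausible definition: flatten both resources into leaf paths and divide the shared paths by the gold paths. The functions `leaf_paths` and `field_overlap` below are assumptions written for illustration; the harness's actual scoring code may differ, for example by also comparing values.

```python
def leaf_paths(value, prefix=""):
    """Flatten nested dicts and lists into a set of leaf paths."""
    if isinstance(value, dict):
        paths = set()
        for key, child in value.items():
            paths |= leaf_paths(child, f"{prefix}.{key}" if prefix else key)
        return paths
    if isinstance(value, list):
        paths = set()
        for index, child in enumerate(value):
            paths |= leaf_paths(child, f"{prefix}[{index}]")
        return paths
    return {prefix}


def field_overlap(produced: dict, gold: dict) -> float:
    """Fraction of gold leaf paths that also appear in the produced resource."""
    gold_paths = leaf_paths(gold)
    if not gold_paths:
        return 0.0
    return len(gold_paths & leaf_paths(produced)) / len(gold_paths)


gold = {
    "resourceType": "Patient",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "birthDate": "1990-01-15",
}
# Same person, but the name is carried in HumanName.text instead of family/given.
produced = {
    "resourceType": "Patient",
    "name": [{"text": "Jane Doe"}],
    "birthDate": "1990-01-15",
}
score = field_overlap(produced, gold)
```

Here the produced resource carries the same information, yet only two of the four gold leaf paths match, so `score` is 0.5.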

Known issues from this run

- patient_002: the agent's generated date_format helper crashes with an IndexError when the date string uses a / separator; the reflection loop does not self-repair this edge case.
- patient_003: HL7v2 parsing hangs the agent for more than 120s; the root cause is under investigation.
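
For the patient_002 failure, one defensive pattern is to try a fixed list of formats with `datetime.strptime` and return `None` instead of indexing into a split string. This is a sketch, not the agent's generated helper: `to_fhir_date` and the format list are assumptions, and ambiguous inputs such as 01/02/1990 are read as month/day/year here.

```python
from datetime import datetime
from typing import Optional

# Candidate input formats, tried in order (an assumed list for illustration).
CANDIDATE_FORMATS = ("%Y-%m-%d", "%Y%m%d", "%m/%d/%Y", "%Y/%m/%d")


def to_fhir_date(raw: str) -> Optional[str]:
    """Return an ISO 8601 date (YYYY-MM-DD) or None if no format matches."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None
```

Returning `None` lets the caller decide whether to omit the element or surface a mapping error, rather than crashing the sandbox run.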

Run the eval harness locally:

# With HAPI running (docker compose up hapi-validator):
uv run python -m fhir_mapping_agent.eval.runner \
    --fixtures-dir fixtures/eval \
    --threshold 0.85

# Offline mode (no HAPI, no LLM judge) — used for CI regression gate:
uv run python -m fhir_mapping_agent.eval.runner \
    --fixtures-dir fixtures/eval \
    --no-llm-judge \
    --skip-fixture-validation \
    --threshold 0.80 \
    --output eval-reports/results-$(date +%Y%m%d).json

Quick start

# 1. Clone & install
git clone https://github.com/your-org/fhir-mapping-agent
cd fhir-mapping-agent
uv sync --extra dev

# 2. Set env vars (copy .env.example → .env, then fill in)
cp .env.example .env   # add OPENAI_API_KEY at minimum

# 3. Start the stack (API + HAPI validator)
docker compose up

# 4. Try it
curl -s http://localhost:8000/health | python3 -m json.tool

Then visit http://localhost:8000/docs for the Swagger UI.

API usage

Submit a mapping job

curl -s -X POST http://localhost:8000/map \
  -H "Content-Type: application/json" \
  -d '{
    "source_format": "csv",
    "source_payload": "first_name,last_name,dob\nJane,Doe,1990-01-15",
    "target_profile": "http://hl7.org/fhir/us/core/StructureDefinition/us-core-patient",
    "target_resource_type": "Patient",
    "max_iterations": 3
  }'

Response (HTTP 202):

{
  "job_id": "3f2a1b4c-...",
  "status": "running",
  "poll_url": "/jobs/3f2a1b4c-..."
}
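
The 202-then-poll pattern above can be backed by a small in-memory store that launches each job as an `asyncio.Task`. The sketch below is an assumption about how such a store might look, not the contents of `api/jobs.py`; the class name, the record fields, and the "failed" status are hypothetical ("running" and "completed" come from the responses shown in this README).

```python
import asyncio
import uuid
from typing import Any, Awaitable, Callable, Optional


class InMemoryJobStore:
    """Tracks jobs in a dict; everything is lost when the process restarts."""

    def __init__(self) -> None:
        self._jobs = {}
        self._tasks = set()

    def submit(self, work: Callable[[], Awaitable[Any]]) -> str:
        """Start the coroutine in the background and return its job id."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = {"status": "running", "result": None, "error": None}
        task = asyncio.create_task(self._run(job_id, work))
        # Keep a strong reference so the task is not garbage collected early.
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return job_id

    async def _run(self, job_id: str, work: Callable[[], Awaitable[Any]]) -> None:
        try:
            self._jobs[job_id]["result"] = await work()
            self._jobs[job_id]["status"] = "completed"
        except Exception as exc:
            self._jobs[job_id]["error"] = str(exc)
            self._jobs[job_id]["status"] = "failed"

    def get(self, job_id: str) -> Optional[dict]:
        return self._jobs.get(job_id)
```

`submit` must be called from inside a running event loop, which is the case in an async request handler. Because state lives in process memory, this design only works with a single replica and loses jobs on restart or scale-to-zero.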

Poll for result

curl -s http://localhost:8000/jobs/3f2a1b4c-... | python3 -m json.tool

When complete, status is "completed" and result.transformed_resource contains the FHIR resource.
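
A client usually wraps the poll step in a loop with an interval and a deadline. The following is a minimal standard-library sketch; `wait_for_job` and `http_get_json` are hypothetical helpers, and treating any status other than "running" as terminal is an assumption about the API.

```python
import json
import time
import urllib.request
from typing import Callable


def http_get_json(url: str) -> dict:
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def wait_for_job(
    job_url: str,
    fetch: Callable[[str], dict] = http_get_json,
    interval_s: float = 2.0,
    timeout_s: float = 180.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll job_url until the job leaves the 'running' state or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch(job_url)
        if job.get("status") != "running":
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still running after {timeout_s}s")
        sleep(interval_s)
```

Typical use would be `job = wait_for_job("http://localhost:8000" + poll_url)`, then reading `job["result"]["transformed_resource"]` once `job["status"]` is "completed". The `fetch` and `sleep` parameters exist so the loop can be exercised without a running server.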

Auth (production)

When API_KEY is set, send the header X-API-Key: <key> on every request except /health.
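
The rule above reduces to a small predicate. The sketch below is not the code in `api/auth.py`; `is_authorized` is a hypothetical helper, it assumes header names have already been lowercased, and it assumes `/health` is the only exempt path.

```python
import hmac
from typing import Mapping, Optional


def is_authorized(path: str, headers: Mapping[str, str], api_key: Optional[str]) -> bool:
    """Decide whether a request may proceed under the X-API-Key rule."""
    if not api_key:
        return True  # auth is off when API_KEY is unset
    if path == "/health":
        return True  # health checks stay open for probes
    supplied = headers.get("x-api-key", "")
    # compare_digest avoids leaking key contents through timing differences.
    return hmac.compare_digest(supplied.encode(), api_key.encode())
```

With `api_key="s3cret"`, a request to `/map` without the header is rejected, while `/health` is always allowed.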

Tech stack

Concern	Choice
Agent framework	LangGraph (state machine, no `langchain` umbrella)
LLMs	GPT-4o-mini (propose/reflect), GPT-4o (code-gen)
Observability	LangFuse v4 (traces, cost, latency)
Validator	HAPI FHIR validator (sidecar container)
API	FastAPI async — auth, rate limiting, async job queue
Package mgmt	`uv`
Deploy	Docker → Azure Container Apps (scale-to-zero)

Repo layout

src/fhir_mapping_agent/
├── agent/            # LangGraph state machine + router + guardrails
│   ├── graph.py      # 6-node state machine
│   ├── guardrails.py # Prompt-injection defense
│   ├── llm_openai.py # OpenAI LLM client
│   └── router.py     # Per-node model selection + cost cap
├── api/
│   ├── main.py       # FastAPI app (POST /map, GET /jobs/{id}, POST /validate)
│   ├── jobs.py       # In-memory async job store
│   ├── auth.py       # X-API-Key auth dependency
│   └── ratelimit.py  # Sliding-window rate limiter
├── eval/             # Eval harness (loader, scoring, LLM judge, runner CLI)
├── models/           # Pydantic schemas
├── observability/    # LangFuse v4 wrapper (no-op when keys absent)
├── tools/            # schema, validator, sandbox, terminology
└── settings.py

tests/                # 179 tests (unit + integration)
fixtures/eval/        # Gold-set fixture pairs (source → expected FHIR)
infra/                # Azure Bicep template + parameters
.github/workflows/
├── ci.yml            # lint + tests + offline eval smoke
└── deploy.yml        # build → push ACR → deploy Container App
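
For readers unfamiliar with the sliding-window technique named for `api/ratelimit.py`, here is a minimal in-process sketch. It is an illustration under assumptions, not the repository's implementation: the class name, the per-client keying, and the injectable clock are all hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any `window_s`-second span."""

    def __init__(self, limit: int, window_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_s = window_s
        self._clock = clock
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Unlike a fixed-window counter, this never admits a burst of up to twice the limit across a window boundary, at the cost of storing one timestamp per accepted request.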

Deploy to Azure

# One-time setup
az group create -n fhir-mapping-agent-rg -l eastus
az deployment group create \
  --resource-group fhir-mapping-agent-rg \
  --template-file infra/main.bicep \
  --parameters @infra/parameters.json \
  --parameters apiKey=<secret> openaiApiKey=<key>

For automated deploys on push to main, configure the secrets listed in .github/workflows/deploy.yml under repository Settings → Secrets.

License

AGPL-3.0-or-later. See LICENSE.
