Version 0.9.2
Model once. Reuse everywhere.
MD-DDL is a Markdown-native standard for defining what data means, where it comes from, and how it is governed — then generating physical artifacts from a single source of truth that humans and AI agents share.
MD-DDL is: AI-native · Human-friendly · Version-controlled · Semantically rich · Ready for automation
Read the spec: start with 1-Foundation.md, or use MD-DDL-Complete.md for single-file AI context.
- Domain layer — domains, entities, enums, relationships, events, and constraints
- Source layer — source system declarations and column-level transformation rules (direct, derived, conditional, lookup, reconciliation, aggregation)
- Data products — source-aligned, domain-aligned, and consumer-aligned products declaring scope, shape, consumers, SLA, governance, and masking — driving automated artifact generation
- Governance — classification, PII, retention, regulatory scope, access roles, and masking strategies living with the model, not in a separate system
- Physical artifacts — dimensional star schemas, normalized 3NF DDL, wide-column schemas, knowledge graph (Cypher), JSON Schema, Parquet contracts
Start a new project using the bootstrap script — it sets up git, adds MD-DDL as a submodule, and installs the agent wrappers for your AI tool in one step.
Bash (macOS / Linux / WSL):
bash <(curl -fsSL https://raw.githubusercontent.com/Semprini/md-ddl/main/start-project.sh)
PowerShell (Windows):
Invoke-Expression (Invoke-WebRequest https://raw.githubusercontent.com/Semprini/md-ddl/main/start-project.ps1).Content
Or download start-project.sh / start-project.ps1 and run them locally.
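For readers who would rather inspect the bootstrap script before executing it, the download-and-run route can be sketched as below. The helper name `fetch_bootstrap` is hypothetical and not part of MD-DDL; the URL is the one given above, and the script is assumed to be saved into the current directory.

```shell
#!/bin/sh
# Sketch: download the bootstrap script so it can be reviewed before running.
# fetch_bootstrap is a hypothetical helper name, not part of MD-DDL.

BOOTSTRAP_URL="https://raw.githubusercontent.com/Semprini/md-ddl/main/start-project.sh"

fetch_bootstrap() {
  # -f: fail on HTTP errors, -sS: quiet but still report errors,
  # -L: follow redirects, -o: write to a local file instead of stdout
  curl -fsSL -o start-project.sh "$BOOTSTRAP_URL"
}

# Usage (run by hand once you are in the intended parent directory):
#   fetch_bootstrap && less start-project.sh && bash start-project.sh
```

Saving to a file first trades one extra step for the chance to read what the script will do to your machine.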
Learn by conversation: MD-DDL includes Agent Guide, an AI learning companion available from the repo via Claude or Copilot in VS Code. It adapts to your role and goals, teaches through discussion rather than documentation, and routes you to the right specialist agent when you're ready to work.
Example prompts (Claude uses /agent-guide; Copilot uses @agent-guide):
/agent-guide I'm new to MD-DDL — walk me through the key concepts and help me get started.
@agent-guide I'm a data architect at a retail bank. We have 15+ legacy source systems and no canonical data model. Give me an overview of MD-DDL and help me decide where to start.
/agent-guide I need to model a Customer domain. We track individuals and business accounts. Walk me through the MD-DDL approach.
MD-DDL is not rigid or dogmatic, but a typical flow is:
- Position — discuss the architectural approach with Agent Architect: compare to alternatives, prepare material for governance councils or CIOs
- Discover — scope the domain with Agent Ontology: identify entities, relationships, events, and governance posture
- Model — write domain.md, entity files, enums, and events
- Map sources — declare source systems and column-level transforms
- Publish — declare data products with scope, shape, SLA, and masking
- Generate — produce physical artifacts with Agent Artifact
- Govern — audit standards conformance and regulatory posture with Agent Governance
Agent Guide helps you navigate between these stages and explains any concept along the way.
MD-DDL is designed to be used as a git submodule dependency. Your model files live in your own repository; MD-DDL provides the specification, agents, and examples.
If you prefer to set up manually instead of using the scripts:
mkdir myproject
cd myproject
git init
git submodule add https://github.com/Semprini/md-ddl .md-ddl
git submodule update --init
Then copy the agent wrappers for your AI tool:
- Copilot: .md-ddl/.github/agents/*.agent.md → .github/agents/
- Claude: .md-ddl/.claude/commands/*.md → .claude/commands/
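The copy step can be sketched as a small shell helper. The function name `copy_wrappers` is hypothetical; the source and destination paths are the ones listed above, and the sketch assumes it is run from the root of your project after the submodule has been initialised.

```shell
#!/bin/sh
# Sketch: copy agent wrappers out of the .md-ddl submodule.
# copy_wrappers is a hypothetical helper, not part of MD-DDL.

copy_wrappers() {
  src_dir=$1; pattern=$2; dest_dir=$3
  # Do nothing if the submodule does not contain this wrapper folder.
  [ -d "$src_dir" ] || return 0
  mkdir -p "$dest_dir"
  for f in "$src_dir"/$pattern; do
    # Guard against an unmatched glob being passed to cp literally.
    if [ -e "$f" ]; then
      cp "$f" "$dest_dir"/
    fi
  done
  return 0
}

# Copilot users:
copy_wrappers .md-ddl/.github/agents '*.agent.md' .github/agents
# Claude users:
copy_wrappers .md-ddl/.claude/commands '*.md' .claude/commands
```

Only the call for your own AI tool is needed; running both is harmless.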
If you use Claude, also update the copied .claude/commands/*.md files: the agents/ path they reference needs to become .md-ddl/agents.
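The path update for Claude commands could be scripted roughly as follows. This is a sketch under a stated assumption: that the copied wrappers refer to the canonical prompts with a bare `agents/` prefix. Check one file by hand first, since the actual wording inside the wrappers is not shown here. The helper name `repoint_agents` is hypothetical.

```shell
#!/bin/sh
# Sketch: point copied Claude commands at the submodule's agents/ folder.
# ASSUMPTION: wrappers reference the prompts with a bare "agents/" prefix.
# repoint_agents is a hypothetical helper, not part of MD-DDL.

repoint_agents() {
  for f in "$1"/*.md; do
    [ -f "$f" ] || continue
    # Rewrite "agents/" only when it is not already part of a longer path,
    # so ".md-ddl/agents/" and ".github/agents/" are left untouched.
    sed -E 's#(^|[^[:alnum:]_./-])agents/#\1.md-ddl/agents/#g' "$f" > "$f.tmp" \
      && mv "$f.tmp" "$f"
  done
}

repoint_agents .claude/commands
```

Writing to a temporary file and moving it into place avoids `sed -i`, whose syntax differs between GNU and BSD/macOS sed. Because already-prefixed paths are skipped, the helper can be re-run safely.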
Next, create your copilot-instructions.md or CLAUDE.md. See the start-project scripts for examples.
Update MD-DDL to a new release later:
git submodule update --remote .md-ddl
your-project/
.md-ddl/ ← submodule (this repo)
.github/agents/ ← Copilot agent wrappers (Copilot users)
.claude/commands/ ← Claude slash commands (Claude users)
domains/
customer/
domain.md
entities/
products/
sources/
salesforce-crm/
source.md
transforms/
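Given this layout, each modelled domain can be recognised by its domain.md file. A minimal sketch for listing them, assuming domains sit under a top-level domains/ folder as shown; `list_domains` is a hypothetical helper, not an MD-DDL command.

```shell
#!/bin/sh
# Sketch: list modelled domains by locating their domain.md files.
# list_domains is a hypothetical helper, not part of MD-DDL.

list_domains() {
  root=${1:-domains}
  # Nothing to list if the folder has not been created yet.
  [ -d "$root" ] || return 0
  find "$root" -type f -name domain.md | while IFS= read -r path; do
    # Print the folder that holds domain.md, e.g. "domains/customer".
    dirname "$path"
  done | sort
}

list_domains
```

Run from the project root, this prints one line per domain folder.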
Five reference examples of increasing complexity:
| Example | Focus | Complexity |
|---|---|---|
| Simple Customer | Minimal — one domain, three entities, one event | Starter |
| Financial Crime | AML/KYC/CTF — BIAN alignment, 15+ entities, sources, products, generated artifacts | Intermediate |
| Healthcare | FHIR R4 — HIPAA governance, source transforms, knowledge-graph product | Intermediate |
| Telecom | TM Forum ODA — PCI-DSS, associative entities, new relationship types, dimensional product | Advanced |
| Retail Sales + Retail Service | Bounded Context — two greenfield domains defining Customer differently, cross-domain Customer 360 | Advanced |
The feature coverage matrix maps every spec feature to the example that demonstrates it.
md-ddl-specification/            Normative standard
  1-Foundation.md                Start here to understand the model
  2-Domains.md … 10-Adoption.md
  MD-DDL-Complete.md             Single-file version for AI context windows
agents/                          Canonical agent prompts and skills
  agent-guide/                   Learning companion and navigator
  agent-ontology/                Domain modelling and source mapping
  agent-artifact/                Physical schema generation
  agent-architect/               Architecture philosophy, data product design, ODPS
  agent-governance/              Standards conformance and compliance auditing
examples/                        Reference examples
  Simple Customer/
  Financial Crime/
  Healthcare/
  Telecom/
  Retail Sales/
  Retail Service/
references/                      Architecture and industry reference data
  industry_standards/            BIAN, FHIR, TM Forum reference datasets
  architecture/                  Data Autonomy blog series, external references, Mermaid diagrams
This work is licensed under a Creative Commons Attribution 4.0 International License.
