Medallion

A semantic checkpointing layer for LLM agents that enables structured, versioned state management across sessions.

Overview

Medallion provides a framework-agnostic way to create, persist, and retrieve semantic checkpoints of LLM agent reasoning state. It treats reasoning state as a first-class artifact, so an agent can resume work in a new session with the context captured in earlier checkpoints.

Features

Structured Checkpoints: Versioned JSON-based medallion schema with meta, scope, summary, decisions, open questions, and affordances
Framework Agnostic: Works with LangChain, LangGraph, custom agents, and any Python LLM framework
SQLite Backend: Local file-based storage with efficient scope-based queries
Scope-Based Retrieval: Query medallions by graph nodes (subset matching) and tags (overlap matching)
Update Support: Update existing medallions with new evidence without duplication

Installation

git clone https://github.com/avimallick/medallion.git
cd medallion
pip install -e .

Requirements

Python 3.11 or higher
pydantic>=2.0
aiosqlite>=0.19.0

Quick Start

Basic Usage

from medallion import (
    SQLiteMedallionStore,
    StubMedallionLLM,
    MedallionScope,
    Evidence,
    checkpoint_session,
    load_medallions_for_scope,
)

# Initialize store and LLM
store = SQLiteMedallionStore("medallions.db")
llm = StubMedallionLLM()

# Define scope for this checkpoint
scope = MedallionScope(
    graph_nodes=["repo:muse", "module:cli"],
    tags=["project_state", "refactor"],
)

# Create checkpoint at session end
evidence = Evidence(
    session_summary="Implemented CLI module with command routing",
    transcripts=["User requested help command", "System showed help menu"],
    artefacts={"commands": ["help", "start", "stop"]},
)

medallion = checkpoint_session(store, llm, scope, evidence)
print(f"Created medallion: {medallion.meta.medallion_id}")

# Load checkpoints for scope
medallions = load_medallions_for_scope(store, scope, limit=10)
print(f"Found {len(medallions)} medallions for scope")

Loading Previous Checkpoints

# Load most recent checkpoints for a scope
scope = MedallionScope(
    graph_nodes=["repo:muse"],
    tags=["project_state"],
)

medallions = load_medallions_for_scope(store, scope, limit=5)

for medallion in medallions:
    print(f"{medallion.meta.medallion_id}: {medallion.summary.high_level}")
    if medallion.decisions:
        print(f"  Decisions: {len(medallion.decisions)}")
    if medallion.open_questions:
        print(f"  Open questions: {len(medallion.open_questions)}")

Updating Checkpoints

# Subsequent checkpoint with same scope updates existing medallion
new_evidence = Evidence(
    session_summary="Added error handling to CLI module",
    transcripts=["User triggered error", "System handled gracefully"],
    artefacts={"error_rate": 0.01},
)

# This will update the existing medallion, not create a new one
updated_medallion = checkpoint_session(store, llm, scope, new_evidence)
print(f"Updated medallion: {updated_medallion.meta.medallion_id}")
print(f"Created: {updated_medallion.meta.created_at}")
print(f"Updated: {updated_medallion.meta.updated_at}")

Core Concepts

Medallion

A medallion is a structured checkpoint containing:

Meta: ID, timestamps, schema version, model info
Scope: Graph nodes and tags for querying
Summary: High-level summary and subsystem breakdown
Decisions: Key decisions made during reasoning
Open Questions: Unresolved questions or blockers
Affordances: Recommended entry points, things to avoid, invariants
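To make the structure concrete, here is a rough sketch of what a serialized medallion might look like. Only the field names used elsewhere in this README (medallion_id, created_at, updated_at, graph_nodes, tags, high_level, statement, question) come from the project. Every value, and the key names marked "assumed" in the comments, are illustrative guesses rather than the actual schema.

```python
import json

# Hypothetical example; consult medallion/types.py for the real schema.
example_medallion = {
    "meta": {
        "medallion_id": "med-0001",            # assumed ID format
        "created_at": "2025-01-01T10:00:00Z",
        "updated_at": "2025-01-02T09:30:00Z",
        "schema_version": "1",                 # assumed key name
        "model": "stub",                       # assumed key name
    },
    "scope": {
        "graph_nodes": ["repo:muse", "module:cli"],
        "tags": ["project_state", "refactor"],
    },
    "summary": {
        "high_level": "Implemented CLI module with command routing",
        "subsystems": {"cli": "Command routing in place"},  # assumed key name
    },
    "decisions": [{"statement": "Route commands through a single dispatcher"}],
    "open_questions": [{"question": "How should unknown commands be reported?"}],
    "affordances": {                           # assumed key names
        "entry_points": ["module:cli"],
        "avoid": [],
        "invariants": [],
    },
}

# The structure is plain JSON, so it serializes without custom encoders.
print(json.dumps(example_medallion, indent=2))
```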

Scope Matching

Medallions are queried by scope using:

Graph Nodes: Subset matching (requested nodes must be subset of stored nodes)
Tags: Intersection matching (any tag overlap returns match)

Example:

# Stored medallion has: graph_nodes=["repo:muse", "module:cli", "module:store"]
# Query with: graph_nodes=["repo:muse", "module:cli"]
# ✅ Matches (requested nodes are subset of stored nodes)

# Stored medallion has: tags=["project_state", "refactor"]
# Query with: tags=["refactor"]
# ✅ Matches (tag overlap)
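The two rules above can be sketched in plain Python with sets. This is not the library's implementation (the real query runs against SQLite); the function name scope_matches is hypothetical, and the sketch assumes that both the node rule and the tag rule must hold for a match. Under that assumption, a query with no tags never matches here, which may differ from the library's behaviour.

```python
def scope_matches(
    requested_nodes: list[str],
    requested_tags: list[str],
    stored_nodes: list[str],
    stored_tags: list[str],
) -> bool:
    """Return True if a stored medallion matches the requested scope."""
    # Graph nodes: every requested node must be present on the stored medallion.
    nodes_ok = set(requested_nodes) <= set(stored_nodes)
    # Tags: at least one requested tag must overlap with the stored tags.
    tags_ok = bool(set(requested_tags) & set(stored_tags))
    # Assumption: both conditions are required for a match.
    return nodes_ok and tags_ok


stored_nodes = ["repo:muse", "module:cli", "module:store"]
stored_tags = ["project_state", "refactor"]

print(scope_matches(["repo:muse", "module:cli"], ["refactor"], stored_nodes, stored_tags))  # True
print(scope_matches(["repo:other"], ["refactor"], stored_nodes, stored_tags))               # False
```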

Evidence

Evidence is the input data for checkpoint creation:

session_summary: High-level summary of the session
transcripts: List of conversation or action transcripts
artefacts: Dictionary of relevant artefacts (file paths, IDs, etc.)
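As a rough mental model, the three fields can be pictured as the stand-in dataclass below. EvidenceSketch is a hypothetical name, not the library's Evidence class (which, given the pydantic requirement, is presumably a Pydantic model). The empty defaults are an assumption, suggested by the context manager example in this README, which constructs Evidence with only session_summary.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EvidenceSketch:
    """Hypothetical stand-in for medallion.Evidence; the real class may differ."""

    session_summary: str
    transcripts: list[str] = field(default_factory=list)      # assumed default
    artefacts: dict[str, Any] = field(default_factory=dict)   # assumed default


# Minimal evidence: only the summary is supplied.
minimal = EvidenceSketch(session_summary="Test session")

# Fuller evidence, mirroring the Quick Start example.
full = EvidenceSketch(
    session_summary="Implemented CLI module with command routing",
    transcripts=["User requested help command", "System showed help menu"],
    artefacts={"commands": ["help", "start", "stop"]},
)

print(minimal)
print(full.artefacts["commands"])
```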

Integration Examples

With LangChain

from langchain.agents import AgentExecutor
from medallion import (
    SQLiteMedallionStore,
    StubMedallionLLM,
    MedallionScope,
    Evidence,
    checkpoint_session,
    load_medallions_for_scope,
)

store = SQLiteMedallionStore("langchain_sessions.db")
llm = StubMedallionLLM()

# At session start: Load previous checkpoints
scope = MedallionScope(graph_nodes=["project:my-agent"], tags=["session"])
medallions = load_medallions_for_scope(store, scope)

# Inject context into agent
context = "\n".join([m.summary.high_level for m in medallions])

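# Run agent (invoke returns a dict; the final answer is under "output").
# Assumes the agent's prompt takes an "input" variable.
agent = AgentExecutor(...)
result = agent.invoke({"input": context + "\nUser query: ..."})

# At session end: Create checkpoint
evidence = Evidence(
    session_summary=result["output"],
    # "intermediate_steps" is only present if the executor was created
    # with return_intermediate_steps=True
    transcripts=[str(step) for step in result.get("intermediate_steps", [])],
)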
checkpoint_session(store, llm, scope, evidence)

With LangGraph

from langgraph.graph import StateGraph
from medallion import (
    SQLiteMedallionStore,
    StubMedallionLLM,
    MedallionScope,
    Evidence,
    checkpoint_session,
    load_medallions_for_scope,
)

store = SQLiteMedallionStore("langgraph_sessions.db")
llm = StubMedallionLLM()

def on_session_start(state):
    scope = MedallionScope(
        graph_nodes=[state["project_id"]],
        tags=["project_state"],
    )
    medallions = load_medallions_for_scope(store, scope)
    state["medallions"] = medallions
    return state

def on_session_end(state):
    scope = MedallionScope(
        graph_nodes=[state["project_id"]],
        tags=["project_state"],
    )
    evidence = Evidence(
        session_summary=state["summary"],
        transcripts=state.get("transcripts", []),
        artefacts=state.get("artefacts", {}),
    )
    checkpoint_session(store, llm, scope, evidence)
    return state

# Build graph
graph = StateGraph(...)
graph.add_node("load_checkpoint", on_session_start)
graph.add_node("process", your_processing_node)
graph.add_node("save_checkpoint", on_session_end)

Custom Agent

from medallion import (
    SQLiteMedallionStore,
    MedallionLLM,
    load_medallions_for_scope,
    checkpoint_session,
    MedallionScope,
    Evidence,
    Medallion,
)

class MyAgent:
    def __init__(self, store: SQLiteMedallionStore, llm: MedallionLLM):
        self.store = store
        self.llm = llm

    async def start_session(self, project_id: str):
        """Load medallions at session start."""
        scope = MedallionScope(
            graph_nodes=[f"repo:{project_id}"],
            tags=["project_state"],
        )
        medallions = load_medallions_for_scope(self.store, scope)
        
        # Inject medallion context into agent
        context = self._format_medallions(medallions)
        return context

    async def end_session(
        self,
        project_id: str,
        summary: str,
        transcripts: list[str],
        artefacts: dict,
    ):
        """Save checkpoint at session end."""
        scope = MedallionScope(
            graph_nodes=[f"repo:{project_id}"],
            tags=["project_state"],
        )
        evidence = Evidence(
            session_summary=summary,
            transcripts=transcripts,
            artefacts=artefacts,
        )
        return checkpoint_session(self.store, self.llm, scope, evidence)

    def _format_medallions(self, medallions: list[Medallion]) -> str:
        """Format medallions for agent context."""
        if not medallions:
            return "No previous checkpoints found."
        
        lines = ["Previous checkpoints:"]
        for m in medallions:
            lines.append(f"- {m.meta.medallion_id}: {m.summary.high_level}")
            if m.decisions:
                lines.append(f"  Decisions: {', '.join(d.statement for d in m.decisions)}")
            if m.open_questions:
                lines.append(f"  Open: {', '.join(q.question for q in m.open_questions)}")
        
        return "\n".join(lines)

API Reference

Core Types

Medallion: Main checkpoint data structure
MedallionScope: Query scope with graph nodes and tags
Evidence: Input data for checkpoint creation

Store

MedallionStore: Abstract interface for storage
SQLiteMedallionStore: SQLite implementation

LLM

MedallionLLM: Abstract interface for LLM operations
StubMedallionLLM: Stub implementation for testing

Session Helpers

checkpoint_session(store, llm, scope, evidence): Create or update checkpoint
load_medallions_for_scope(store, scope, limit=10): Load checkpoints for scope

Exceptions

MedallionError: Base exception
SchemaValidationError: Schema validation failures
StoreError: Storage operation failures
LLMError: LLM operation failures
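Because MedallionError is described as the base exception, callers can presumably catch it to handle any library failure, or catch a more specific class first. The sketch below uses local stand-in classes so that it runs without the library installed; the subclass relationships and the helper name run_checkpoint are assumptions, not confirmed details of the package.

```python
# Local stand-ins mirroring the names listed above (assumed hierarchy).
class MedallionError(Exception):
    """Base exception."""


class SchemaValidationError(MedallionError):
    """Schema validation failures."""


class StoreError(MedallionError):
    """Storage operation failures."""


class LLMError(MedallionError):
    """LLM operation failures."""


def run_checkpoint(step) -> str:
    """Run a checkpoint step and map failures to a short status string."""
    try:
        step()
    except StoreError as exc:
        # Most specific handler first: storage problems may be retryable.
        return f"storage failed: {exc}"
    except MedallionError as exc:
        # Catch-all for the remaining library errors.
        return f"checkpoint failed: {exc}"
    return "ok"


def failing_llm_step() -> None:
    raise LLMError("timeout")


print(run_checkpoint(failing_llm_step))  # checkpoint failed: timeout
print(run_checkpoint(lambda: None))      # ok
```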

Advanced Usage

Async API

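For async frameworks, call the async implementations directly. These names are underscore-prefixed, which by Python convention marks them as internal, so they may change between versions: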

from medallion.session import _checkpoint_session_async, _load_medallions_for_scope_async

# Async checkpoint creation
medallion = await _checkpoint_session_async(store, llm, scope, evidence)

# Async loading
medallions = await _load_medallions_for_scope_async(store, scope, limit=10)

Custom LLM Implementation

Implement the MedallionLLM protocol to integrate with your LLM:

from medallion.llm import MedallionLLM
from medallion.types import Medallion, MedallionScope, Evidence

class MyMedallionLLM:
    """Matches the MedallionLLM protocol structurally, without inheriting from it."""

    async def generate(self, scope: MedallionScope, evidence: Evidence) -> Medallion:
        # Call your LLM API
        # Parse response into Medallion
        # Return medallion
        raise NotImplementedError

    async def update(self, existing: Medallion, new_evidence: Evidence) -> Medallion:
        # Call your LLM API to merge evidence
        # Update existing medallion
        # Return updated medallion
        raise NotImplementedError

Context Manager Usage

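import asyncio

from medallion import Evidence, MedallionScope, SQLiteMedallionStore, StubMedallionLLM
from medallion.session import _checkpoint_session_async

async def main() -> None:
    async with SQLiteMedallionStore("medallions.db") as store:
        llm = StubMedallionLLM()

        scope = MedallionScope(graph_nodes=["repo:muse"], tags=["test"])
        evidence = Evidence(session_summary="Test session")

        medallion = await _checkpoint_session_async(store, llm, scope, evidence)
        # Database connection automatically closed on exit

asyncio.run(main())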

Development

Running Tests

pytest

With coverage:

pytest --cov=medallion --cov-report=html

Type Checking

mypy medallion

Linting

ruff check medallion
black medallion

Documentation

Architecture

Module Overview

Medallion is organized into modular, composable components:

medallion/
├── types.py          # Core data models (Medallion, MedallionScope, Evidence)
├── store.py          # Storage interface (MedallionStore Protocol)
├── sqlite_store.py   # SQLite storage implementation
├── llm.py            # LLM interface and stub implementation
└── session.py        # High-level session helpers (checkpoint, load)

Component Relationships

┌─────────────────────────────────────────────────────────────┐
│                      Session Layer                           │
│  (checkpoint_session, load_medallions_for_scope)            │
│                  ┌──────────────────┐                        │
│                  │  session.py      │                        │
│                  └────────┬─────────┘                        │
└───────────────────────────┼──────────────────────────────────┘
                            │
        ┌───────────────────┴───────────────────┐
        │                                       │
┌───────▼────────┐                    ┌────────▼──────┐
│   Store Layer  │                    │   LLM Layer   │
│  (SQLiteStore) │                    │ (MedallionLLM)│
│                │                    │               │
│  ┌──────────┐  │                    │  ┌──────────┐ │
│  │store.py  │  │                    │  │  llm.py  │ │
│  └────┬─────┘  │                    │  └────┬─────┘ │
│       │        │                    │       │       │
│  ┌────▼─────┐  │                    │       │       │
│  │sqlite_   │  │                    │       │       │
│  │store.py  │  │                    │       │       │
│  └────┬─────┘  │                    │       │       │
└───────┼────────┘                    └───────┼───────┘
        │                                       │
        │                    ┌──────────────────┘
        │                    │
┌───────▼────────────────────▼─────────┐
│         Data Layer                   │
│    (Medallion, Evidence, etc.)       │
│                                      │
│  ┌──────────────────────────────┐   │
│  │        types.py              │   │
│  │  - Medallion (main schema)   │   │
│  │  - MedallionScope            │   │
│  │  - Evidence                  │   │
│  │  - Exceptions                │   │
│  └──────────────────────────────┘   │
└──────────────────────────────────────┘

Data Flow

Checkpoint Creation:

Evidence → LLM.generate() → Medallion → Store.create()

Checkpoint Update:

Evidence → Store.get_latest_for_scope() → LLM.update() → Store.update()

Checkpoint Retrieval:

MedallionScope → Store.get_latest_for_scope() → List[Medallion]

Design Principles

Dependency Injection: All dependencies (store, LLM) are injected, not hardcoded
Protocol-Based: Interfaces defined as typing.Protocol for flexibility
Async-First: All I/O operations are async, with sync wrappers for convenience
Framework-Agnostic: No framework-specific dependencies in core library
Type-Safe: Strict typing with mypy, Pydantic for runtime validation

License

MIT License - see LICENSE file for details.

Contributing

Contributions welcome! Please see CONTRIBUTING.md for guidelines.
