GitHub - fraware/provable-discovery-memory: Make scientific discovery results portable, verifiable, and durable across teams, toolchains, and time

######################################################################################################################################
#                                                                                                                                    #
#   ____                      _     _         ____  _                                      __  __                                    #
#  |  _ \ _ __ _____   ____ _| |__ | | ___   |  _ \(_)___  ___ _____   _____ _ __ _   _   |  \/  | ___ _ __ ___   ___  _ __ _   _    #
#  | |_) | '__/ _ \ \ / / _` | '_ \| |/ _ \  | | | | / __|/ __/ _ \ \ / / _ \ '__| | | |  | |\/| |/ _ \ '_ ` _ \ / _ \| '__| | | |   #
#  |  __/| | | (_) \ V / (_| | |_) | |  __/  | |_| | \__ \ (_| (_) \ V /  __/ |  | |_| |  | |  | |  __/ | | | | | (_) | |  | |_| |   #
#  |_|   |_|  \___/ \_/ \__,_|_.__/|_|\___|  |____/|_|___/\___\___/ \_/ \___|_|   \__, |  |_|  |_|\___|_| |_| |_|\___/|_|   \__, |   #
#                                                                                  |___/                                     |___/   #
#                                                                                                                                    #
######################################################################################################################################

Provable Discovery Memory is a schema-first, verification-oriented scientific discovery platform. It turns backend-generated candidates into auditable artifacts with deterministic validation, benchmark evidence, formal proof obligations, and release manifests with checksums.

Inspired by IBM AI-Descartes, this repository extends that research direction into a contract-driven, reproducible engineering platform.

It is designed to separate discovery generation from discovery validation:

backends generate candidate laws,
canonical schemas enforce artifact contracts,
formal emission encodes proof obligations,
kernels and benchmarks evaluate quality and regressions,
publish/export flows make outputs reproducible and inspectable.

Vision

The long-term vision is to make scientific discovery results portable, verifiable, and durable across teams, toolchains, and time.

In many research stacks, even strong results can be difficult to trust or reuse because:

the data contract is implicit,
transformations are hidden inside scripts,
proof claims are disconnected from formal artifacts,
benchmark claims are difficult to reproduce,
release artifacts cannot be independently audited.

Why This Exists

Most discovery pipelines fail at handoff time: data contracts remain implicit, transformations stay hidden, and quality claims are difficult to verify independently.

Provable Discovery Memory addresses this by treating discovery as a governed artifact lifecycle:

canonical schemas define what each artifact means,
adapters normalize heterogeneous backend outputs into shared contracts,
formal emission links statements to machine-checkable obligations,
deterministic kernels and benchmarks provide repeatable quality signals,
publish manifests with checksums provide offline verifiability.

The goal is not just better model outputs.
The goal is reliable scientific memory: results that can be inspected, re-run, compared, and inherited.

Potential

If this architecture matures and achieves broad adoption, it can become a common interoperability layer for theory-guided discovery systems:

Backend interoperability: compare multiple discovery engines on the same canonical problems and metrics.
Reproducible science operations: move from ad-hoc experiments to repeatable, auditable release flows.
Formal traceability: tie confidence claims to explicit proof status rather than narrative summaries.
Cross-team portability: hand off artifacts across organizations without losing semantics.
Long-horizon benchmarking: track progress and regressions over time with stable contracts and thresholds.

This creates a path from promising one-off runs to cumulative scientific infrastructure.

Core Capabilities

Adapter interoperability: supports AI-Descartes import/export and additional backends (local_json, http_discovery).
Canonical contract enforcement: validates corpus and artifacts against JSON schemas.
Formal traceability: emits Lean obligations and tracks proof status.
Benchmark assurance: evaluates deterministic quality metrics against baseline thresholds.
Release governance: produces signed-off bundles/manifests with integrity checks.
Portal export: generates consumable views for downstream UI/reporting.

Architecture at a Glance

Backend output
  -> adapter normalization
  -> canonical artifacts (schema-validated)
  -> formal emission (proof obligations + Lean)
  -> benchmark report (threshold-gated)
  -> publish bundle + release manifest (checksums)
  -> portal export

Why We Need the Community

This project spans schemas, adapters, formal methods, benchmarking, and release operations. No single team can sustainably cover the full surface area of domain knowledge, backend diversity, and operational hardening.

Community participation is essential for:

More backend adapters: broaden real-world applicability beyond current integrations.
Richer formalization: extend supported statement grammars and proof workflows safely.
Stronger kernels and benchmarks: improve scientific signal quality and failure detection.
Domain-specific corpora: contribute canonical problem families with high-quality metadata.
Operational resilience: harden CI/release workflows and observability for production reliability.
Documentation quality: keep guidance accurate as architecture and capabilities evolve.

What We Ask from Contributors

Keep changes schema-first and contract-preserving.
Add deterministic tests for new behavior and edge cases.
Treat proof and benchmark claims as evidence-backed, not aspirational.
Prefer explicit failure over silent fallback.
Document assumptions, migration impacts, and operational implications.

Quick Start

Prerequisites

Python 3.11+
uv
Lean toolchain (elan, lake, lean)
Node 20+ and pnpm

Install

uv sync --extra dev --locked
pnpm --dir portal install --frozen-lockfile

Health and baseline checks

uv run pdm doctor
uv run pdm validate-corpus
uv run ruff check .
uv run mypy pipeline/src adapters kernels
uv run pytest -q
cd formal && lake build

End-to-End Workflows

1) AI-Descartes import/export

uv run pdm import-ai-descartes --src tests/fixtures/ai_descartes/export_golden/kepler_solar --problem-id kepler_solar
uv run pdm export-backend --problem-id kepler_solar --backend ai_descartes

2) Formal emission and benchmarking

uv run pdm emit-lean --problem-id kepler_solar
uv run pdm benchmark --problem-ids kepler_solar,langmuir_single_site

3) Publish and portal export

uv run pdm publish --problem-id kepler_solar
uv run pdm portal-export --out portal/public/portal_export.json
pnpm --dir portal build

For the complete operator runbook (including strict import mode, release checks, and troubleshooting), see:

docs/end-to-end-ai-descartes-offline-runbook.md

Repository Structure

schemas/ - canonical artifact contracts
corpus/ - canonical problems and suites
adapters/ - backend integrations (ai_descartes, local_json, http_discovery)
pipeline/ - orchestration and CLI
formal/ - Lean core, generated obligations, and formal tooling
kernels/ - deterministic quality metrics
benchmarks/ - benchmark thresholds and benchmark documentation
portal/ - UI and exported data consumption
scripts/ - certification, governance, and operational checks
artifacts/ - generated run/release outputs (not source-of-truth code)

Quality and Release Gates

Primary quality and release gates include:

schema validation (pdm validate-corpus, schema checks),
type/lint/test checks (mypy, ruff, pytest),
Lean build and generated obligation checks,
benchmark threshold enforcement (benchmarks/baseline_thresholds.json),
release governance and checksum verification.

Relevant workflows:

Documentation

Main docs index: docs/README.md
Architecture: docs/architecture.md
Engineering guardrails: docs/ENGINEERING_GUARDRAILS.md
Quality gates: docs/QUALITY_GATES.md
AI-Descartes layouts: docs/adapters/ai-descartes-layout.md
Formal grammar/CI:
- docs/formal/lean-obligation-grammar.md
- docs/formal/lean-ci-strategy.md

Contributing

Contributions are welcome across adapters, kernels, formalization, tests, and documentation.

Please read:

CONTRIBUTING.md

Expectations:

keep changes schema-first and contract-preserving,
include deterministic tests,
document behavioral or contract changes,
prefer explicit failures over silent fallback behavior.

Security

Please report vulnerabilities privately.

SECURITY.md

Code of Conduct

All contributors and maintainers are expected to follow:

CODE_OF_CONDUCT.md

License

This project is licensed under Apache-2.0.

LICENSE

Project Metadata

Package: provable-discovery-memory
Current version: 0.1.0 (from pyproject.toml)

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
.github		.github
adapters		adapters
benchmarks		benchmarks
configs		configs
corpus		corpus
docs		docs
external		external
formal		formal
kernels		kernels
pipeline/src/pdm_pipeline		pipeline/src/pdm_pipeline
portal		portal
schemas		schemas
scripts		scripts
tests		tests
.editorconfig		.editorconfig
.gitattributes		.gitattributes
.gitignore		.gitignore
.python-version		.python-version
CODEOWNERS		CODEOWNERS
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
justfile		justfile
lean-toolchain		lean-toolchain
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Vision

Why This Exists

Potential

Core Capabilities

Architecture at a Glance

Why We Need the Community

What We Ask from Contributors

Quick Start

Prerequisites

Install

Health and baseline checks

End-to-End Workflows

1) AI-Descartes import/export

2) Formal emission and benchmarking

3) Publish and portal export

Repository Structure

Quality and Release Gates

Documentation

Contributing

Security

Code of Conduct

License

Project Metadata

About

Uh oh!

Releases 1

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Vision

Why This Exists

Potential

Core Capabilities

Architecture at a Glance

Why We Need the Community

What We Ask from Contributors

Quick Start

Prerequisites

Install

Health and baseline checks

End-to-End Workflows

1) AI-Descartes import/export

2) Formal emission and benchmarking

3) Publish and portal export

Repository Structure

Quality and Release Gates

Documentation

Contributing

Security

Code of Conduct

License

Project Metadata

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 1

Contributors

Uh oh!

Languages