Skip to content

fraware/provable-discovery-memory

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
######################################################################################################################################
#                                                                                                                                    #
#   ____                      _     _         ____  _                                      __  __                                    #
#  |  _ \ _ __ _____   ____ _| |__ | | ___   |  _ \(_)___  ___ _____   _____ _ __ _   _   |  \/  | ___ _ __ ___   ___  _ __ _   _    #
#  | |_) | '__/ _ \ \ / / _` | '_ \| |/ _ \  | | | | / __|/ __/ _ \ \ / / _ \ '__| | | |  | |\/| |/ _ \ '_ ` _ \ / _ \| '__| | | |   #
#  |  __/| | | (_) \ V / (_| | |_) | |  __/  | |_| | \__ \ (_| (_) \ V /  __/ |  | |_| |  | |  | |  __/ | | | | | (_) | |  | |_| |   #
#  |_|   |_|  \___/ \_/ \__,_|_.__/|_|\___|  |____/|_|___/\___\___/ \_/ \___|_|   \__, |  |_|  |_|\___|_| |_| |_|\___/|_|   \__, |   #
#                                                                                  |___/                                     |___/   #
#                                                                                                                                    #
######################################################################################################################################

Provable Discovery Memory is a schema-first, verification-oriented scientific discovery platform. It turns backend-generated candidates into auditable artifacts with deterministic validation, benchmark evidence, formal proof obligations, and release manifests with checksums.

Inspired by IBM AI-Descartes, this repository extends that research direction into a contract-driven, reproducible engineering platform.

It is designed to separate discovery generation from discovery validation:

  • backends generate candidate laws,
  • canonical schemas enforce artifact contracts,
  • formal emission encodes proof obligations,
  • kernels and benchmarks evaluate quality and regressions,
  • publish/export flows make outputs reproducible and inspectable.

Vision

The long-term vision is to make scientific discovery results portable, verifiable, and durable across teams, toolchains, and time.

In many research stacks, even strong results can be difficult to trust or reuse because:

  • the data contract is implicit,
  • transformations are hidden inside scripts,
  • proof claims are disconnected from formal artifacts,
  • benchmark claims are difficult to reproduce,
  • release artifacts cannot be independently audited.

Why This Exists

Most discovery pipelines fail at handoff time: data contracts remain implicit, transformations stay hidden, and quality claims are difficult to verify independently.

Provable Discovery Memory addresses this by treating discovery as a governed artifact lifecycle:

  1. canonical schemas define what each artifact means,
  2. adapters normalize heterogeneous backend outputs into shared contracts,
  3. formal emission links statements to machine-checkable obligations,
  4. deterministic kernels and benchmarks provide repeatable quality signals,
  5. publish manifests with checksums provide offline verifiability.

The goal is not just better model outputs.
The goal is reliable scientific memory: results that can be inspected, re-run, compared, and inherited.

Potential

If this architecture matures and achieves broad adoption, it can become a common interoperability layer for theory-guided discovery systems:

  • Backend interoperability: compare multiple discovery engines on the same canonical problems and metrics.
  • Reproducible science operations: move from ad-hoc experiments to repeatable, auditable release flows.
  • Formal traceability: tie confidence claims to explicit proof status rather than narrative summaries.
  • Cross-team portability: hand off artifacts across organizations without losing semantics.
  • Long-horizon benchmarking: track progress and regressions over time with stable contracts and thresholds.

This creates a path from promising one-off runs to cumulative scientific infrastructure.

Core Capabilities

  • Adapter interoperability: supports AI-Descartes import/export and additional backends (local_json, http_discovery).
  • Canonical contract enforcement: validates corpus and artifacts against JSON schemas.
  • Formal traceability: emits Lean obligations and tracks proof status.
  • Benchmark assurance: evaluates deterministic quality metrics against baseline thresholds.
  • Release governance: produces signed-off bundles/manifests with integrity checks.
  • Portal export: generates consumable views for downstream UI/reporting.

Architecture at a Glance

Backend output
  -> adapter normalization
  -> canonical artifacts (schema-validated)
  -> formal emission (proof obligations + Lean)
  -> benchmark report (threshold-gated)
  -> publish bundle + release manifest (checksums)
  -> portal export

Why We Need the Community

This project spans schemas, adapters, formal methods, benchmarking, and release operations. No single team can sustainably cover the full surface area of domain knowledge, backend diversity, and operational hardening.

Community participation is essential for:

  • More backend adapters: broaden real-world applicability beyond current integrations.
  • Richer formalization: extend supported statement grammars and proof workflows safely.
  • Stronger kernels and benchmarks: improve scientific signal quality and failure detection.
  • Domain-specific corpora: contribute canonical problem families with high-quality metadata.
  • Operational resilience: harden CI/release workflows and observability for production reliability.
  • Documentation quality: keep guidance accurate as architecture and capabilities evolve.

What We Ask from Contributors

  • Keep changes schema-first and contract-preserving.
  • Add deterministic tests for new behavior and edge cases.
  • Treat proof and benchmark claims as evidence-backed, not aspirational.
  • Prefer explicit failure over silent fallback.
  • Document assumptions, migration impacts, and operational implications.

Quick Start

Prerequisites

  • Python 3.11+
  • uv
  • Lean toolchain (elan, lake, lean)
  • Node 20+ and pnpm

Install

uv sync --extra dev --locked
pnpm --dir portal install --frozen-lockfile

Health and baseline checks

uv run pdm doctor
uv run pdm validate-corpus
uv run ruff check .
uv run mypy pipeline/src adapters kernels
uv run pytest -q
cd formal && lake build

End-to-End Workflows

1) AI-Descartes import/export

uv run pdm import-ai-descartes --src tests/fixtures/ai_descartes/export_golden/kepler_solar --problem-id kepler_solar
uv run pdm export-backend --problem-id kepler_solar --backend ai_descartes

2) Formal emission and benchmarking

uv run pdm emit-lean --problem-id kepler_solar
uv run pdm benchmark --problem-ids kepler_solar,langmuir_single_site

3) Publish and portal export

uv run pdm publish --problem-id kepler_solar
uv run pdm portal-export --out portal/public/portal_export.json
pnpm --dir portal build

For the complete operator runbook (including strict import mode, release checks, and troubleshooting), see:

Repository Structure

  • schemas/ - canonical artifact contracts
  • corpus/ - canonical problems and suites
  • adapters/ - backend integrations (ai_descartes, local_json, http_discovery)
  • pipeline/ - orchestration and CLI
  • formal/ - Lean core, generated obligations, and formal tooling
  • kernels/ - deterministic quality metrics
  • benchmarks/ - benchmark thresholds and benchmark documentation
  • portal/ - UI and exported data consumption
  • scripts/ - certification, governance, and operational checks
  • artifacts/ - generated run/release outputs (not source-of-truth code)

Quality and Release Gates

Primary quality and release gates include:

  • schema validation (pdm validate-corpus, schema checks),
  • type/lint/test checks (mypy, ruff, pytest),
  • Lean build and generated obligation checks,
  • benchmark threshold enforcement (benchmarks/baseline_thresholds.json),
  • release governance and checksum verification.

Relevant workflows:

Documentation

Contributing

Contributions are welcome across adapters, kernels, formalization, tests, and documentation.

Please read:

Expectations:

  • keep changes schema-first and contract-preserving,
  • include deterministic tests,
  • document behavioral or contract changes,
  • prefer explicit failures over silent fallback behavior.

Security

Please report vulnerabilities privately.

Code of Conduct

All contributors and maintainers are expected to follow:

License

This project is licensed under Apache-2.0.

Project Metadata

  • Package: provable-discovery-memory
  • Current version: 0.1.0 (from pyproject.toml)

About

Make scientific discovery results portable, verifiable, and durable across teams, toolchains, and time

Topics

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Contributors