Problem
Automated experimentation can be valuable later, but it must not enter the product core before feedback, snapshots, calibration analytics, and dataset export are stable. Research should stay offline, reproducible, and review-gated.
User-facing flow
- Calibration dataset export produces reproducible JSONL/CSV records.
- Offline scripts run controlled parameter sweeps or model comparisons.
- Backtests produce validation reports.
- Human review decides whether deterministic formulas, thresholds, or recommendation rules should change.
- Accepted changes are versioned and documented before they affect production outputs.
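As a minimal sketch of what "reproducible" could mean for the JSONL export step, the records can be serialized in a canonical order and the dataset version derived from a hash of the resulting bytes. The function name `export_jsonl`, the `id` sort key, and the 12-character version tag are illustrative assumptions, not part of the existing export code.

```python
import hashlib
import json


def export_jsonl(records, sort_key="id"):
    """Return (jsonl_text, version) for a list of dict records.

    Hypothetical helper: the field name used for ordering and the
    truncated SHA-256 version tag are assumptions for illustration.
    """
    # Sort rows and keys so the same data always yields the same bytes.
    ordered = sorted(records, key=lambda r: r[sort_key])
    lines = [json.dumps(r, sort_keys=True, separators=(",", ":")) for r in ordered]
    payload = "\n".join(lines) + "\n"
    # The version is derived from content, so it changes only when the data does.
    version = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return payload, version
```

Because the version depends only on content, an offline sweep or backtest can record it alongside its results, and a later rerun can confirm it used the same export.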
Definition of done
- Sandbox runs outside production decision paths.
- Experiments are reproducible and tied to exported dataset versions.
- Reports compare model versions against subjective outcomes and data quality.
- No LLM or automated research loop changes readiness, recommendations, or calibration without review.
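The criteria above could be sketched as follows: each sweep result carries the dataset version it was computed from, and nothing is eligible for promotion until a named reviewer is attached. All names here (`SweepReport`, `sweep_thresholds`, `approve`, `can_promote`) and the simple agreement metric are hypothetical, chosen only to illustrate the review gate.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SweepReport:
    dataset_version: str   # ties the result to one exported dataset
    threshold: float       # parameter value under test
    agreement: float       # fraction of rows matching the subjective outcome
    reviewed_by: Optional[str] = None  # stays None until a human signs off


def sweep_thresholds(dataset_version, scores, outcomes, thresholds):
    """Run a deterministic threshold sweep; assumes scores is non-empty."""
    reports = []
    for t in thresholds:
        hits = sum((s >= t) == bool(o) for s, o in zip(scores, outcomes))
        reports.append(SweepReport(dataset_version, t, hits / len(scores)))
    return reports


def approve(report, reviewer):
    """Return a copy of the report marked as reviewed; the original is unchanged."""
    return replace(report, reviewed_by=reviewer)


def can_promote(report):
    """Only human-reviewed reports may be considered for production changes."""
    return report.reviewed_by is not None
```

Keeping the report immutable means the sandbox can only produce new candidate reports and cannot silently mark its own output as reviewed.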
Linked existing issues
Migration note
Created from the accepted end-to-end epic proposal in docs/product/END_TO_END_EPICS_PROPOSAL.md. This epic intentionally keeps speculative ML and automated experimentation separate from the deterministic production core.