diff --git a/docs/refactor/NEXT_WORK_PLAN_2026-06.md b/docs/refactor/NEXT_WORK_PLAN_2026-06.md
index 5b5a362..ed6b3d4 100644
--- a/docs/refactor/NEXT_WORK_PLAN_2026-06.md
+++ b/docs/refactor/NEXT_WORK_PLAN_2026-06.md
@@ -125,6 +125,35 @@ Claude:
 No se ejecutó freeze ni submission. No se reabrió búsqueda, HPO, champion,
 intervalos conformal, validación conformal ni optimización portfolio.
 
+## Codex execution (2026-06-14, theoretical bridge and visual QA)
+
+Claude left commit `efb56d1` as a follow-up repair to PR #73: the
+supplement now presents A.1 before A.2, so that the Markov optimality
+proposition precedes the cluster-aware tightening. Codex closed the
+editorial items left open by that audit:
+
+- **Theoretical bridge in body and TEX.** Theorem 1, Proposition A.1, and
+  Proposition A.2 now read as a triptych: main guarantee under weighted
+  funded-set validity, first-moment optimality without additional structure,
+  and cluster-aware sensitivity under explicit cross-cluster independence.
+- **Temporality defense in A.2.** The supplement explains why the
+  period-grade partition is the most reasonable defense for a temporal panel:
+  it separates calendar cohorts while conditioning on grade; period-only
+  ignores the risk mix and grade-only cuts across temporal dependence.
+- **Less repetition and less "AI slop".** The bounds menu was compressed so
+  that the phrase "tightening requires additional assumptions" reads as a
+  pricing of assumptions, not a defensive echo. Abstract, introduction,
+  contributions, results, and conclusion now read more naturally and keep
+  the narrow claim: a post-hoc conformal-robust certificate, not a new
+  learner or live deployment.
+- **Submission visual QA.** `paper-submission` was regenerated, the IJDS PDF
+  was recompiled, and key pages were inspected in a local browser via PNG
+  views; in particular, Fig. 13 and Fig. 14 remain legible in grayscale, with
+  distinguishable labels and colorbar.
+
+No freeze or submission was run. No protected stages or frozen champion
+artifacts were touched.
+
 ---
 
 ## AUDITORÍA POST-EJECUCIÓN (2026-06-13, Claude) — leer antes de continuar
diff --git a/paper/CRPTO_ijds.qmd b/paper/CRPTO_ijds.qmd
index 0024fc7..0af0abc 100644
--- a/paper/CRPTO_ijds.qmd
+++ b/paper/CRPTO_ijds.qmd
@@ -27,10 +27,10 @@ execute:
 
 Credit allocation is a data-science-for-decisions problem: calibrated default
 probabilities matter only after they shape which loans are funded. We introduce
-Conformal Robust Predict-Then-Optimize (CRPTO) as a reusable post-hoc decision
-audit that maps a frozen calibrated probability-of-default artifact through
-Mondrian conformal intervals into robust portfolio constraints and an exact
-empirical funded-set audit. On a 276,869-loan out-of-time Lending Club
+Conformal Robust Predict-Then-Optimize (CRPTO), a post-hoc bridge that maps a
+frozen calibrated probability-of-default artifact through Mondrian conformal
+intervals into robust portfolio constraints and an empirical funded-set audit.
+On a 276,869-loan out-of-time Lending Club
 evaluation, the promoted economic policy earns `$170.5K` on a `$1M` budget while
 passing the $\alpha = 0.01$ funded-set audit ($V(\alpha) = 0.028875$,
 $\Gamma_{\mathrm{CP}} = 0.187987$, zero violation). The final robust region
@@ -38,16 +38,13 @@ contains `45/45` alpha-safe policies across the evaluated risk, uncertainty, and
 aversion grid, indicating that the result is not a single-point artifact.
 External frozen replications on Prosper marketplace loans and Freddie/Mendeley
 single-family mortgages preserve the conformal gates and produce positive robust
-LP objectives, strengthening the claim that the CRPTO recipe is not merely a
-Lending-Club numerical artifact. Across these external panels the price of
-robustness is a positive premium that grows with the panel default rate (from
-`+1.0%` to `+9.5%`), turning replication into an economically interpretable
-stress test rather than a defensive checkmark. The contribution is an auditable
-conformal-robust credit-portfolio decision certificate: it connects real credit
-data, calibrated predictive models, robust funding decisions, and a drift
-harness that certifies the prediction-to-decision certificate chain regenerates
-bit-exactly from frozen artifacts, while keeping the statistical guarantee
-boundary explicit.
+LP objectives, so the result is not confined to the Lending Club panel.
+Across these external panels the price of robustness is a positive premium that
+grows with the panel default rate (from `+1.0%` to `+9.5%`). The contribution is
+a conformal-robust credit-portfolio decision certificate: it connects real
+credit data, calibrated predictive models, robust funding decisions, and a drift
+harness that regenerates the prediction-to-decision chain bit-exactly from
+frozen artifacts while keeping the statistical guarantee boundary explicit.
 
 **Keywords:** conformal prediction; robust optimization; predict-then-optimize;
 credit risk; portfolio optimization; reproducible data science.
@@ -70,21 +67,21 @@ conservative can pass every risk check while destroying economic value. The
 scientific question in this paper is therefore not whether one can build a
 slightly better credit classifier. It is whether finite-sample predictive
 uncertainty can be carried into a robust portfolio decision in a way that is
-transparent enough for a reviewer to audit. This framing is not merely
-rhetorical: in a pre-registered randomized trial, conformal prediction sets have
-been shown to measurably improve human decision making relative to fixed-size
-sets with the same coverage [@cresswell2024], which is exactly the
-committee-facing setting CRPTO targets.
+transparent enough for a reviewer to audit. This has practical stakes. In a
+pre-registered randomized trial, conformal prediction sets
+improved human decision making relative to fixed-size sets with the same
+coverage [@cresswell2024]. CRPTO takes that committee-facing idea into a credit
+portfolio setting, where the uncertainty summary must change a funding decision
+or it is just another report.
 
 CRPTO answers this question with a post-hoc, reproducible pipeline. It starts
 from a calibrated CatBoost PD model, constructs Mondrian conformal intervals
 over PD-scale predictions, and maps the upper conformal endpoint into robust
-portfolio constraints. The procedure is deliberately modular: the predictive
-model, conformal layer, optimization policy, and paper artifacts each have
-separate contracts. This modularity is a feature rather than an engineering
-accident. It lets the paper ask whether a frozen prediction system can be
-converted into a defendable decision system without reopening hyperparameter
-search whenever the manuscript or appendix changes.
+portfolio constraints. The pipeline is modular by design: the predictive model,
+conformal layer, optimization policy, and paper artifacts each have separate
+contracts. That separation lets the paper ask whether a frozen prediction
+system can be converted into a defendable decision system without reopening
+hyperparameter search whenever the manuscript or appendix changes.
 
 The empirical setting is the Lending Club retail-loan panel, with an
 out-of-time evaluation set of 276,869 loans. The promoted economic policy earns
@@ -100,28 +97,21 @@ out-of-sample and out-of-time splits. These replications are not new champions;
 they test whether the same PD-to-conformal-to-LP recipe remains economically
 usable on different credit products.
 
-The paper makes four contributions. First, it gives a reusable CRPTO
-construction for credit portfolios: frozen calibrated PD, Mondrian conformal
-uncertainty, and robust budgeted optimization as a post-hoc decision audit.
-Second, it positions that
-construction against the nearest decision literature: data-driven robust
-optimization, P2P lending portfolio models, conformal credit scoring, and
-decision-focused learning. The novelty claim is therefore specific, not
-leaderboard-oriented: CRPTO is an auditable conformal robust credit-portfolio
-decision built from frozen calibrated PD artifacts. Third, it provides an
-artifact-backed empirical study where every paper table and figure is generated
-from frozen outputs rather than manually transcribed summaries. Fourth, it adds
-external economic replications on Prosper and Freddie/Mendeley that separate the
-methodological claim from the idiosyncrasies of one P2P panel and reveal an
-economically interpretable pattern: under blind frozen application the price of
-robustness is a positive premium that grows with the panel default rate, while it
-is favorable only on the selected Lending Club champion. The key claim is
-deliberately surgical: CRPTO maps frozen calibrated PD artifacts into a robust
+The paper makes four contributions. First, it gives a CRPTO construction for
+credit portfolios: frozen calibrated PD, Mondrian conformal uncertainty, and
+robust budgeted optimization as a post-hoc decision audit. Second, it locates
+that construction relative to data-driven robust optimization, P2P lending
+portfolio models, conformal credit scoring, and decision-focused learning.
+Third, it provides an artifact-backed empirical study where every table and
+figure is generated from frozen outputs rather than manually transcribed
+summaries. Fourth, it adds external economic replications on Prosper and
+Freddie/Mendeley, separating the methodological claim from one P2P panel. The
+key claim is narrow: CRPTO maps frozen calibrated PD artifacts into a robust
 funded set, reports the portfolio-level conformal premium
-$\Gamma_{\mathrm{CP}}$, and verifies exact alpha-safe weighted miscoverage on the promoted Lending Club
-portfolio. The same conformal and LP gates remain viable on two additional credit
-datasets, and the paper documents the governance boundary between safe
-paper-facing reruns and protected stages that would change the promoted champion.
+$\Gamma_{\mathrm{CP}}$, and verifies exact alpha-safe weighted miscoverage on the
+promoted Lending Club portfolio. The same conformal and LP gates remain viable
+on two additional credit datasets, while the paper keeps the governance boundary
+between safe paper-facing reruns and protected champion-changing stages visible.
 
 Read as data science for decisions, the paper's four components are explicit.
 The data component is a static Lending Club OOT panel, with Prosper and
@@ -368,23 +358,16 @@ certified after the frozen selection.
 
 ![The bound claim stack separates deterministic accounting, the weighted-validity assumption, and the frozen exact certificate.](../reports/crpto/figures/crpto_fig20_bound_claim_layers.png){#fig-bound-claim-stack width="94%" fig-alt="Four-block bound claim stack separating conformal endpoint, deterministic identity, weighted validity assumption, and exact frozen certificate."}
 
-This theory is intentionally not presented as a universal dependence-free
-tightening for all adaptive credit policies. The Lending Club evaluation is
-temporal, and temporal dependence is handled empirically through strict
-out-of-time splits, temporal backtesting, and robustness appendices. The online
-supplement adds a cluster-aware conditional proposition: dependence may be
-arbitrary inside period or grade clusters, but sharper Hoeffding/Bernstein-style
-bounds require independence or conditional independence across those clusters
-[@hoeffding1963; @boucheron2013concentration].
-Markov therefore remains the main distribution-free claim, while the
-dependence-aware material is a transparent journal caveat rather than an
-overstated theorem.
-
-The compact validity ladder below is the paper's guardrail against
-overclaiming. CRPTO uses the first two levels as evidence, states weighted
-funded-set validity as the theorem's portfolio-level assumption, reports
-multi-distribution checks as diagnostics, and leaves online/live control for a
-new protocol.
+Dependence is handled conservatively rather than assumed away. The main bound
+does not require loan-level independence. Temporal structure is addressed by
+the out-of-time design and backtests; any sharper concentration argument is
+kept in the supplement, where the extra independence structure is stated
+explicitly.
+
+The compact validity ladder below fixes that boundary. CRPTO uses the first two
+levels as evidence, states weighted funded-set validity as the theorem's
+portfolio-level assumption, reports multi-distribution checks as diagnostics,
+and leaves online/live control for a new protocol.
 
 | Validity level | What it supports | CRPTO status |
 |---|---|---|
@@ -457,6 +440,19 @@ not the sharpest possible tail bound. The exact certificate in this paper is
 the empirical audit of the frozen selected policy, not a stronger
 post-selection conformal theorem.
 
+The theorem and the two supplement propositions should be read as one small
+triptych. Theorem 1 gives the paper's guarantee once weighted funded-set
+validity is accepted. Proposition A.1 shows that, without additional structure,
+Markov is not a placeholder for a missing second-moment bound; it is the sharp
+first-moment statement. Proposition A.2 then asks what extra structure would
+buy a tighter threshold. In this temporal credit panel the defensible version is
+cross-period or period-grade independence after the frozen recipe and allocation
+are fixed: within a period, grade, or period-grade cell, defaults and interval
+misses may remain dependent. The observed funded set is too
+exposure-concentrated for that cluster argument to tighten the headline bound,
+which is why the body keeps Markov and the supplement reports the cluster
+calculation as a sensitivity check.
+
 # Experimental Design
 
 The empirical study uses Lending Club retail-loan data covering originations
@@ -599,7 +595,7 @@ It is worth anchoring the robustness cost in dollars rather than asserting it.
 The frozen `price_of_robustness` field is defined as the non-robust baseline's
 expected return minus the robust policy's expected return, both valued under
 point PDs on the same `$1M` budget. For the champion it is `-$14,465.69`
-(`-10.56%`); the negative sign is meaningful, not a typo. Under the point-PD
+(`-10.56%`); the negative sign is meaningful. Under the point-PD
 valuation the robust policy is not more conservative on paper than the non-robust
 baseline -- it is `$14,465.69` richer -- and the realized out-of-time return then
 lands higher still. The committee-facing reading is the three-number ladder
@@ -618,21 +614,20 @@ below, all from the same frozen champion record (`tau = 0.175`, `gamma = 0.45`,
 point-PD expected return over the non-robust baseline (hence the negative price
 of robustness), and the realized OOT return exceeds the robust expectation.
 
-The point is therefore not that robustness must be paid for in lost return. On
-this evaluation the conformal robust funded set is economically competitive with
--- in fact ahead of -- the non-robust baseline while additionally carrying the
-exact alpha-safe certificate. The value of reporting the ledger is auditability:
-a reviewer sees the baseline, the expectation, and the realized return on one
-budget instead of a single net figure that hides which way the trade-off went.
+On this evaluation, robustness is not a toll paid in lost return. The conformal
+robust funded set is ahead of the non-robust baseline under point-PD valuation
+and still carries the exact alpha-safe certificate. The ledger matters because
+it shows the direction of the trade-off: baseline expectation, robust
+expectation, and realized OOT return are all reported on the same `$1M` budget.
 
-The robust-region analysis is the strongest evidence that the result is not a
-single point artifact. Across the evaluated final region, `45/45` unique
+The robust-region analysis asks whether that result depends on one lucky
+hyperparameter setting. Across the evaluated final region, `45/45` unique
 policies pass the exact $\alpha = 0.01$ check. The 45 policies come from the
 cross-product of five risk-tolerance values, three uncertainty-blend values, and
 three uncertainty-aversion settings within the frozen bound-aware family. The
 selected policy is the economic champion inside that exact robust region, not
-merely the first feasible point. The supplement reports the full alpha/gamma
-funded-set table, robust-region heatmap, and policy-family appendix.
+the first feasible point. The supplement reports the full alpha/gamma funded-set
+table, robust-region heatmap, and policy-family appendix.
 
 The funded-set audit also matters because the bound is weighted by exposure,
 not counted by loan. The promoted portfolio funds 341 positive-exposure loan
@@ -668,9 +663,10 @@ supplement expands the same structure into artifact and guardrail references.
 
 ## Multi-Dataset External Economic Replication
 
-The strongest reviewer objection after the 276,869-loan Lending Club audit is
-generalization. The table below answers that objection without changing the champion:
-the same frozen recipe is applied to two external credit products. Prosper is a
+The natural generalization question after the 276,869-loan Lending Club audit is
+whether the recipe still works outside the champion panel. The table below
+answers that question without changing the champion: the same frozen recipe is
+applied to two external credit products. Prosper is a
 marketplace personal-loan panel with final statuses and a full OOT economic
 candidate universe. Freddie FM48 is a collateralized mortgage panel, using the
 48-month red+green default window with provided train/OOS/OOT structure. Both
@@ -700,17 +696,16 @@ The external layer also surfaces a result a single-dataset champion cannot show.
 The signed price of robustness--using the same convention as the Lending Club
 field, $(\text{nonrobust}-\text{robust})/\text{nonrobust}$--is a *positive*
 premium under frozen application, and it grows with the panel default rate
-(Table @tbl-price-of-robustness, Figure @fig-price-scaling). Within Freddie, the high-default `red` segment
-pays more than `green`; across datasets, Prosper's `30.92%` default panel pays the
-largest premium. The reading is economic, not incidental: higher default risk
+(Table @tbl-price-of-robustness, Figure @fig-price-scaling). Within Freddie, the
+high-default red segment pays more than green; across datasets, Prosper's
+`30.92%` default panel pays the largest premium. The reading is economic, not incidental: higher default risk
 widens the conformal intervals, so the robust worst case discounts more return,
 and discrimination (AUC) does not order the premium on its own. On the *selected*
 Lending Club champion the signed price is favorable (`-10.56%`), because the
 bound-aware search found a robust funded set that also wins expected return. The
-honest summary is that robustness is never economically catastrophic in these
-frozen applications: under blind application the conformal robust layer costs a
-single-digit to low-double-digit premium, and under selection it can be
-favorable.
+measured summary is more modest: in these frozen applications, the cost of the
+conformal robust layer stays bounded. Under blind application it is a
+single-digit to low-double-digit premium; under selection it can be favorable.
 That closes the external-replication claim at the right level: the recipe
 transfers as an economic audit protocol, while the exact funded-set certificate
 remains the Lending Club object.
@@ -804,9 +799,8 @@ budgeted funded set with a certified realized return (`$170,464.54` on the
 `$1M` budget) and the three verifiable risk controls (exact $\alpha$-safe pass,
 weighted-miscoverage audit, and `45/45` robust region). The two regret-trained
 comparators optimize a loss surface but do not emit an auditable funded-set
-certificate, so "higher regret" here is the cost of carrying conformal
-uncertainty into a different, synthetic decision, not evidence that CRPTO funds
-worse loans.
+certificate. The regret comparison is therefore about the synthetic benchmark
+task, not the quality of the funded loans in the `$1M` credit portfolio.
 
 ![The regret-auditability frontier shows the paper's trade-off in one panel: SPO+ is the low-regret corner, while CRPTO robust is the auditable-risk-control corner with all three verifiable checks passing.](../reports/crpto/figures/crpto_fig15_regret_auditability_frontier.png){#fig-regret-auditability width="72%" fig-alt="Scatter plot comparing two-stage, SPO+, and CRPTO robust by mean decision regret and number of verifiable risk-control checks passed."}
 
@@ -996,11 +990,11 @@ decision that a reviewer can audit end to end. On the Lending Club out-of-time
 panel the promoted policy earns `$170,464.54` on a `$1M` budget while passing the
 exact empirical $\alpha = 0.01$ funded-set audit, and it lies inside a `45/45`
 alpha-safe robust region rather than at a single lucky point. The external
-Prosper and Freddie/Mendeley replications show the recipe is not merely a
-Lending Club numerical artifact and expose an economically interpretable
-regularity: the price of robustness is a bounded premium that scales with the
-panel's default risk. The contribution is deliberately scoped---an auditable
-post-hoc decision certificate, not a new end-to-end learner or a live-deployment
-study---and every reported number is regenerable from frozen artifacts.
+Prosper and Freddie/Mendeley replications show that the recipe travels beyond
+the Lending Club panel and expose an economically interpretable regularity: the
+price of robustness is a bounded premium that scales with the panel's default
+risk. The contribution is scoped as an auditable post-hoc decision certificate,
+not a new end-to-end learner or a live-deployment study, and every reported
+number is regenerable from frozen artifacts.
 
 # References
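
The Results hunks above rely on two small pieces of arithmetic: the signed price-of-robustness convention (non-robust minus robust, divided by non-robust) and the `45`-policy grid formed as a 5 x 3 x 3 cross-product. The sketch below illustrates both under stated assumptions. The function name, the grid values, and the dollar inputs are hypothetical and are not taken from the frozen artifacts; only the sign convention and the grid shape come from the manuscript text.

```python
from itertools import product


def signed_price_of_robustness(nonrobust_return: float,
                               robust_return: float) -> tuple[float, float]:
    """Return (absolute, relative) signed price of robustness.

    Convention described in the manuscript: non-robust expected return
    minus robust expected return, both valued under point PDs on the
    same budget; the relative figure divides by the non-robust return.
    A negative value means the robust policy is ahead.
    """
    if nonrobust_return == 0:
        raise ValueError("relative price is undefined for a zero baseline")
    absolute = nonrobust_return - robust_return
    return absolute, absolute / nonrobust_return


# Hypothetical point-PD expected returns, NOT the frozen champion values.
abs_cost, rel_cost = signed_price_of_robustness(100_000.0, 110_000.0)
print(f"{abs_cost:+,.2f} ({rel_cost:+.2%})")

# Hypothetical grid values; only the 5 x 3 x 3 shape comes from the text.
risk_tolerances = [0.125, 0.150, 0.175, 0.200, 0.225]
uncertainty_blends = [0.30, 0.45, 0.60]
uncertainty_aversions = [0.5, 1.0, 2.0]
grid = list(product(risk_tolerances, uncertainty_blends, uncertainty_aversions))
print(len(grid))
```

With these illustrative inputs the robust policy is ahead, so the signed price is negative, which is the same direction as the champion figure reported in the manuscript.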
diff --git a/paper/submission/CRPTO_ijds_submission.tex b/paper/submission/CRPTO_ijds_submission.tex
index 013a020..9e7ac78 100644
--- a/paper/submission/CRPTO_ijds_submission.tex
+++ b/paper/submission/CRPTO_ijds_submission.tex
@@ -66,26 +66,24 @@
 \ABSTRACT{%
 Credit allocation is a data-science-for-decisions problem: calibrated default
 probabilities matter only after they shape which loans are funded. We introduce
-Conformal Robust Predict-Then-Optimize (CRPTO) as a reusable post-hoc decision
-audit that maps a frozen calibrated probability-of-default artifact through
-Mondrian conformal intervals into robust portfolio constraints and an exact
-empirical funded-set audit. On a 276{,}869-loan out-of-time Lending Club
+Conformal Robust Predict-Then-Optimize (CRPTO), a post-hoc bridge that maps a
+frozen calibrated probability-of-default artifact through Mondrian conformal
+intervals into robust portfolio constraints and an empirical funded-set audit.
+On a 276{,}869-loan out-of-time Lending Club
 evaluation, the promoted economic policy earns \$170.5K on a \$1M budget while
 passing the $\alpha=0.01$ funded-set audit
 ($V(\alpha)=0.028875$, $\Gamma_{\mathrm{CP}}=0.187987$, zero violation). The final
 robust region contains $45/45$ alpha-safe policies across the evaluated risk,
 uncertainty, and aversion grid, indicating that the result is not a single-point
 artifact. External frozen replications on Prosper and Freddie/Mendeley preserve
-the conformal gates and produce positive robust LP objectives, strengthening the
-claim that the CRPTO recipe is not merely a Lending-Club numerical artifact.
+the conformal gates and produce positive robust LP objectives, so the result is
+not confined to the Lending Club panel.
 Across these external panels the price of robustness grows with the panel default
-rate (from $+1.0\%$ to $+9.5\%$), turning replication into an economically
-interpretable stress test rather than a defensive checkmark. The contribution is
-an auditable conformal-robust credit-portfolio decision certificate: it connects
-real credit data, calibrated predictive models, robust funding decisions, and
-a drift harness that certifies the prediction-to-decision certificate chain
-regenerates bit-exactly from frozen artifacts, while keeping the statistical
-guarantee boundary explicit.%
+rate (from $+1.0\%$ to $+9.5\%$). The contribution is a conformal-robust
+credit-portfolio decision certificate: it connects real credit data, calibrated
+predictive models, robust funding decisions, and a drift harness that regenerates
+the prediction-to-decision chain bit-exactly from frozen artifacts while keeping
+the statistical guarantee boundary explicit.%
 }
 
 \KEYWORDS{conformal prediction; robust optimization; predict-then-optimize;
@@ -112,20 +110,20 @@ \section{Introduction}\label{sec:intro}
 in this paper is therefore not whether one can build a slightly better credit
 classifier. It is whether finite-sample predictive uncertainty can be carried into
 a robust portfolio decision in a way that is transparent enough for a reviewer to
-audit. This framing is not merely rhetorical: in a pre-registered randomized
-trial, conformal prediction sets have been shown to measurably improve human
-decision making relative to fixed-size sets with the same coverage
-\citep{cresswell2024}, which is exactly the committee-facing setting CRPTO targets.
+audit. This has practical stakes. In a pre-registered randomized trial,
+conformal prediction sets improved human decision making relative to fixed-size
+sets with the same coverage \citep{cresswell2024}. CRPTO takes that
+committee-facing idea into a credit portfolio setting, where the uncertainty
+summary must change a funding decision or it is just another report.
 
 CRPTO answers this question with a post-hoc, reproducible pipeline. It starts from
 a calibrated CatBoost PD model, constructs Mondrian conformal intervals over
 PD-scale predictions, and maps the upper conformal endpoint into robust portfolio
-constraints. The procedure is deliberately modular: the predictive model,
-conformal layer, optimization policy, and paper artifacts each have separate
-contracts. This modularity is a feature rather than an engineering accident. It
-lets the paper ask whether a frozen prediction system can be converted into a
-defendable decision system without reopening hyperparameter search whenever the
-manuscript or appendix changes.
+constraints. The pipeline is modular by design: the predictive model, conformal
+layer, optimization policy, and paper artifacts each have separate contracts.
+That separation lets the paper ask whether a frozen prediction system can be
+converted into a defendable decision system without reopening hyperparameter
+search whenever the manuscript or appendix changes.
 
 The empirical setting is the Lending Club retail-loan panel, with an out-of-time
 evaluation set of 276{,}869 loans. The promoted economic policy earns
@@ -135,28 +133,22 @@ \section{Introduction}\label{sec:intro}
 allocation. It is a reproducible bridge from calibrated probabilistic learning to
 robust, auditable credit portfolio choice.
 
-The paper makes four contributions. First, it gives a reusable CRPTO construction
-for credit portfolios: frozen calibrated PD, Mondrian conformal uncertainty, and
-robust budgeted optimization as a post-hoc decision audit. Second, it positions
-that construction against the nearest decision literature: data-driven robust
-optimization, P2P lending portfolio models, conformal credit scoring, and
-decision-focused learning. The novelty claim is therefore specific, not
-leaderboard-oriented: CRPTO is an auditable conformal robust credit-portfolio
-decision built from frozen calibrated PD artifacts. Third, it provides an
-artifact-backed empirical study where every paper table and figure is generated
-from frozen outputs rather than manually transcribed summaries. Fourth, it adds
-external economic replications on Prosper and Freddie/Mendeley that separate the
-methodological claim from one P2P panel and reveal an economically interpretable
-pattern: under blind frozen application the price of robustness is a positive
-premium that grows with the panel default rate, while it is favorable only on the
-selected Lending Club champion. The key claim is
-deliberately surgical: CRPTO maps frozen calibrated PD artifacts into a robust
+The paper makes four contributions. First, it gives a CRPTO construction for
+credit portfolios: frozen calibrated PD, Mondrian conformal uncertainty, and
+robust budgeted optimization as a post-hoc decision audit. Second, it locates
+that construction relative to data-driven robust optimization, P2P lending
+portfolio models, conformal credit scoring, and decision-focused learning.
+Third, it provides an artifact-backed empirical study where every table and
+figure is generated from frozen outputs rather than manually transcribed
+summaries. Fourth, it adds external economic replications on Prosper and
+Freddie/Mendeley, separating the methodological claim from one P2P panel. The
+key claim is narrow: CRPTO maps frozen calibrated PD artifacts into a robust
 funded set, reports the portfolio-level conformal premium $\Gamma_{\mathrm{CP}}$,
 and verifies exact alpha-safe weighted miscoverage on the promoted Lending Club
-portfolio.
-The same conformal and LP gates remain viable on two additional credit datasets,
-while the governance boundary remains explicit: paper-facing reruns consume frozen
-artifacts, whereas protected stages would change the promoted champion.
+portfolio. The same conformal and LP gates remain viable on two additional
+credit datasets, while the governance boundary remains explicit: paper-facing
+reruns consume frozen artifacts, whereas protected stages would change the
+promoted champion.
 
 Read as data science for decisions, the paper's four components are explicit:
 the data are a static Lending Club OOT panel plus Prosper/Freddie external stress
@@ -449,15 +441,19 @@ \section{Theory}\label{sec:theory}
 not a stronger post-selection conformal theorem.
 \end{remark}
 
-This theory is intentionally not presented as a universal dependence-free tightening
-for all adaptive credit policies. The Lending Club evaluation is temporal, and
-temporal dependence is handled empirically through strict out-of-time splits, temporal
-backtesting, and robustness appendices. The online supplement adds a cluster-aware
-conditional proposition: dependence may be arbitrary inside period or grade clusters,
-but sharper Hoeffding/Bernstein-style bounds require independence or conditional
-independence across those clusters \citep{hoeffding1963,boucheron2013concentration}. Markov therefore remains the main
-distribution-free claim, while the dependence-aware material is a transparent journal
-caveat rather than an overstated theorem.
+The theorem and the two supplement propositions should be read as one small
+triptych. Theorem~\ref{thm:funded-set-bound} gives the paper's guarantee once
+weighted funded-set validity is accepted. Supplement Proposition~A.1 shows
+that, without additional structure, Markov is not a placeholder for a missing
+second-moment bound; it is the sharp first-moment statement. Supplement
+Proposition~A.2 then asks what extra structure would buy a tighter threshold.
+In this temporal credit panel the defensible version is cross-period or
+period-grade independence after the frozen recipe and allocation are fixed:
+within a period, grade, or period-grade cell, defaults and interval misses may
+remain dependent. The observed funded set is too exposure-concentrated for that
+cluster argument to tighten the headline bound, which is why the body keeps
+Markov and the supplement reports the cluster calculation as a sensitivity
+check.
 
 % =====================================================================
 \section{Experimental Design}\label{sec:design}
@@ -649,20 +645,19 @@ \section{Results}\label{sec:results}
   }%
 \end{table}
 
-The point is therefore not that robustness must be paid for in lost return. On this
-evaluation the conformal robust funded set is economically competitive with---in fact
-ahead of---the non-robust baseline while additionally carrying the exact alpha-safe
-certificate. The value of reporting the ledger is auditability: a reviewer sees the
-baseline, the expectation, and the realized return on one budget instead of a single
-net figure that hides which way the trade-off went.
-
-The robust-region analysis is the strongest evidence that the result is not a single
-point artifact. Across the evaluated final region, $45/45$ unique policies pass the
-exact $\alpha=0.01$ check. The 45 policies come from the cross-product of five
-risk-tolerance values, three uncertainty-blend values, and three
+On this evaluation, robustness is not a toll paid in lost return. The conformal
+robust funded set is ahead of the non-robust baseline under point-PD valuation and
+still carries the exact alpha-safe certificate. The ledger matters because it shows
+the direction of the trade-off: baseline expectation, robust expectation, and
+realized OOT return are all reported on the same \$1M budget.
+
+The robust-region analysis asks whether that result depends on one lucky
+hyperparameter setting. Across the evaluated final region, $45/45$ unique policies
+pass the exact $\alpha=0.01$ check. The 45 policies come from the cross-product of
+five risk-tolerance values, three uncertainty-blend values, and three
 uncertainty-aversion settings within the frozen bound-aware family. The selected
-policy is the economic champion inside that exact robust region, not merely the
-first feasible point. The supplement reports the full alpha/gamma funded-set table,
+policy is the economic champion inside that exact robust region, not the first
+feasible point. The supplement reports the full alpha/gamma funded-set table,
 robust-region heatmap, and policy-family appendix.
 
 \begin{table}[t]
@@ -705,9 +700,10 @@ \section{Results}\label{sec:results}
 
 \subsection{Multi-Dataset External Economic Replication}
 
-The strongest reviewer objection after the Lending Club audit is generalization.
-Table~\ref{tab:external-replication} answers that objection without changing the
-champion: the same frozen recipe is applied to two external credit products.
+The natural generalization question after the Lending Club audit is whether the
+recipe still works outside the champion panel. Table~\ref{tab:external-replication}
+answers that question without changing the champion: the same frozen recipe is
+applied to two external credit products.
 Both pass the conformal gates and both return positive robust LP objectives.
 
 \begin{table}[t]
@@ -739,16 +735,16 @@ \subsection{Multi-Dataset External Economic Replication}
 field, $(\mathrm{nonrobust}-\mathrm{robust})/\mathrm{nonrobust}$---is a
 \emph{positive} premium under frozen application, and it grows with the panel
 default rate (Table~\ref{tab:price-of-robustness}, Figure~\ref{fig:price-scaling}). Within Freddie, the
-high-default \texttt{red} segment pays more than \texttt{green}; across datasets,
+high-default red segment pays more than green; across datasets,
 Prosper's $30.92\%$ default panel pays the largest premium. The reading is
 economic, not incidental: higher default risk widens the conformal intervals, so
 the robust worst case discounts more return, and discrimination (AUC) does not
 order the premium on its own. On the \emph{selected} Lending Club champion the
 signed price is favorable ($-10.56\%$), because the bound-aware search found a
-robust funded set that also wins expected return. The honest summary is that
-robustness is never economically catastrophic in these frozen applications: under
-blind application the conformal robust layer costs a single-digit to
-low-double-digit premium, and under selection it can be favorable.
+robust funded set that also wins expected return. The measured summary is
+modest: in these frozen applications, the cost of the conformal robust layer
+is bounded. Under blind application it is a single-digit to low-double-digit
+premium; under selection it can be favorable.
 
 \begin{table}[t]
   \centering
@@ -851,9 +847,9 @@ \subsection{Regret-Auditability Frontier}\label{sec:regret}
 measurements (a real \$1M funded set versus a synthetic regret benchmark). The
 right-hand columns report what the credit decision actually delivers: only CRPTO
 produces a budgeted funded set with a certified realized return (\$170{,}464.54 on
-the \$1M budget) and the three verifiable risk controls. ``Higher regret'' is
-therefore the cost of carrying conformal uncertainty into a different, synthetic
-decision, not evidence that CRPTO funds worse loans.
+the \$1M budget) and the three verifiable risk controls. The regret comparison is
+therefore about the synthetic benchmark task, not the quality of the funded loans
+in the \$1M credit portfolio.
 
 \begin{figure}[t]
   \centering
@@ -991,12 +987,12 @@ \section{Conclusion}\label{sec:conclusion}
 promoted policy earns \$170{,}464.54 on a \$1M budget while passing the exact
 empirical $\alpha=0.01$ funded-set audit, and it lies inside a $45/45$
 alpha-safe robust region rather than at a single lucky point. The external
-Prosper and Freddie/Mendeley replications show the recipe is not merely a Lending
-Club numerical artifact and expose an economically interpretable regularity: the
-price of robustness is a bounded premium that scales with the panel's default
-risk. The contribution is deliberately scoped---an auditable post-hoc decision
-certificate, not a new end-to-end learner or a live-deployment study---and every
-reported number is regenerable from frozen artifacts.
+Prosper and Freddie/Mendeley replications show that the recipe travels beyond the
+Lending Club panel and expose an economically interpretable regularity: the price
+of robustness is a bounded premium that scales with the panel's default risk. The
+contribution is scoped as an auditable post-hoc decision certificate, not a new
+end-to-end learner or a live-deployment study, and every reported number is
+regenerable from frozen artifacts.
 
 % Reproducibility/companion disclosure is kept for the cover letter / non-anonymous
 % version, not the double-anonymous body.
diff --git a/paper/supplement_ijds.qmd b/paper/supplement_ijds.qmd
index 818a7ed..51d1025 100644
--- a/paper/supplement_ijds.qmd
+++ b/paper/supplement_ijds.qmd
@@ -140,19 +140,14 @@ and the choice $t = \sqrt{\alpha}$ gives the body statement:
 miscoverage budget $\sqrt{\alpha}$ exceeded with probability at most
 $\sqrt{\alpha}$. $\blacksquare$
 
-Two boundaries are worth restating. First, Assumption 1 is where the
-adaptive-selection risk lives: Markov is applied to a quantity whose mean
-control is assumed under the funded-set weights, not derived from marginal
-split conformal alone; the empirical exact audit is the after-the-fact check
-of that assumption on the frozen selection. Second, the bound is
-deliberately first-moment only; the cluster-aware tightenings below sharpen
-it strictly under additional cross-cluster independence assumptions.
-
-Hoeffding/Bernstein-style tightenings are deliberately secondary. They are
-reported only under additional conditional-independence assumptions because the
-Lending Club evaluation is temporal and the funded set shares calibration and
-selection history. The paper therefore keeps Markov as the main claim and uses
-the tightening appendix as sensitivity evidence, not as a stronger theorem.
+These steps leave a clean split. Theorem 1 proves the operational bound under
+Assumption 1. Proposition A.1 below shows that Markov is the sharp
+first-moment statement when no additional structure is asserted. Proposition
+A.2 asks the next natural question: if the funded set is grouped by period,
+grade, or period-grade, and dependence is allowed inside each group, what would
+independence across groups buy? That cross-cluster assumption enters only
+as an additional sensitivity condition, not as a theorem premise quietly added
+to the body.
 
 The phrase "exact funded-set certificate" has a narrow meaning throughout the
 paper. It is an exact accounting audit on the frozen out-of-time funded set:
@@ -173,66 +168,7 @@ transcribed table.
 
 : Certificate taxonomy used in the paper.
 
-## Cluster-Aware Conditional Tightening
-
-Let clusters $g = 1,\ldots,G$ represent period, grade, or period-grade cells,
-and define
-
-$$
-Z_g(\alpha)=\sum_{i\in g} w_i\mathbf{1}\{Y_i>u_i(\alpha)\},\qquad
-W_g=\sum_{i\in g}w_i .
-$$
-
-Within each cluster, defaults and conformal misses may be arbitrarily
-dependent. The useful structure, if one is willing to assert it, is
-cross-cluster independence after conditioning on the calibration sample and the
-fixed funded allocation.
-
-**Proposition A.2 (cluster-aware Hoeffding under cross-cluster independence).**
-Let $\mathcal F$ contain the calibration sample, the frozen conformal recipe,
-the declared cluster partition, and the selected funded allocation. Suppose
-that, conditional on $\mathcal F$, the cluster aggregates
-$Z_1(\alpha),\ldots,Z_G(\alpha)$ are independent, satisfy
-$0\le Z_g(\alpha)\le W_g$, and obey conditional weighted validity
-$\sum_g E[Z_g(\alpha)\mid\mathcal F]\le\alpha$ (for example, it is sufficient
-that $E[Z_g(\alpha)\mid\mathcal F]\le\alpha W_g$ for every cluster). Then, for
-every $\delta\in(0,1)$,
-
-$$
-P\!\left(
-  V(\alpha)\ge
-  \alpha + \sqrt{\frac{1}{2}\left(\sum_g W_g^2\right)\log\frac{1}{\delta}}
-  \;\middle|\;\mathcal F
-\right)\le\delta .
-$$
-
-*Proof.* Let $\mu=\sum_g E[Z_g(\alpha)\mid\mathcal F]\le\alpha$ and
-$S_2=\sum_g W_g^2$. Hoeffding's inequality for independent bounded summands
-gives
-
-$$
-P\{V(\alpha)-\mu\ge s\mid\mathcal F\}\le \exp(-2s^2/S_2).
-$$
-
-Taking $s=\sqrt{S_2\log(1/\delta)/2}$ and using $\mu\le\alpha$ gives the
-displayed bound. Integrating over $\mathcal F$ gives the same unconditional
-statement. $\blacksquare$
-
-Proposition A.2 is therefore the natural complement to Proposition A.1. Under
-Assumption 1 alone, A.1 shows why Markov is the sharp distribution-free claim;
-under an explicit cross-cluster structure, A.2 shows exactly when a
-Hoeffding-style tightening becomes available [@hoeffding1963;
-@boucheron2013concentration]. At the paper level $\alpha=0.01$ with matched
-tail probability $\delta=\sqrt{\alpha}=0.10$, the cluster-aware threshold is
-tighter than Markov only if $\sum_g W_g^2<0.0070$. The frozen funded set is much
-more concentrated: period, grade, and period-grade partitions have
-$\sum_g W_g^2=0.2407$, $0.3572$, and $0.0914$, respectively, so the corresponding
-thresholds are `0.5365`, `0.6512`, and `0.3344`, all looser than Markov's
-`0.1000`. This proposition does not replace the main theorem; it names the
-extra structure a reviewer would have to accept and makes the empirical
-concentration cost transparent in A21.
-
-### How much does the distribution-free bound leave on the table?
+## Sharpness of the Distribution-Free Bound
 
 To quantify the cost of staying distribution-free, the table below contrasts the
 Markov threshold used in the main theorem with concentration tightenings computed
@@ -290,29 +226,88 @@ asserts. Every sharper row in the menu prices a specific additional assumption
 (loan independence, conditional variance, or a martingale protocol), and the
 empirical `V = 0.028875` clears all of them anyway.
 
-Bennett is the closest match to the finite funded-set calculation because it was
-designed for independent, non-identically distributed summands using only the
-variance of the sum and component bounds. Freedman is included only as the
+Bennett is the closest match to the finite funded-set calculation because it
+was designed for independent, non-identically distributed summands using only
+the variance of the sum and component bounds. Freedman is included only as the
 martingale analogue of Bernstein: it would become relevant under a pre-declared
-sequential validation protocol with bounded increments and a conditional-variance
-process, which is stronger than the frozen replay used here.
-
-Two honest readings follow. First, the tightenings are real: a second-moment or
-one-sided variance argument can cut the worst-case threshold by roughly
-15--35\% at `alpha = 0.01`. Second, they are not free: they require loan-level
-independence, a sealed martingale protocol, or variance control stronger than
-weighted funded-set validity alone. We therefore keep Markov as the stated
-guarantee and report these tables only as sensitivity bounds -- they show what
-sharper concentration *would* deliver under assumptions we decline to assert, not
-a tighter claim about the promoted policy. Chebyshev is omitted because
-one-sided Cantelli dominates it for this event; Azuma is omitted because it
-duplicates Hoeffding numerically while adding a sequential protocol assumption;
-Chernoff is omitted because its sharp threshold requires individual miss
-probabilities bounded by `alpha`; and a naive union-Markov correction over the 45
-final robust-region policies is vacuous at the paper alphas. The tables are
-regenerated by `scripts/build_concentration_bound_table.py` and
+sequential validation protocol with bounded increments and a
+conditional-variance process, which is stronger than the frozen replay used
+here.
+
+The table's role is assumption pricing. The sharper rows show what a reviewer
+would gain by accepting independence, variance, or martingale structure; they
+are not promoted, because the body theorem assumes only Assumption 1. Chebyshev
+is omitted because one-sided Cantelli dominates it for this event; Azuma is
+omitted because it duplicates Hoeffding numerically while adding a sequential
+protocol assumption; Chernoff is omitted because its sharp threshold requires
+individual miss probabilities bounded by `alpha`; and a naive union-Markov
+correction over the 45 final robust-region policies is vacuous at the paper
+alphas. The tables are regenerated by
+`scripts/build_concentration_bound_table.py` and
 `scripts/build_bound_tightening_audit.py` from frozen funded-set weights.
 
+## Cluster-Aware Conditional Tightening
+
+Let clusters $g = 1,\ldots,G$ represent period, grade, or period-grade cells,
+and define
+
+$$
+Z_g(\alpha)=\sum_{i\in g} w_i\mathbf{1}\{Y_i>u_i(\alpha)\},\qquad
+W_g=\sum_{i\in g}w_i .
+$$
+
+Within each cluster, defaults and conformal misses may be arbitrarily
+dependent. The useful structure, if one is willing to assert it, is
+cross-cluster independence after conditioning on the calibration sample and the
+fixed funded allocation. Among the three reported partitions, period-grade is
+the most defensible compromise for a temporal credit panel: it separates
+calendar cohorts while conditioning on risk grade. Period alone ignores grade
+mix, and grade alone cuts across calendar dependence.
+
+**Proposition A.2 (cluster-aware Hoeffding under cross-cluster independence).**
+Let $\mathcal F$ contain the calibration sample, the frozen conformal recipe,
+the declared cluster partition, and the selected funded allocation. Suppose
+that, conditional on $\mathcal F$, the cluster aggregates
+$Z_1(\alpha),\ldots,Z_G(\alpha)$ are independent, satisfy
+$0\le Z_g(\alpha)\le W_g$, and obey conditional weighted validity
+$\sum_g E[Z_g(\alpha)\mid\mathcal F]\le\alpha$ (for example, it is sufficient
+that $E[Z_g(\alpha)\mid\mathcal F]\le\alpha W_g$ for every cluster). Then, for
+every $\delta\in(0,1)$,
+
+$$
+P\!\left(
+  V(\alpha)\ge
+  \alpha + \sqrt{\frac{1}{2}\left(\sum_g W_g^2\right)\log\frac{1}{\delta}}
+  \;\middle|\;\mathcal F
+\right)\le\delta .
+$$
+
+*Proof.* Let $\mu=\sum_g E[Z_g(\alpha)\mid\mathcal F]\le\alpha$ and
+$S_2=\sum_g W_g^2$. Hoeffding's inequality for independent bounded summands
+gives
+
+$$
+P\{V(\alpha)-\mu\ge s\mid\mathcal F\}\le \exp(-2s^2/S_2).
+$$
+
+Taking $s=\sqrt{S_2\log(1/\delta)/2}$ and using $\mu\le\alpha$ gives the
+displayed bound. Integrating over $\mathcal F$ gives the same unconditional
+statement. $\blacksquare$
+
+Proposition A.2 is therefore the natural complement to Proposition A.1. Under
+Assumption 1 alone, A.1 shows why Markov is the sharp distribution-free claim;
+under an explicit cross-cluster structure, A.2 shows exactly when a
+Hoeffding-style tightening becomes available [@hoeffding1963;
+@boucheron2013concentration]. At the paper level $\alpha=0.01$ with matched
+tail probability $\delta=\sqrt{\alpha}=0.10$, the cluster-aware threshold is
+tighter than Markov only if $\sum_g W_g^2<0.0070$. The frozen funded set is much
+more concentrated: period, grade, and period-grade partitions have
+$\sum_g W_g^2=0.2407$, $0.3572$, and $0.0914$, respectively, so the corresponding
+thresholds are `0.5365`, `0.6512`, and `0.3344`, all looser than Markov's
+`0.1000`. This proposition does not replace the main theorem; it names the
+extra structure a reviewer would have to accept and makes the empirical
+concentration cost transparent in A21.
+
 # Appendix B: P1 Evidence
 
 The P1 package strengthens the frozen champion without reopening search. Tables
@@ -476,17 +471,17 @@ The source CSV is
 ![A34 price-of-robustness scaling: under frozen application the premium is positive and increases with the panel default rate, while the selected Lending Club champion (`-10.56%`) is a favorable reference below zero.](../reports/crpto/figures/crpto_fig25_price_of_robustness_scaling.png){#fig-supp-price-scaling width="82%" fig-alt="Line chart on a log-scale x-axis showing the price of robustness rising from +1.00 percent to +9.46 percent as the panel default rate increases, with Lending Club at -10.56 percent as a reference line."}
 
 Two readings matter. First, the premium tracks irreducible default risk, not
-discrimination: the `green` and `red` Freddie segments have nearly identical AUC
-but different premiums, while their default rates differ by roughly a factor of
-five. Higher default risk widens the conformal intervals, so the robust worst
-case discounts more economic return. Second, the *selected* Lending Club champion
+discrimination: the green and red Freddie segments have nearly identical AUC but
+different premiums, while their default rates differ by roughly a factor of five.
+Higher default risk widens the conformal intervals, so the robust worst case
+discounts more economic return. Second, the *selected* Lending Club champion
 has a favorable signed price (`-10.56%`): bound-aware search located a robust
 funded set that also wins expected return. Reporting both--a bounded positive
 premium under blind application and a favorable value under selection--is more
 defensible than claiming robustness is uniformly free or uniformly costly. The
-headline is that robustness is never economically catastrophic in these frozen
-applications: the conformal robust layer costs at most a low-double-digit
-premium, and CRPTO measures which regime a given panel is in.
+measured headline is narrower: in these frozen applications, the conformal
+robust layer costs at most a low-double-digit premium, and CRPTO measures which
+regime a given panel is in.
 
 ## Reviewer Claim Checks