diff --git a/.gitignore b/.gitignore
index b22c8944..1c1dd2a3 100644
--- a/.gitignore
+++ b/.gitignore
@@ -219,3 +219,6 @@ uv.lock
 .dev.vars*
 !.dev.vars.example
 !.env.example
+
+# macOS
+.DS_Store
diff --git a/CLAUDE.md b/CLAUDE.md
index 50232314..1a5473d8 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -67,11 +67,16 @@ src/lightcone/              # namespace — NO __init__.py
     ├── harness.py, sandbox.py, graders.py, build.py, report.py, models.py
 
 claude/lightcone/           # Claude plugin source — force-included into the wheel
-├── skills/                 # lc-new, lc-build, lc-verify, lc-migrate, lc-feedback
+├── skills/                 # lc-new, lc-from-code, lc-from-paper,
+│                            # lc-feedback, ralph;
+│                            # paper-reproduction bundle: lc-from-paper (entry),
+│                            # ralph (loop substrate), narrative,
+│                            # paper-extraction, figure-comparison,
+│                            # check-sentence-by-sentence
+│                            # (see skills/README.md for the full bundle map)
 ├── agents/                 # lc-extractor
-├── guides/                 # astra-reference, lightcone-cli-reference, ui-brand
 ├── templates/              # Project CLAUDE.md template
-└── scripts/                # Session hooks (bash): venv activation, validate-on-save, status display
+└── scripts/                # Session hooks (bash): venv activation, validate-on-save, session-start primer
 
 tests/                      # pytest — mirrors src/ structure
 pyproject.toml              # hatchling + hatch-vcs, ASTRA + Snakemake as deps
diff --git a/README.md b/README.md
index b4c0db3d..7145b1aa 100644
--- a/README.md
+++ b/README.md
@@ -18,10 +18,12 @@ cd my-analysis
 claude
 ```
 
-Then tell the agent `/lc-new` to scope your research question. After the spec exists, just tell the agent to build it — implementation is a normal Claude Code workflow guided by `.claude/guides/`.
+Then tell the agent what you have to start from — a research question (`/lc-new`), existing code (`/lc-from-code`), or a paper to reproduce (`/lc-from-paper`). After the spec exists, work with the agent however suits you; the substrate (`astra.yaml`, `lc run`, `lc status`, `lc verify`) keeps things in sync.
 
 ## Skills
 
+The entry-point skills parallel what you start from: a question (`/lc-new`), code (`/lc-from-code`), or a paper (`/lc-from-paper`).
+
 ### `/lc-new` — Scope and specify an analysis
 
 Guides you from a research question to a complete `astra.yaml` specification through interactive conversation. The agent will:
@@ -34,17 +36,21 @@ Guides you from a research question to a complete `astra.yaml` specification thr
 
 You don't write any code or YAML during this phase — the agent produces the full specification.
 
-### `/lc-migrate` — Bring an existing project into ASTRA
+### `/lc-from-code` — Bring an existing project into ASTRA
 
 Scans an existing codebase, drafts an `astra.yaml` that captures its inputs, outputs, and analytical decisions, parameterizes the code so decisions can vary across universes, and runs the analysis through `lc` until every output materializes. Existing logic is left intact — changes are confined to parameter plumbing.
 
+### `/lc-from-paper` — Reproduce a published paper
+
+ORIENT-first driver for reproducing a published paper in ASTRA. ORIENT runs in the user's main session in seven stages: it asks for the paper, runs `/paper-extraction` inline to acquire it, interviews the user (grounded in the paper), clones the reference code and runs `/lc-from-code` scan-only (if a repo exists), optionally follows up, then drafts a per-paper `constitution.md` (the ralph loop's driving document) + `CLAUDE.md` (auto-loading rules + accumulators) from the full paper-plus-code context for user review. The rest of the reproduction then hands off to a **ralph loop** whose iterations carry the long middle: ARCHITECT → SPECIFY → LITERATURE → IMPLEMENT → RUN → COMPARE. Each iteration runs in a fresh tmux session against the constitution; the fresh-context property between iterations is what makes per-phase review work. When the loop closes (constitution `status: closed` after COMPARE returns `pass`), REVIEW runs back in the user's main session. The skill composes a bundle of sibling skills (`ralph`, `paper-extraction`, `narrative`, `figure-comparison`, `check-sentence-by-sentence`). See [`claude/lightcone/skills/README.md`](claude/lightcone/skills/README.md) for the full bundle map.
+
 ### `/lc-feedback` — Report a bug
 
 Files a GitHub issue against the right repo (ASTRA or lightcone-cli) with version info and error context auto-collected from your session.
 
 ### Building and verifying
 
-There is no `/lc-build` or `/lc-verify` skill — building and verifying are part of the normal Claude Code workflow once `astra.yaml` exists. The agent reads `.claude/guides/lightcone-cli-reference.md` (workflow, commands, status meanings) and `.claude/guides/astra-reference.md` (spec syntax) and drives the build directly: write scripts under `src/`, run `lc run`, watch `lc status` until every output is `ok`, then `astra validate astra.yaml` and `lc verify` to confirm the spec is valid and the provenance chain is intact.
+Once `astra.yaml` exists, you (or the agent) build it however suits you. The typical flow is `lc run` to materialize outputs, `lc status` to track progress, `astra validate astra.yaml` for spec validity, and `lc verify` for provenance integrity. Whether the build is agent-driven, ralph-looped, or hand-written, the `lc` substrate stays in sync.
 
 ## CLI Reference
 
@@ -52,8 +58,6 @@ There is no `/lc-build` or `/lc-verify` skill — building and verifying are par
 
 The first `lc` invocation auto-creates `~/.lightcone/config.yaml` with `container.runtime: auto`. To pin a runtime or change other settings, edit the file directly.
 
-**Extraction model:** Literature extraction subagents default to Sonnet. To change this, set `extraction_model:` in `~/.lightcone/config.yaml` (options: `sonnet`, `haiku`, or omit for inherit).
-
 ### Project scaffolding
 
 ```bash
diff --git a/claude/lightcone/scripts/session-start.sh b/claude/lightcone/scripts/session-start.sh
index 23b812ec..121f7c9f 100755
--- a/claude/lightcone/scripts/session-start.sh
+++ b/claude/lightcone/scripts/session-start.sh
@@ -1,11 +1,12 @@
 #!/bin/bash
 # SessionStart hook: surface a terse project status to the agent.
 #
-# Reports validation status, materialization counts, and pointers to the
-# canonical reference docs. Project name / decision count / universe count
-# are intentionally omitted -- they are trivia the agent reads from
-# astra.yaml and CLAUDE.md when needed, and they cost against the 10k
-# additionalContext budget.
+# Reports validation status, materialization counts, and a tight CLI
+# primer so the agent knows what substrate commands exist and which
+# reference skills carry the depth. Project name / decision count /
+# universe count are intentionally omitted -- they are trivia the agent
+# reads from astra.yaml and CLAUDE.md when needed, and they cost against
+# the 10k additionalContext budget.
 
 input=$(cat)
 cwd=$(echo "$input" | jq -r '.cwd // empty')
@@ -48,7 +49,13 @@ fi
 summary="$summary
 Materialization: ok=$ok_count stale=$stale_count missing=$missing_count alias=$alias_count
 
-References: .claude/guides/astra-reference.md (spec) and .claude/guides/lightcone-cli-reference.md (CLI)."
+Substrate CLIs (use --help on any):
+  lc init / lc run / lc status / lc verify / lc build / lc export wrroc
+  astra validate / astra paper add / astra universe generate
+
+Reference skills (invoke when the surface above isn't enough):
+  /astra   — astra.yaml spec: decisions, prior_insights, findings, evidence, sub-analyses, narrative anchors
+  /lc-cli  — lc workflow: spec-code invariant, status interpretation, failure diagnosis"
 
 if [ "$validation_ok" -ne 0 ]; then
     # tail rather than head -- the leading lines are success markers
diff --git a/claude/lightcone/skills/README.md b/claude/lightcone/skills/README.md
new file mode 100644
index 00000000..6d3a238d
--- /dev/null
+++ b/claude/lightcone/skills/README.md
@@ -0,0 +1,43 @@
+# lightcone-cli skills
+
+Each subdirectory is one Claude Code skill: `SKILL.md` plus optional `references/`, `assets/`, and `scripts/`. `lc init` copies these into a project's `.claude/skills/` so they are discoverable to Claude Code sessions.
+
+## Project lifecycle skills
+
+| Skill | Role |
+|---|---|
+| `lc-new` | Scaffold a new ASTRA-shaped project from a research question. |
+| `lc-from-code` | Bring an existing codebase into ASTRA — scan, spec, parameterize. |
+| `lc-from-paper` | Reproduce a published paper in ASTRA (paper-reproduction bundle entry point — see below). |
+| `lc-feedback` | Report bugs and feature requests upstream. |
+| `ralph` | Author a constitution and run a ralph loop against it (authoring + launching + iterating in one skill). `lc-from-paper` uses this for the long middle of a reproduction; standalone for any other long-running work. |
+
+## Reference skills
+
+Not direct entry points — Claude invokes these (or other skills invoke them) to load reference content into the session. The session-start hook primes their names so they're discoverable from turn one.
+
+| Skill | Role |
+|---|---|
+| `astra` | Reference for the `astra.yaml` spec: structure, decisions, options, prior insights, findings, evidence, sub-analyses, narrative anchors, composition mechanics. |
+| `lc-cli` | Reference for `lc` workflow: commands, the Spec-Code Invariant, status interpretation, failure diagnosis, multiverse runs, WRROC export. |
+
+## Paper-reproduction bundle
+
+A self-contained toolkit for reproducing published papers in ASTRA. The bundle is co-located so a single `lc init` brings the full toolkit into a project — no plugin marketplace, no separate installs.
+
+| Skill | Role |
+|---|---|
+| [`lc-from-paper`](lc-from-paper/SKILL.md) | **Reproduction driver.** ORIENT-first; one pre-loop phase in the user's main session that asks for the paper, runs `/paper-extraction` inline, interviews the user (grounded in the paper), clones the reference code and runs `/lc-from-code` scan-only (when a repo exists), and drafts the per-paper `constitution.md` + `CLAUDE.md`. Then hands off to a ralph loop whose iterations carry the long middle: ARCHITECT → SPECIFY → LITERATURE → IMPLEMENT → RUN → COMPARE. When the loop closes (constitution `status: closed` after COMPARE returns `pass`), REVIEW runs back in the user's main session. Fidelity intent — captured as prose at ORIENT — is what every iteration reads when sizing its next move, and what COMPARE grades opportunities against. |
+| [`ralph`](ralph/SKILL.md) | The loop substrate. `lc-from-paper`'s ORIENT invokes `/ralph`'s Authoring mode to draft the per-paper constitution; the loop launcher hands off after ORIENT lands. Each iteration runs `/ralph`'s Loop protocol against the constitution. |
+| [`narrative`](narrative/SKILL.md) | Author the `narrative:` prose and decision `rationale:` in `astra.yaml`. Invoked by `lc-from-paper`'s ARCHITECT (for the structural narrative) and SPECIFY (for anchored content narrative). |
+| [`paper-extraction`](paper-extraction/SKILL.md) | Turn an arXiv ID or DOI into a standardized `work/reference/` directory: structural index (figures, tables, outline, citations with resolved DOIs) plus a stub `astra.yaml` for the paper. Primary acquisition path for `lc-from-paper`'s ORIENT (Stage 2); also invoked per cited paper by LITERATURE. |
+| [`check-sentence-by-sentence`](check-sentence-by-sentence/SKILL.md) | Audit paper claims against code locations (`file:line` or `NOT FOUND`). Suggested at `lc-from-paper`'s REVIEW close-out (opt-in); invoked only by the user. |
+| [`figure-comparison`](figure-comparison/SKILL.md) | Build a self-contained HTML side-by-side: original figures/tables/numerics vs replicated. Invoked from `lc-from-paper`'s REVIEW close-out (mandatory); also user-invokable directly. |
+
+The full reproduction story spans these skills. `lc-from-paper`'s `SKILL.md` names each by role and tells the agent when to invoke them; the siblings stand alone and don't know about `lc-from-paper`.
+
+### Why bundle (not depend on plugin install)
+
+- **Testability.** We want to verify `lc-from-paper` invokes its sibling skills correctly. That only works when all are in the same checkout.
+- **Single install path.** `lc init` brings the full toolkit. Adding a separate plugin-marketplace step is friction we don't need.
+- **Future consolidation is open.** The long-run shape may be that `astra` ships its skills in `astra` and `lc` ships its skills in `lightcone-cli`, plus a centralized external-skills list. Today: bundle it all. See [[lightcone/skills-location-policy]].
diff --git a/claude/lightcone/guides/astra-reference.md b/claude/lightcone/skills/astra/SKILL.md
similarity index 92%
rename from claude/lightcone/guides/astra-reference.md
rename to claude/lightcone/skills/astra/SKILL.md
index 1c4ffec7..e46a044e 100644
--- a/claude/lightcone/guides/astra-reference.md
+++ b/claude/lightcone/skills/astra/SKILL.md
@@ -1,3 +1,17 @@
+---
+name: astra
+description: >
+  Comprehensive reference for the `astra.yaml` specification — top-level
+  structure, sub-analyses, inputs/outputs, decisions and options, prior
+  insights and findings, evidence and quote verification, narrative
+  anchors, and composition mechanics. Invoke whenever reading, writing,
+  validating, or debugging an `astra.yaml` spec; whenever working with
+  decisions, options, prior_insights, findings, or evidence; or whenever
+  the user asks about ASTRA schema, spec syntax, or sub-analysis
+  composition.
+allowed-tools: Read, Glob, Grep, Bash(astra:*)
+---
+
 # ASTRA Reference
 
 ## What an ASTRA Analysis Is
@@ -100,6 +114,17 @@ A decision is a methodological choice where a different defensible option could
 
 Decisions may carry an optional `tags:` list for grouping (e.g. `[preprocessing]`, `[physics]`, `[stats]`). Keep the tag vocabulary **small and consolidated** -- reuse existing tags rather than minting new ones, since tags are mostly useful for cross-cutting views over a shared decision space, and that view fragments quickly when every decision invents its own label.
 
+### Options
+
+Each decision must have at least one option. Options are `key: { ... }` entries:
+
+- `label:` (required) -- short human-readable name for compact rendering.
+- `description:` (optional) -- longer prose explaining what the option means.
+- `insights:` (optional) -- list of `prior_insights:` IDs that justify this option; back-references the supporting evidence (see [Prior Insights and Findings](#prior-insights-and-findings)).
+- `excluded:` + `excluded_reason:` -- option considered but rejected. See [Constraints](#constraints).
+
+`label:` and `options:` are required on the decision itself. An aliased decision (one that points at another via `from: ../decisions.foo` -- see [Composition Mechanics](#composition-mechanics)) inherits both from its source and doesn't redeclare them.
+
 ### Parameterization
 
 **Every decision must be parameterized in code** -- never hardcode a decision value. The recipe's `command:` template references it via `{decisions.<id>}` (see [Command Template Substitution](#command-template-substitution)).
@@ -173,7 +198,7 @@ Two kinds of insight, distinguished by direction:
 - **Prior insights** (`prior_insights:`) — knowledge from outside the analysis that informs decisions. From literature (by DOI) or artifacts from a prior/parent analysis.
 - **Findings** (`findings:`) — conclusions from the analysis itself, backed by its own output artifacts.
 
-Both use the same Insight model: `id`, `label` (optional), `claim`, `created_at`, `evidence`, plus optional `derived` (true if synthesized/inferred from multiple sources), `scope` (applicability conditions), `tags`, `notes`. Placement determines direction.
+Both use the same Insight model. Required: `id`, `claim`, `created_at` (ISO 8601 datetime — e.g. `"2025-02-01T14:00:00"`), `evidence`. Optional: `label`, `derived` (true if synthesized/inferred from multiple sources), `scope` (applicability conditions), `tags`, `notes`. Placement determines direction.
 
 Each evidence item has its own fields: `id`, exactly one of `doi` (literature) or `artifact` (output ID), and either a `quote` (TextQuoteSelector with required `exact`, optional `prefix`/`suffix`) or `location` (FragmentSelector with `value` like `"page=6"` and/or 1-indexed `page`). DOI evidence may add `version` (arXiv version). Artifact evidence may add `snapshot` (path to an immutable artifact copy) and `source_commit` (git commit that produced it).
 
diff --git a/claude/lightcone/skills/check-sentence-by-sentence/SKILL.md b/claude/lightcone/skills/check-sentence-by-sentence/SKILL.md
new file mode 100644
index 00000000..06c0211a
--- /dev/null
+++ b/claude/lightcone/skills/check-sentence-by-sentence/SKILL.md
@@ -0,0 +1,369 @@
+---
+name: check-sentence-by-sentence
+description: >
+  Sentence-by-sentence audit of a paper against an ASTRA project's code. For
+  every claim about implementation or results in the methodology, results,
+  discussion, and appendices, locate the corresponding code (file:line) or
+  mark NOT FOUND. Only the user can invoke this skill; other skills may suggest it as an optional follow-up during paper reproduction but must not invoke it themselves. Run from the project folder containing astra.yaml. In lc-from-paper projects, read paper sources from
+  work/reference/: prefer arXiv TeX under work/reference/source/, fall back to
+  Docling/Pandoc markdown at work/reference/document.md.
+argument-hint: "[path to paper source, e.g. work/reference/source/main.tex or work/reference/document.md]"
+---
+
+# /check-sentence-by-sentence
+
+Audit a paper against the code in this ASTRA project, sentence by sentence.
+Every sentence that asserts an implementation detail or a numerical/empirical
+result is located in the code (`file:line`) or marked NOT FOUND. The agent
+does NOT run any code -- this is a static reading audit.
+
+In lc-from-paper projects, the paper substrate comes from `work/reference/`.
+Path A is arXiv source at `work/reference/source/`; Path B is the parsed
+markdown fallback at `work/reference/document.md`, produced by Docling or
+Pandoc.
+
+## Setup
+
+1. **Confirm project root.** Read `astra.yaml` in the current working
+   directory. If it is missing, ask the user:
+
+   > "I do not see an `astra.yaml` in the current directory. Please point me
+   > to the ASTRA project folder, or `cd` there and re-invoke."
+
+   Stop until resolved.
+
+2. **Confirm paper source.** The user may have passed a path as an
+   argument. Resolve it in this order:
+
+   1. If the argument is a `.tex` file, use it in `tex` mode.
+   2. If the argument is `work/reference/` or another directory, first look
+      for TeX source under `<dir>/source/`, then for `<dir>/document.md`.
+   3. If no argument was supplied, prefer the lc-from-paper layout:
+      - `work/reference/source/<main>.tex` if TeX source exists. Identify the
+        main file with `grep -l '\\documentclass' work/reference/source/*.tex`;
+        if exactly one file matches, use it. If multiple files match, ask the
+        user which one is the main paper file. After identifying the main
+        file, expand its local `\input{...}` and `\include{...}` files before
+        section enumeration; many arXiv papers keep most prose outside the
+        main TeX wrapper.
+      - `work/reference/document.md` if there is no TeX source. This is the
+        Docling/Pandoc fallback and should be audited in `markdown` mode.
+   4. Only after those lc-from-paper paths fail, look for an obvious legacy
+      `.tex` source in cwd: a top-level `*.tex`, or one inside `paper/`,
+      `tex/`, or a similarly named subdirectory. If exactly one obvious
+      candidate is found, use it in `tex` mode.
+
+   If no usable source is found, ask:
+
+   > "Which paper source should I audit? Please give me a `.tex` path or
+   > `work/reference/document.md`."
+
+   If only `work/reference/paper.pdf` exists, ask the user to run the PARSE
+   phase first so `work/reference/document.md` exists. Do not audit PDFs
+   directly.
+
+## Section enumeration
+
+This is **your job in the main agent** -- do it carefully so each subagent
+gets a precise line range. Do NOT read full section content; only enough to
+identify boundaries.
+
+1. Enumerate sections according to source mode:
+   - In `tex` mode, first build the ordered audit source list. Start with the
+     main TeX file, scan it for local `\input{...}` and `\include{...}` paths,
+     normalize missing `.tex` suffixes, and include those files when they
+     exist under the same source tree. Recurse one level deeper when an
+     included file itself includes local TeX files. Ignore package/style
+     imports (`\usepackage`, `.sty`, `.cls`) and remote/generated files. If
+     the main file is mostly a wrapper, the leaf included files will carry
+     most audit units.
+   - For every file in the TeX audit source list, use `grep -n` for
+     `^\\section`, `^\\subsection`, and `^\\appendix`. Record each match's
+     file path, line number, and label.
+   - In `markdown` mode, use `grep -n` for markdown headings
+     (`^#`, `^##`, `^###`, etc.) in `work/reference/document.md`. Treat
+     heading depth the way TeX treats section/subsection. If Docling emitted
+     unnumbered headings, use their text labels.
+2. Get each source file's total line count with `wc -l`.
+3. Compute each section's line range: **start = the section's own line
+   number; end = (next section/subsection or same/lower heading-depth start
+   minus 1 in the same source file), or that source file's last line for the
+   final section in that file.** For a section that contains subsections,
+   each subsection's range runs from its own line to (next subsection
+   start − 1), and the section's pre-subsection prose (if any) becomes its
+   own audit unit covering (section line + 1) to (first subsection − 1) if
+   that span is non-trivial.
+4. Mark sections appearing after `\appendix` (TeX) or after an `Appendix` /
+   `Appendices` heading (markdown) as appendices regardless of label.
+
+Identify the audit-relevant sections:
+
+- Methodology (often `Methods`, `Analysis`, `Data`, `Sample selection`)
+- Results
+- Discussion (often `Discussion and Conclusions`)
+- Appendices (every section after `\appendix`, or after an `Appendix` / `Appendices` heading in markdown mode)
+
+Skip Abstract, Introduction, Acknowledgements, References, author lists.
+
+For each retained section, check whether it has subsections. **Spin up one
+subagent per leaf (sub)section** -- a section with subsections becomes one
+subagent per subsection (plus optionally one for any pre-subsection prose
+span); a section without subsections becomes one subagent for the whole
+section. Spawn them all in a single message so they run in parallel.
+
+## Subagent prompt
+
+Use `Agent(subagent_type="general-purpose", ...)`. Pass each subagent:
+
+- The absolute path to the paper source file for this section
+- The paper source mode: `tex` or `markdown`
+- The exact section/subsection label and the line range in the source file
+  it covers (so it knows where to read)
+- The absolute path to the project root (which contains `astra.yaml`)
+- The instructions below, verbatim
+
+```
+You are auditing one (sub)section of a paper against an ASTRA project's
+code. Your job is mechanical and exhaustive.
+
+INPUTS
+- Paper source file: <path>
+- Source mode: <tex|markdown>
+- Section: <name>, lines <start>-<end>
+- Project root: <path>
+
+PROCEDURE
+1. Read the assigned section of the paper. Split it into sentences using
+   common sense, not naive period-splitting. In `tex` mode, use TeX-aware
+   splitting; in `markdown` mode, preserve Docling/Pandoc math blocks,
+   captions, and headings as source text. Treat `e.g.`, `i.e.`, `et al.`,
+   `Fig.`, `Eq.`, `Sec.`, `Dr.`, decimals (`0.5`), inline math `$...$`,
+   and citation commands (`\citep{...}`, `\citet{...}`) as part of the
+   surrounding sentence, not boundaries. Display equations belong to
+   whichever sentence introduces them.
+2. For each sentence, decide using common sense: does it make a concrete
+   claim about an IMPLEMENTATION DETAIL (a method, parameter, threshold,
+   formula, data cut, model choice, sample definition, algorithmic step)
+   or a RESULTS DETAIL (a numerical value, plot, fitted parameter,
+   statistical outcome)? If neither -- pure motivation, citation prose,
+   or generic framing -- skip it.
+3. Before searching, **read `astra.yaml` once** -- it is a pre-built
+   paper↔code map maintained by the project. Harvest specifically:
+     - `narrative.methods` — links paper methodology concepts to decision
+       IDs (e.g. paper prose "the chosen <method>" → `#decisions.<id>`)
+     - `narrative.findings` — links paper claims/values to result anchors
+     - `prior_insights` (if present) — extracted paper quotes already tied
+       to decisions
+     - per-decision `evidence` quotes and `description` fields
+   Treat these as your translation table: paper prose → decision/output
+   IDs → script files. Do not re-derive what the spec already encodes.
+
+   For everything not covered by the spec, use common sense to translate
+   concepts. In general:
+     - A quality cut stated as a ratio or threshold may appear in code
+       under an inverted form or a different variable name -- map by
+       meaning, not by symbol.
+     - A named model or distribution will usually appear as a function
+       whose name describes its shape or role, not as the paper's prose
+       phrasing.
+     - A cited constant from a referenced paper will usually appear as a
+       module-level constant or as an option value in a decision.
+   Grep for the underlying concept, not just the paper's wording.
+4. For every claim-bearing sentence, search the project code (`scripts/`,
+   source files, `universes/`, `astra.yaml`, `results/`) for where the
+   claim is implemented or computed. Use Grep, Glob, and Read.
+5. Record one of:
+   - (quote, path/file.py:LINE, optional <10-word note)
+     when the sentence's claim is implemented or computed at that location
+   - (quote, NOT FOUND, optional <10-word note)
+     when no implementation or matching computation is present
+
+CONSTRAINTS
+- Do NOT run any code. No Bash beyond ls/grep/find/wc for searching.
+- Do NOT read the paper outside the assigned line range.
+- Quote the sentence verbatim, trimmed to a single sentence. If the
+  sentence is long, you may include just the claim-bearing clause but
+  preserve enough text to identify it.
+- file:line should point to the most specific line that implements or
+  states the claim (the function call, parameter assignment, or computed
+  value -- not just the file).
+- Notes must be under 10 words. Use them for nuance like "approximate
+  match", "different constant", "implemented but commented out",
+  "value computed at runtime, not statically comparable", "produced as
+  figure but printed value not stored".
+- For numerical results that the paper states as a final number, point
+  at the line that computes the value and use a note like "value
+  computed at runtime" -- you cannot verify numerical agreement without
+  executing code, and that is fine.
+
+OUTPUT
+Return a JSON-ish list, one entry per sentence, in paper order:
+
+[
+  {"quote": "...", "location": "scripts/foo.py:142", "note": "..."},
+  {"quote": "...", "location": "NOT FOUND", "note": "..."},
+  ...
+]
+
+Return nothing else.
+```
+
+## Aggregation
+
+When all subagents return, you hold one raw entry for every claim-bearing
+sentence each subagent kept. **Do not just concatenate and print them.**
+Two filtering passes run first, in this order, followed by a render pass:
+
+### Pass 1 — drop non-computational sentences
+
+Subagents are deliberately generous about what they keep, so the raw list
+contains a long tail of sentences that quote the paper but do not actually
+correspond to anything you would expect to find in code. **Drop any entry
+whose sentence is:**
+
+- **Framing / motivation** — sentences whose job is to set up the next
+  step, e.g. "the first step is...", "to investigate this...", "we want
+  to look at...", "for this reason..."
+- **Citation prose / literature comparison** — sentences that compare to
+  or quote prior literature, e.g. "agrees with values typical of previous
+  measurements...", "much like Author+YYYY they show...", "in particular,
+  Author found <value>..."
+- **Theoretical framing or derivations** — sentences asserting a property
+  expected from theory rather than implemented in code, and restatements
+  of textbook identities used only to introduce the next equation
+- **Rhetorical / interpretive claims** — qualitative readings of a
+  figure or trend, e.g. "the trend clearly has an oscillatory
+  behaviour", "the trend seems to be independent of <variable>", "this
+  supports that..."
+- **Conclusions / justifications / qualitative observations** —
+  "thus we conclude that...", "we choose not to include this
+  because...", "by and large the trends are similar"
+- **Future work / speculation** — "this could be improved by...", "the
+  discrepancy could be explained by..."
+- **Forward/backward references with no claim** — "we discuss this in
+  Sec X below", "as described in Sec Y above"
+- **NOT FOUND entries that fall in any of the above categories** — most
+  framing/motivation sentences will land as NOT FOUND because there is
+  nothing to find. Drop them silently; they are noise, not gaps.
+
+Keep an entry only if it asserts something a reader would expect to be
+implemented or computed: a parameter value, a cut, a formula, an
+algorithmic step, a fitted/measured value, a figure that the project
+should produce, a sample size after a specific cut.
+
+When in doubt about a NOT FOUND, ask: "if this sentence is not in the
+code, is that a real gap?" If no, drop it.
+
+### Pass 2 — deduplicate / merge near-duplicates
+
+Subagents do not see each other, and the same claim is often restated
+across sentences within a (sub)section -- e.g. a prose statement of a
+cut followed by a sentence asserting "this is the only cut we make", or
+two sub-equations of one larger formula that map to the same line.
+Collapse these:
+
+- If two adjacent sentences make the same claim and resolve to the same
+  `file:line`, keep one entry whose quote is the more specific or
+  formula-bearing of the two, and append the other in a short
+  parenthetical only if it adds information.
+- If a paper-text claim and an explicit equation/quoted code map to the
+  same line, prefer the equation/quoted-code form.
+- Do not merge across (sub)sections.
+- Do not merge if the two sentences resolve to different `file:line`
+  locations -- they may look similar but are doing different things.
+
+### Pass 3 — render
+
+After filtering and deduplication, present the result to the user as
+markdown, organized by section -> subsection -> sentence, in paper order:
+
+```
+# Sentence-by-sentence reproduction audit
+
+Paper: <path>
+Project: <path>
+
+## <Section>
+
+### <Subsection>            (omit if no subsections)
+
+- "<sentence quote>"
+  → ✅ `scripts/foo.py:142` -- <note if any>
+
+- "<sentence quote>"
+  → ❌ NOT FOUND -- <note if any>
+- ...
+```
+
+Use `` → ✅ `file:line` `` for found entries and `→ ❌ NOT FOUND` for
+missing ones. Notes are optional; only include the trailing `-- <note>`
+when the subagent supplied one.
+
+End with a one-line summary:
+
+> N sentences audited across M sections. K implemented, J not found.
+
+### Follow-up suggestion (conditional)
+
+After the summary, scan the NOT FOUND entries and **cluster them**. A
+cluster is a group of NOT FOUND sentences that all relate to the same
+missing piece of work (a missing analysis, a missing diagnostic, an
+unimplemented model variant) -- usually a few consecutive sentences in
+one (sub)section, or sentences that all reference the same concept across
+sections.
+
+**Only emit the follow-up block if there is at least one major
+unimplemented cluster** -- a cluster of genuine missing computation
+substantial enough to be worth offering to add (rule of thumb: ≥3
+sentences of related missing-computation claims, or a single
+heavyweight missing artifact like an entire missing analysis or
+figure). If every NOT FOUND is isolated framing, motivation, or
+qualitative interpretation -- or if the only clusters are tiny -- stop
+after the one-line summary. Do not pad with a follow-up just to have
+one.
+
+When the threshold is met, write a short follow-up block in this shape:
+
+> Major unimplemented clusters: (1) `<short description of cluster 1>`
+> (`<§section>`, ~`<N>` sentences), and (2) `<short description of
+> cluster 2>` (`<§section>`, ~`<N>` sentences). The rest of the NOT
+> FOUND entries are pure framing/motivation/qualitative interpretation,
+> not computational claims. Worth considering as a follow-up if you
+> want full coverage — want me to add `<concrete artifact 1>` and
+> `<concrete artifact 2>`?
+
+Rules for this block:
+- Only call out clusters that look like genuine missing computation, not
+  rhetoric.
+- Keep it to 1–3 clusters. Do not enumerate every NOT FOUND entry.
+- The closing offer must name **concrete artifacts** the user could add
+  (a new output ID, a new script filename, a new decision option, a new
+  figure) -- not vague promises like "fill in the gaps".
+- Cite the section reference in the project's own notation (`§2.1`,
+  `Appendix B`, etc.) and an approximate sentence count.
+- One short paragraph; do not pad.
+
+## Restrictions
+
+- You MUST NOT run project code, recipes, or `lc run`. This is static.
+- You MUST NOT read the paper source wholesale into the main context;
+  delegate to subagents.
+- You MUST NOT modify any project file. Read-only.
+- You MUST NOT fabricate `file:line` locations -- if a subagent's location
+  looks suspicious, ask it to re-verify rather than guessing.
+- You MUST spawn one subagent per leaf (sub)section, in parallel.
+
+## Anti-patterns
+
+- **Auditing intro/abstract** -- skip narrative-only sections; audit only
+  methodology, results, discussion, and appendices.
+- **Bundling sentences** -- one entry per sentence. Do not collapse
+  multiple claims into one row even if they share a citation or location.
+- **Vague locations** -- a bare filename (`scripts/foo.py`) is not
+  enough; a line number is required for found entries.
+- **Long notes** -- the 10-word cap is a hard limit; reserve notes for
+  signal, not commentary.
+- **Running code to verify** -- this skill is a reading audit. If a claim
+  cannot be verified by reading code alone, mark it found at the
+  computing line and note "value computed at runtime" rather than
+  executing anything.
diff --git a/claude/lightcone/skills/figure-comparison/SKILL.md b/claude/lightcone/skills/figure-comparison/SKILL.md
new file mode 100644
index 00000000..0c5247ce
--- /dev/null
+++ b/claude/lightcone/skills/figure-comparison/SKILL.md
@@ -0,0 +1,578 @@
+---
+name: figure-comparison
+description: >
+  Build a self-contained HTML report comparing the figures, tables, and
+  numerical results in lc-from-paper's `work/reference/` paper substrate
+  against artifacts produced under `results/<universe>/`. When
+  `comparison-report.yaml` or `targets/targets.md` exists, use that scoped
+  target set first; otherwise fall back to paper-driven inventory from arXiv
+  TeX or Docling/Pandoc artifacts under `work/reference/`. Images are
+  base64-embedded; missing matches are flagged. Use when the user says
+  "compare results", "side-by-side comparison", "build comparison HTML", or
+  "did we reproduce the paper". Run from the project folder containing
+  astra.yaml.
+argument-hint: "[path to paper reference dir, e.g. work/reference/]"
+---
+
+# /figure-comparison
+
+Generate a single self-contained HTML report (`.lightcone/comparison.html`)
+that places paper reference artifacts from `work/reference/` on the left
+and the project's reproduced artifacts from `results/<universe>/` on the
+right, with red flags wherever a counterpart is missing. Images are embedded
+as base64 so the HTML is portable. The helper script and intermediate
+manifest also live under `.lightcone/` so they don't pollute the baseline
+results.
+
+## Setup
+
+1. **Confirm project root.** Read `astra.yaml` in the cwd. If missing, ask:
+
+   > "I do not see an `astra.yaml` here. Please `cd` to the ASTRA project
+   > and re-invoke."
+
+   Stop until resolved.
+
+2. **Confirm results exist.** Default universe is `baseline`, unless
+   `comparison-report.yaml` names reproduced files under another universe or
+   the user supplied a universe explicitly. Check `ls results/<universe>/`.
+   If the directory is missing or empty, ask:
+
+   > "I cannot find populated results under `results/<universe>/`. Build the
+   > universe first (`lc run --universe <universe>` or equivalent), then
+   > re-invoke."
+
+   Stop. Do NOT attempt to run the pipeline yourself -- this skill is
+   read-only over the build artifacts.
+
+3. **Locate the paper reference substrate.** The user may have passed a
+   path. Resolve it in this order:
+
+   1. If the argument is a directory containing `metadata.json`,
+      `document.md`, `figures/`, or `tables/`, use that directory as the
+      paper reference root.
+   2. If the argument is an arXiv source directory containing `.tex` files,
+      use it as `source_root`, and use its parent `work/reference/` as the
+      paper reference root when that parent exists.
+   3. If no argument was supplied, prefer lc-from-paper's layout:
+      - `work/reference/source/` when arXiv TeX source exists. Use the TeX
+        files there for labels/captions and the parsed artifacts under
+        `work/reference/{figures,tables,metadata.json}` for renderable
+        reference files.
+      - `work/reference/document.md` plus
+        `work/reference/{figures,tables,metadata.json}` when no TeX source
+        exists. This is the PDF + Docling fallback from lc-from-paper.
+   4. Only after lc-from-paper paths fail, look for a legacy unzipped arXiv
+      dir in cwd: a directory containing both a `*.tex` file and figure
+      files (`*.pdf`, `*.png`, `*.eps`). Common names: `paper_source/`,
+      `arxiv_source/`, `*_Original_Paper/`.
+
+   If no usable reference substrate is found, ask:
+
+   > "Where is the paper reference directory? In an lc-from-paper project this
+   > should usually be `work/reference/`, containing `document.md`,
+   > `metadata.json`, and extracted `figures/` / `tables/`."
+
+   If only `work/reference/paper.pdf` exists, ask the user to run the PARSE
+   phase first so Docling or the TeX parser populates `work/reference/`.
+   Do not compare directly against a whole PDF.
+
+## Phase 1 -- Understand the paper's main results
+
+Read, in this order:
+
+1. **Scoped comparison artifacts, if present.**
+   - If `comparison-report.yaml` exists, treat it as the highest-priority
+     scope because it records what lc-from-paper actually compared. Use its
+     `outputs:` entries, including `type`, `priority`, `paper_value`,
+     `reproduced_value`, `reference_file`, `reproduced_file`, `match`, and
+     `notes` when present.
+   - Else if `targets/targets.md` exists, treat it as the scope ledger. Use
+     only the targets it names, including out-of-scope notes, priorities,
+     reference paths, expected values/trends, and output/spec-home pointers.
+   - If neither file exists, use the default paper-driven flow below and
+     build a best-effort report from `astra.yaml` plus `work/reference/`.
+
+2. **`astra.yaml`** -- specifically `narrative.summary`, `narrative.outputs`,
+   `narrative.findings`, `outputs:`, and `findings:` if present. Use it to
+   map scoped targets to output IDs and to harvest declared findings. Do not
+   assume ASTRA outputs have a dedicated filename-hint field; result paths
+   come from the output ID and the result resolver in Phase 2.
+
+3. **The paper reference substrate**, in this order:
+   - Read `work/reference/metadata.json` when present. It is the primary
+     index for paper figures and tables; its paths are relative to
+     `work/reference/` and usually point into `figures/` or `tables/`.
+   - If `work/reference/source/` exists, grep its TeX files for
+     `\includegraphics`, `\label{fig:...}`, `\caption{...}`, and
+     `\begin{table}` to recover labels/captions that metadata may have
+     missed.
+   - If only `work/reference/document.md` exists, use the markdown plus
+     `metadata.json` as the source of captions, table text, and in-text
+     numerical claims. This is the Docling/Pandoc fallback; preserve its
+     line numbers and do not pretend it is TeX.
+   - Grep the abstract, results, and discussion sections of the TeX or
+     markdown source for in-text numerical claims that look like primary
+     results -- typically a quantity with value + uncertainty (e.g.
+     `$X = a \pm b$ unit`). Prefer values that `astra.yaml`'s `findings:`
+     already names; do not try to extract every number in the paper.
+
+   Do NOT read the paper wholesale. For long papers (>500 lines), read
+   only the abstract, results, and discussion sections.
+
+If the paper is large or has many sections and neither `comparison-report.yaml`
+nor `targets/targets.md` exists, **delegate the figure / table / value
+enumeration to a single subagent** with
+`subagent_type="general-purpose"` -- pass it the paper path, the output
+schema below, and ask it to return only the inventory. One subagent is
+enough; do not fan out. Multiple subagents would have to re-read the
+same file.
+
+## Phase 2 -- Build the comparison manifest
+
+Produce a manifest in memory (you'll write it as JSON in Phase 3) with
+three sections: `figures`, `tables`, `values`. Each entry pairs a
+paper-side artifact with a project-side artifact.
+
+Build entries in this priority order:
+
+1. **From `comparison-report.yaml` if present.** One manifest entry per
+   `outputs.<output_id>` item. Use `type` to route it to `figures`,
+   `tables`, or `values`. Use `reference_file` as the paper-side path and
+   `reproduced_file` as the project-side path when present. Preserve the
+   report's `paper_value`, `reproduced_value`, `match`, and `notes` in the
+   manifest so the HTML reflects the completed COMPARE verdict.
+2. **Else from `targets/targets.md` if present.** One manifest entry per
+   in-scope target. Use each target's reference path under `targets/`, its
+   expected values/trends, and its output/spec-home pointer. If the ledger
+   marks a target out of scope, omit it from the HTML unless the user asked
+   for out-of-scope targets too.
+3. **Else use the default paper-driven inventory.** Enumerate figures,
+   tables, and values from `astra.yaml` plus `work/reference/`, and fall back
+   to filename-stem similarity only when no scoped ledger exists.
+
+For project-side result paths, resolve every output ID with this order:
+- Use an explicit `reproduced_file` from `comparison-report.yaml` or an
+  explicit reproduced path/glob from `targets/targets.md`, if present and
+  the file exists.
+- Search for flat files at `results/<universe>/<output_id>.<ext>` with the
+  first suitable type-specific extension: images (`.png`, `.jpg`, `.jpeg`,
+  `.pdf`, `.eps`), tables (`.csv`, `.parquet`, `.md`, `.txt`), values
+  (`.json`, `.yaml`, `.yml`, `.txt`, `.md`).
+- If still unmatched and no scoped ledger exists, fall back to filename-stem
+  similarity within `results/<universe>/`.
+- If no match is found, use `project_path: null` and render a red
+  `NOT PRODUCED` panel. Do not include unrelated result files; the report is
+  target-driven when target/report files exist, and paper-driven otherwise.
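+
+The resolution order above (minus the filename-stem fallback) can be
+sketched as follows. This is an illustration under assumptions:
+`resolve_project_path`, its parameters, and the `EXTENSIONS` mapping are
+hypothetical names, and only the extension lists come from the text
+above:
+
+```python
+from pathlib import Path
+from typing import Optional
+
+# Extension order taken from the list above; the first existing file wins.
+EXTENSIONS = {
+    "figures": [".png", ".jpg", ".jpeg", ".pdf", ".eps"],
+    "tables": [".csv", ".parquet", ".md", ".txt"],
+    "values": [".json", ".yaml", ".yml", ".txt", ".md"],
+}
+
+
+def resolve_project_path(root: Path, universe: str, output_id: str,
+                         kind: str, explicit: Optional[str] = None) -> Optional[str]:
+    """Return a project-root-relative path, or None (rendered as NOT PRODUCED)."""
+    # 1. An explicit reproduced path wins, but only if the file exists.
+    if explicit and (root / explicit).is_file():
+        return explicit
+    # 2. Flat file at results/<universe>/<output_id>.<ext>, in extension order.
+    for ext in EXTENSIONS[kind]:
+        candidate = Path("results") / universe / f"{output_id}{ext}"
+        if (root / candidate).is_file():
+            return candidate.as_posix()
+    # 3. No match; the filename-stem similarity fallback is omitted here.
+    return None
+```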
+
+For tables: use `work/reference/metadata.json` and `work/reference/tables/`
+when present. If TeX source exists, capture the raw LaTeX of the `tabular`
+block and any `\caption{...}`. If only `work/reference/document.md` exists,
+capture the Docling/Pandoc markdown table or the extracted table artifact
+under `work/reference/tables/`. The project side is whatever artifact
+carries the same content -- typically a CSV / parquet / markdown file at
+`results/<universe>/<output_id>.<ext>`. If `astra.yaml` declares no matching
+output, use `project_path: null`. **If the paper contains no tables at all,
+leave the manifest's `tables` list empty; the helper must omit the entire
+Tables section from the HTML in that case (no header, no "no tables"
+placeholder).**
+
+For values: each entry is `{name, paper_value, paper_uncertainty?,
+project_value?, project_value_source?, paper_quote}`. Pull
+`paper_value` from the in-text claim or `astra.yaml`'s
+`findings.*.paper_value`. Pull `project_value` from
+`astra.yaml`'s `findings.*.replicated_value` if present, otherwise from
+a scoped `comparison-report.yaml` entry or a flat result summary file at
+`results/<universe>/<output_id>.<ext>` that you can read statically.
+**Never compute or re-derive values yourself.** If no project value can
+be located statically, leave it null and flag in the HTML.
+
+When `comparison-report.yaml` or `targets/targets.md` exists, the values list
+is scoped to that file. Otherwise, be exhaustive about values, not selective.
+A common failure mode is the values section ending up with only 1--3 entries,
+which makes the report feel thin. Aim for **every** numerical claim that the
+paper asserts and the project tracks. Concretely, harvest from:
+- Every entry under `findings:` in `astra.yaml` -- one manifest entry
+  per finding, even when several findings share a parent quantity.
+- The paper's abstract: every `<value> ± <unc> <unit>` it reports.
+- The paper's results and discussion sections: every fitted parameter,
+  every feature location ("dip near x = X₁", "peak at x = X₂"), every
+  reported sample size after a specific cut, every bin width or step
+  used as a result-defining choice, every reported accuracy / score /
+  metric.
+- Any explicit reproduction targets in `astra.yaml`'s `narrative.findings`.
+
+It is fine to repeat one quantity in multiple manifest entries when the
+paper reports it under different conditions (preliminary vs. final,
+per-subset, per-bin median, per-method variant). Each condition is its
+own row. Feature locations are values too: encode "feature located at
+domain coordinate X" as
+`{name: "<short feature name>", paper_value: "<X>", paper_unit:
+"<unit>"}`. **Target ≥6 value entries on a typical paper.** If you end
+up with fewer than 4, you are filtering too aggressively -- re-read
+`astra.yaml`'s `findings:` and the paper's results section.
+
+## Phase 3 -- Generate the HTML
+
+Use a small Python helper rather than embedding base64 inline through
+your tool calls -- multi-MB image base64 strings would balloon your
+context.
+
+Use the existing `.lightcone/` directory in the project root. Do not create
+directories in this skill. All three files this skill writes -- manifest,
+helper, and final HTML -- live there.
+
+1. **Write the manifest** as JSON to
+   `.lightcone/comparison_manifest.json`. Schema:
+
+   ```json
+   {
+     "project_name": "...",
+     "paper_path": "work/reference/document.md",
+     "scope_source": "comparison-report.yaml",
+     "universe": "baseline",
+     "results_path": "results/baseline",
+     "figures": [
+       {
+         "paper_label": "fig:main_result",
+         "paper_caption": "...",
+         "paper_path": "targets/main_result.pdf",
+         "project_output_id": "primary_metric_plot",
+         "project_path": "results/baseline/primary_metric_plot.png"
+       }
+     ],
+     "tables": [
+       {
+         "paper_label": "tab:summary",
+         "paper_caption": "...",
+         "paper_latex": "\\begin{tabular}{...}\\end{tabular}",
+         "project_output_id": "...",
+         "project_path": "results/baseline/summary_table.csv"
+       }
+     ],
+     "values": [
+       {
+         "name": "primary_metric",
+         "paper_value": "12.5",
+         "paper_uncertainty": "0.4",
+         "paper_unit": "<unit>",
+         "paper_quote": "we find $\\mathrm{metric} = 12.5 \\pm 0.4$ <unit>",
+         "project_value": "12.47",
+         "project_uncertainty": "0.41",
+         "project_value_source": "results/baseline/metric.json"
+       }
+     ]
+   }
+   ```
+
+   `figures`, `tables`, and `values` may each be `[]`. Empty lists mean
+   the helper skips that section entirely. There is no
+   `unmatched_baseline` field -- baseline files the paper does not
+   reference are not in scope for this report.
+
+   Use `null` for any missing field. Paths are relative to the project
+   root.
+
+2. **Write the helper script** to `.lightcone/build_comparison.py`.
+   The helper must:
+   - Read the manifest JSON.
+   - For each figure entry: emit one `<section class="row">` per figure,
+     with the structure described in **"Required HTML structure"**
+     below -- a single `<div class="row-head">` containing a
+     `<div class="row-title">` and one row-level status badge, followed
+     by a `<div class="row-grid">` of two `<figure class="cell">`s
+     (paper, project). One badge per row, in flow inside `.row-head`.
+     **Never emit per-cell absolutely-positioned badges.**
+     Read `paper_path` and `project_path` as bytes, base64-encode, and
+     embed each image inside its cell. **PDFs must be converted to PNG
+     before base64-encoding -- never embed PDFs as PDF data URIs.** Use
+     `<img src="data:image/png;base64,...">` uniformly for every
+     figure cell. Conversion order to try, falling back if a tool is
+     unavailable:
+       1. `pdf2image` (Python) -- `convert_from_path(path, dpi=150)[0]`
+       2. `pypdfium2` -- render page 1 at 150 DPI to a PIL image
+       3. shell out to `pdftoppm -png -r 150 -f 1 -l 1 <pdf> <stem>`
+          and read the resulting PNG
+       4. shell out to `magick -density 150 <pdf>[0] <png>` (ImageMagick;
+          `-density` must precede the input PDF to set the render resolution)
+     If none are available, the helper renders a small ⚠️ panel that
+     says `PDF preview unavailable -- install pdf2image or pdftoppm`
+     and links to the `.pdf` file path. Do not fall back to embedding
+     the PDF binary. PNG / JPG inputs skip conversion and are
+     base64-encoded directly. For any non-image type, embed as a
+     UTF-8 text block. Missing path → render a red panel saying
+     `❌ NOT PRODUCED` with the expected output ID. Captions live as
+     `<figcaption>` inside each cell, never as a row-spanning element.
+   - For each table entry: paper side renders the captured LaTeX inside
+     `<pre>` plus the caption; project side renders the project file
+     (CSV/parquet → first ~20 rows as an HTML table; markdown → render
+     as `<pre>`; missing → red ❌ panel). Same row structure as figures.
+   - For each value entry: emit one `<section class="row value-row">`
+     per value -- **same card layout as figures, not a `<table>`.**
+     The row has a `.row-head` (value name + single status badge),
+     a `.row-grid` of two `.cell`s (paper | project), and a trailing
+     `.value-note` with the σ delta. The paper cell shows the value
+     (with uncertainty and unit) and the `paper_quote` as a
+     `<blockquote>`. The project cell shows the value and the
+     `project_value_source` as a small `<code>` line. Compute a simple
+     status -- ✅ if both values exist and the project value lies within
+     ±1 paper-uncertainty of the paper value; ⚠️ if both exist but
+     disagree by more than that; ❌ if either is missing. If
+     `paper_uncertainty` is null, fall back to a 5%-tolerance
+     comparison: ✅ if `|prj − paper| ≤ max(0.05·|paper|, 0.05)`, ⚠️
+     otherwise. Do NOT do anything more sophisticated; the skill must not
+     run project code to re-derive values. **Do not
+     render values as a single HTML `<table>`** -- the report's whole
+     point is side-by-side cards.
+   - Emit a single self-contained HTML file with inline CSS in the
+     **Vellum** aesthetic (see below): the `<body>` carries the
+     parchment background and grain, and **all content lives inside a
+     single `<div class="page">` that is the lighter `--surface` cream
+     card with soft drop shadows.** This is non-negotiable -- the cream
+     page card on top of the parchment body is the headline visual. Two
+     content columns (paper | project) per row, the project name in the
+     `<h1>`, and a top-of-page summary line counting found / missing
+     for each non-empty section. **Skip any section whose manifest list
+     is empty** -- omit its header and content entirely; do not emit a
+     "no tables found" placeholder.
+   - Write the HTML to `.lightcone/comparison.html` and print the
+     absolute path on stdout.
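+
+The value-status rule described above is small enough to sketch
+directly. The function name `value_status` and the `"ok"` / `"warn"` /
+`"miss"` return labels are assumptions chosen to mirror the badge
+classes; manifest values are taken to be numeric strings or `null`:
+
+```python
+from typing import Optional
+
+
+def value_status(paper_value: Optional[str], project_value: Optional[str],
+                 paper_uncertainty: Optional[str] = None) -> str:
+    """Classify one value row as 'ok', 'warn', or 'miss'."""
+    # Either side missing -> red badge.
+    if paper_value is None or project_value is None:
+        return "miss"
+    paper = float(paper_value)
+    project = float(project_value)
+    if paper_uncertainty is not None:
+        # Within +/- 1 paper-uncertainty of the paper value.
+        tolerance = abs(float(paper_uncertainty))
+    else:
+        # 5% fallback, with an absolute floor of 0.05.
+        tolerance = max(0.05 * abs(paper), 0.05)
+    return "ok" if abs(project - paper) <= tolerance else "warn"
+```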
+
+### Required HTML structure (figures and values)
+
+The helper MUST produce this exact shape for every figure / value row.
+Per-cell absolute badges, value-as-table, and missing `.row-head` are
+all forbidden -- they break the layout (overlapping the cell heading,
+losing the row-level status, breaking the visual rhythm with figures).
+
+```html
+<section class="row"><!-- or "row value-row" for values -->
+  <div class="row-head">
+    <div class="row-title">
+      <code>fig:main_result</code> &mdash; <span class="row-id">primary_metric_plot</span>
+    </div>
+    <span class="badge badge-ok">✅ matched</span>
+  </div>
+  <div class="row-grid">
+    <figure class="cell">
+      <div class="cell-label">PAPER</div>
+      <img src="data:image/png;base64,...">
+      <figcaption>Caption from paper.</figcaption>
+    </figure>
+    <figure class="cell">
+      <div class="cell-label">PROJECT &middot; <code>results/baseline/...</code></div>
+      <img src="data:image/png;base64,...">
+      <figcaption>output_id</figcaption>
+    </figure>
+  </div>
+  <!-- value rows only: -->
+  <div class="value-note">Δ = 0.03 &lt;unit&gt; (0.07σ)</div>
+</section>
+```
+
+Status states for the row badge: `badge-ok` (matched), `badge-warn`
+(partial / off-target / no σ), `badge-miss` (missing on either side).
+Exactly one badge per row.
+
+3. **Run the helper:** `python3 .lightcone/build_comparison.py`
+   from the project root. If `python3` is missing, try `python`. If
+   the helper uses anything beyond the standard library (e.g.
+   `pyarrow` to read parquet, or `pandas` to render tables), guard the
+   import so that a missing package degrades to "preview not available
+   -- file exists at `<path>`" rather than failing. The helper must work with stdlib
+   alone for the figure path; the parquet / pandas previews are
+   nice-to-haves.
+
+4. After the helper runs, **read back** the HTML's first ~50 lines and
+   check the file size to verify it was produced and isn't trivially
+   small (>10 KB sanity check). Then report to the user the path and a
+   one-line summary:
+
+   > Comparison HTML at `.lightcone/comparison.html` -- N figures
+   > (K matched, J missing), N tables (...), N values (...).
+
+## Vellum aesthetic
+
+The helper must style the page in the **Vellum** aesthetic: a
+weathered-parchment look that reads like a printed scientific paper,
+not a web app. The helper bakes all of this into inline `<style>` --
+no external assets, no CDN fetches, no JS.
+
+**Palette (CSS custom properties on `:root`):**
+
+```css
+--paper:        #F2EDE5;  /* aged-paper page background */
+--surface:      #FAFAF7;  /* lighter "protected" prose surface */
+--ink:          #2E2A26;  /* warm near-black body text */
+--ink-muted:    #6B635A;  /* brown-gray secondary text */
+--gold:         #9A7B35;  /* antique gold -- links, accents, the author's hand */
+--teal:         #4F7A6F;  /* faded ink: healthy / resolved (✅) */
+--amber:        #B0823A;  /* faded ink: attention / partial (⚠️) */
+--mauve:        #8A5C6B;  /* faded ink: error / missing (❌) */
+--rule:         #D9CFC0;  /* hairlines and table borders */
+--shadow:       rgba(46, 42, 38, 0.10);  /* soft ink-toned drop shadow */
+```
+
+Saturated colors are forbidden. Use only this palette plus tints/shades
+of these tokens. Status icons (✅ ⚠️ ❌) are kept but their containers
+adopt the corresponding faded ink (`--teal`, `--amber`, `--mauve`) for
+borders and small badges -- never as full background fills.
+
+**Typography:**
+
+- Body prose: `EB Garamond`, fall back through `Garamond, "Times New
+  Roman", Georgia, serif`. No system-ui, no sans-serif anywhere.
+- Annotations, code, captions, file paths, numerical values:
+  `JetBrains Mono`, fall back through `"IBM Plex Mono", "SFMono-Regular",
+  Menlo, Consolas, monospace`.
+- Body line-height ~1.55, comfortable measure (~70ch on prose blocks).
+- Headings serif, semibold not bold; `<h1>` slightly tracked-out (small
+  positive `letter-spacing`) for a hand-set feel. Section headings
+  may use a small caps treatment (`font-variant: small-caps`).
+- Do not load webfonts. The HTML must stay self-contained and offline-safe;
+  rely on the fallback chains above.
+
+**Texture and the page card:**
+
+- The `<body>` background is `--paper` plus a barely-there fractal-noise
+  grain. Generate the grain with an inline SVG `<feTurbulence>` filter
+  baked into a `data:image/svg+xml;base64,...` URL used as
+  `background-image`. Keep opacity low (~0.04--0.06) so the grain reads
+  as paper fiber, not as visible noise.
+- **Body padding around the page.** The `<body>` itself has padding
+  (e.g. `padding: 4rem 2rem;`) so the parchment + grain breathes around
+  the page card -- never edge-to-edge.
+- **The page card is mandatory.** All content lives inside a single
+  `<div class="page">` styled as:
+
+  ```css
+  .page {
+    max-width: 64rem;
+    margin: 0 auto;
+    background: var(--surface);
+    box-shadow: 0 1px 2px var(--shadow), 0 8px 24px var(--shadow);
+    padding: 4rem 4rem 5rem;
+  }
+  ```
+
+  The cream `--surface` card on top of the parchment `--paper` body is
+  the single most important visual signature of the report. If you find
+  yourself with `.page { background: transparent }` or no
+  `box-shadow`, you have failed.
+- Cells inside the page card sit on the same `--surface` with their own
+  softer shadow (`0 1px 2px var(--shadow)`), creating two stacked
+  layers of depth: parchment → page card → cell card.
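+
+One way the helper might build the grain data URL is sketched below. The
+function name, tile size, `baseFrequency`, and `numOctaves` values are
+assumptions to tune by eye; only the `<feTurbulence>` filter, the
+base64 `data:` URL, and the low opacity come from the description above:
+
+```python
+import base64
+
+
+def grain_data_url(base_frequency: float = 0.8, opacity: float = 0.05) -> str:
+    """Return a data: URL for a small, tileable fractal-noise SVG."""
+    svg = (
+        '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">'
+        '<filter id="grain">'
+        f'<feTurbulence type="fractalNoise" baseFrequency="{base_frequency}" '
+        'numOctaves="2" stitchTiles="stitch"/>'
+        # Desaturate the noise so it stays inside the muted palette.
+        '<feColorMatrix type="saturate" values="0"/>'
+        '</filter>'
+        f'<rect width="100%" height="100%" filter="url(#grain)" opacity="{opacity}"/>'
+        '</svg>'
+    )
+    encoded = base64.b64encode(svg.encode("utf-8")).decode("ascii")
+    return f"data:image/svg+xml;base64,{encoded}"
+```
+
+The returned string would go into the `<body>` rule as
+`background-image: url("...")` on top of `background-color: var(--paper)`.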
+
+**Surfaces and overlays:**
+
+- Comparison rows are two-column on desktop (paper | project), single
+  column on narrow viewports. Each cell is `--surface` with the soft
+  ink shadow.
+- Hover/active states are expressed as **candlelight-lift** (a warm
+  cream highlight, e.g. `background: #FFF8E8;`) or **ink-sink** (a warm
+  black inset, e.g. `background: #2E2A26; color: var(--paper);`) --
+  never flat blue/gray fills.
+- Hairlines between sections use `--rule`, never solid black.
+
+**Chrome and links:**
+
+- Links: `--gold`, no underline by default; underline appears as a
+  1px `--gold` border-bottom on hover. The underline is the "author's
+  hand" -- thin, deliberate.
+- Buttons / interactive chrome: minimal. This is a report, not an app.
+  Avoid icons beyond ✅ ⚠️ ❌ and small unicode dingbats.
+
+**Whitespace and rhythm:**
+
+- Generous outer margins; the page should feel narrow and read like a
+  paginated paper. Max content width around 64rem.
+- Section transitions get vertical room -- ~3rem between major
+  sections, ~1.5rem between rows.
+- Captions sit below figures in `--ink-muted` mono, italic only if the
+  locally resolved font provides an italic face (webfonts are never loaded).
+
+**Status badges (figure / table / value rows):**
+
+- **One badge per row, in the `.row-head` flex container alongside the
+  row title.** Never per-cell, never absolutely positioned. The badge
+  uses `display: inline-flex` (or default inline) and lives in flow.
+- Render the status (✅ matched / ⚠️ partial / ❌ missing) as a small
+  monospace badge in the row head, using the corresponding semantic color
+  as a 1px border + the same color tinted at 12% as the background. The
+  icon plus a 1--3 word label ("matched", "missing", "off by 2.1σ"). Never
+  a saturated banner.
+- Status reflects the **row as a whole**, not each cell individually:
+  ✅ when both paper and project artifacts are present (and, for
+  values, the project number is within tolerance); ⚠️ when both are
+  present but the value disagrees beyond tolerance, or a paper figure
+  has no project counterpart that you'd still like to flag as partial;
+  ❌ when either side is missing.
+
+**The overall feel:** scholarly, low-contrast, hand-made, generous
+whitespace, chrome recedes, the page itself carries the eye. If a
+choice feels modern (sharp shadows, saturated badges, system-ui type,
+solid-fill buttons), it is wrong.
+
+## Restrictions
+
+- You MUST NOT run the pipeline, recipes, `lc run`, or any code that
+  computes new results. The results directory is read-only input here.
+- You MUST NOT modify project source code, `astra.yaml`, or anything in
+  `scripts/` or `results/`. The only files this skill writes are
+  `.lightcone/comparison_manifest.json`,
+  `.lightcone/build_comparison.py`, and
+  `.lightcone/comparison.html`. Assume `.lightcone/` already exists; never
+  write into `results/`.
+- You MUST NOT fabricate values. If a paper number is not stated in the
+  paper source, `targets/targets.md`, `comparison-report.yaml`, or
+  `astra.yaml`, leave it null. If a project number is not recorded in a
+  result file or comparison report, leave it null. Flag, don't fill.
+- You MUST embed every image as base64 -- the HTML must be portable to
+  another machine without breaking image references.
+- You MUST NOT write the HTML by hand with inlined base64 strings; use
+  the helper script. (Multi-MB base64 in tool-call arguments is what
+  this rule prevents.)
+
+## Anti-patterns
+
+- **Running the pipeline to fill in a missing value** -- the whole point
+  is to surface what is missing; never paper over a gap.
+- **Embedding PDFs as PDFs** -- PDFs must be rasterized to PNG before
+  base64-encoding. Browsers can technically render PDF data URIs, but
+  they break consistent layout, scale poorly, and force a viewer
+  chrome we cannot style. Convert to PNG via `pdf2image` /
+  `pypdfium2` / `pdftoppm` / ImageMagick (in that fallback order); if
+  none are available, render a ⚠️ placeholder rather than embedding
+  the PDF.
+- **Statistical comparison beyond ±1σ** -- this skill is a static visual
+  comparison plus a coarse value check. Do not compute KS tests, Δχ²,
+  or anything else. The user can eyeball the figures.
+- **Reading the paper wholesale** -- limit reads to abstract, results,
+  discussion; or delegate the inventory pass to one subagent.
+- **Bundling matching into the helper script** -- the helper's job is
+  rendering, not deciding which paper figure pairs with which baseline
+  file. Do all matching in Phase 2 (manifest construction) so a human
+  can audit the pairings by reading the JSON.
+- **Silent overwrites** -- if `.lightcone/comparison.html` already
+  exists, mention it in the summary line ("overwrote previous report").
+- **Modern web-app styling** -- saturated brand colors, system-ui type,
+  flat-fill buttons, sharp drop shadows, dark-mode toggles, animated
+  transitions. The Vellum aesthetic is non-negotiable; if you find
+  yourself reaching for `#0d6efd` or `font-family: system-ui`, stop.
+- **Missing page card.** The single `<div class="page">` with
+  `background: var(--surface)` + soft drop shadow is the headline
+  visual. A page that lets the parchment grain reach edge-to-edge with
+  no cream card on top is broken. Always check the rendered HTML has
+  `.page { background: var(--surface); box-shadow: ... }`.
+- **Per-cell absolutely-positioned badges.** Status badges live inside
+  one `.row-head` per row, in flow next to the row title -- never
+  `position: absolute; top: 0.7rem; right: 0.8rem;` inside each cell.
+  The absolute positioning overlaps the cell heading and emits a
+  "rendered" badge per existing file regardless of the row's overall
+  comparison state, which destroys the at-a-glance status signal.
+- **Values rendered as a `<table>`** -- Values must use the same card
+  layout as figures (`.row` → `.row-head` + `.row-grid` of two
+  `.cell`s). Collapsing the values section to an HTML table looks like
+  a spreadsheet and breaks visual rhythm with the figures section.
+- **Thin values list** -- Aim for ≥6 value entries on a typical paper.
+  If the manifest ends up with 1--3 values, the report feels empty;
+  re-harvest from `astra.yaml`'s `findings:` and the paper's results
+  section before generating.
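
The fallback order in the "Embedding PDFs as PDFs" item above can be sketched as a converter chain. This is a minimal sketch and not the skill's actual helper: the names `pdf_to_data_uri` and `DEFAULT_CHAIN`, the 150 dpi setting, first-page-only rendering, the pypdfium2 4.x `render(...).to_pil()` call, and the use of ImageMagick 7's `magick` binary are all assumptions made for illustration.

```python
import base64
import io
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, List, Optional, Sequence, Tuple

Converter = Callable[[Path], bytes]  # PDF path -> PNG bytes for page 1


def _png_bytes(image) -> bytes:
    """Serialize a PIL image to PNG bytes."""
    buf = io.BytesIO()
    image.save(buf, format="PNG")
    return buf.getvalue()


def _via_pdf2image(pdf: Path) -> bytes:
    # Raises if pdf2image is not installed or poppler is absent.
    from pdf2image import convert_from_path
    pages = convert_from_path(str(pdf), dpi=150, first_page=1, last_page=1)
    return _png_bytes(pages[0])


def _via_pypdfium2(pdf: Path) -> bytes:
    # Assumes the pypdfium2 4.x API.
    import pypdfium2 as pdfium
    page = pdfium.PdfDocument(str(pdf))[0]
    return _png_bytes(page.render(scale=2).to_pil())


def _via_cli(build_argv: Callable[[Path, Path], List[str]]) -> Converter:
    """Wrap an external command that writes a PNG to a temp path."""
    def convert(pdf: Path) -> bytes:
        with tempfile.TemporaryDirectory() as tmp:
            out = Path(tmp) / "page.png"
            # A missing binary raises FileNotFoundError; a failed run
            # raises CalledProcessError. Both fall through to the next tool.
            subprocess.run(build_argv(pdf, out), check=True, capture_output=True)
            return out.read_bytes()
    return convert


# Order mirrors the fallback order named in the list item.
DEFAULT_CHAIN: Sequence[Tuple[str, Converter]] = (
    ("pdf2image", _via_pdf2image),
    ("pypdfium2", _via_pypdfium2),
    ("pdftoppm", _via_cli(lambda pdf, out: [
        "pdftoppm", "-png", "-r", "150", "-f", "1", "-l", "1", "-singlefile",
        str(pdf), str(out.with_suffix(""))])),  # pdftoppm appends ".png"
    ("imagemagick", _via_cli(lambda pdf, out: [
        "magick", "-density", "150", f"{pdf}[0]", str(out)])),
)


def pdf_to_data_uri(
    pdf, chain: Sequence[Tuple[str, Converter]] = DEFAULT_CHAIN
) -> Optional[str]:
    """Return a PNG data URI, or None when every converter fails."""
    for _name, convert in chain:
        try:
            png = convert(Path(pdf))
        except Exception:
            continue  # tool unavailable or conversion failed; try the next
        if png:
            return "data:image/png;base64," + base64.b64encode(png).decode("ascii")
    return None  # caller renders the warning placeholder instead
```

Returning `None` rather than raising keeps the "render a placeholder, never embed the PDF" rule in the caller, where the HTML is assembled.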
diff --git a/claude/lightcone/guides/lightcone-cli-reference.md b/claude/lightcone/skills/lc-cli/SKILL.md
similarity index 90%
rename from claude/lightcone/guides/lightcone-cli-reference.md
rename to claude/lightcone/skills/lc-cli/SKILL.md
index 7868ceff..66e9b643 100644
--- a/claude/lightcone/guides/lightcone-cli-reference.md
+++ b/claude/lightcone/skills/lc-cli/SKILL.md
@@ -1,6 +1,20 @@
+---
+name: lc-cli
+description: >
+  Reference for `lc` CLI execution: commands (init/run/status/verify/build/export),
+  the Spec-Code Invariant (`astra.yaml` and code never diverge), status
+  interpretation (ok/stale/missing/alias), failure diagnosis, multiverse
+  runs, scratch overrides for HPC, sub-analysis scaffolding, publishing
+  via WRROC. Invoke whenever running, debugging, or diagnosing `lc`
+  workflows; whenever interpreting `lc status` / `lc verify` output; or
+  whenever the user asks about the development workflow surrounding
+  `astra.yaml`.
+allowed-tools: Read, Glob, Grep, Bash(lc:*), Bash(astra:*)
+---
+
 # lightcone-cli Reference
 
-Reference for lightcone-cli execution: CLI commands, development workflow, status interpretation, and failure diagnosis. For `astra.yaml` spec syntax, see `astra-reference.md`.
+Reference for lightcone-cli execution: CLI commands, development workflow, status interpretation, and failure diagnosis. For `astra.yaml` spec syntax, invoke `/astra`.
 
 ## CLI Reference
 
@@ -11,7 +25,6 @@ lc build [--force] [--runtime docker]                             # Build contai
 lc status [--universe NAME] [--json]                              # Materialization status (text or JSON)
 lc verify [--universe NAME]                                       # Recompute hashes and walk the provenance chain
 lc export wrroc [--output PATH] [--universe NAME] [--zip] [--metadata-only] [--author "NAME <EMAIL>"]  # Export Workflow Run RO-Crate bundle
-lc eval {run,report,compare}                                      # Run/inspect eval suites (requires the 'eval' extra)
 ```
 
 `lc run` is quiet by default — pass `--verbose` to see worker output. `--scratch` is only relevant on HPC sites where `$HOME` doesn't honor `flock` (NERSC etc.); it redirects Snakemake state and Dask spill onto the named filesystem.
@@ -33,7 +46,7 @@ Sub-analyses are scaffolded by hand, since each one is just another `astra.yaml`
 2. Add a `path:` entry to the parent `astra.yaml` under `analyses:` (e.g. `analyses: { my_sub: { path: ./analyses/my_sub } }`).
 3. Add a `<name>: { universe: baseline }` entry to each existing parent universe file.
 
-Populate the sub-analysis's `astra.yaml` with inputs, outputs, and decisions. Use `from:` references to wire inputs and decisions to the parent or siblings — see `astra-reference.md` under "Composition Mechanics."
+Populate the sub-analysis's `astra.yaml` with inputs, outputs, and decisions. Use `from:` references to wire inputs and decisions to the parent or siblings — invoke `/astra` and see "Composition Mechanics" for the grammar.
 
 ## Development Workflow
 
diff --git a/claude/lightcone/skills/lc-feedback/SKILL.md b/claude/lightcone/skills/lc-feedback/SKILL.md
index 7cf6f09c..23cf7b10 100644
--- a/claude/lightcone/skills/lc-feedback/SKILL.md
+++ b/claude/lightcone/skills/lc-feedback/SKILL.md
@@ -3,7 +3,6 @@ name: lc-feedback
 description: >
   File a bug report from the current session. Use when something breaks:
   /lc-feedback <description of what went wrong>
-allowed-tools: Bash(gh:*), Bash(python:*), Bash(uname:*), AskUserQuestion
 argument-hint: "<what went wrong>"
 ---
 
diff --git a/claude/lightcone/skills/lc-migrate/SKILL.md b/claude/lightcone/skills/lc-from-code/SKILL.md
similarity index 53%
rename from claude/lightcone/skills/lc-migrate/SKILL.md
rename to claude/lightcone/skills/lc-from-code/SKILL.md
index d5f42389..b4da467a 100644
--- a/claude/lightcone/skills/lc-migrate/SKILL.md
+++ b/claude/lightcone/skills/lc-from-code/SKILL.md
@@ -1,20 +1,28 @@
 ---
-name: lc-migrate
-description: Migrate an existing project into ASTRA / lightcone-cli. Scans code, generates astra.yaml, parameterizes decisions, and runs until outputs materialize. Triggers on "migrate", "convert", "existing project".
-allowed-tools: Read, Write, Edit, Glob, Grep, Bash(astra:*), Bash(lc:*), Bash(python:*), Bash(pip:*), Bash(git:*), Bash(mkdir:*), Bash(ls:*), Agent, AskUserQuestion
+name: lc-from-code
+description: Bring an existing project into ASTRA / lightcone-cli, starting from the code. Scans the codebase, drafts or augments astra.yaml, parameterizes decisions, and runs until outputs materialize. Triggers on "migrate", "convert", "existing project", "wrap this code", "start from code".
 ---
 
-# /lc-migrate
+# /lc-from-code
 
-End-to-end migration: scan existing code, generate the ASTRA spec, parameterize decisions in the code, and run until everything materializes. The user's existing logic stays intact — changes should be minimal.
+End-to-end migration: scan existing code, draft or add to `astra.yaml`, parameterize decisions in the code, and run until everything materializes. This works both as a fresh start from code and as an augmenting pass inside an existing ASTRA project. The user's existing logic stays intact — changes should be minimal.
 
-## References
+## Invocation contexts
 
-- [ASTRA Reference](../../guides/astra-reference.md) -- spec structure, decision identification, recipes, universes
+This skill has two invocation contexts. The first is the user-driven default described in the phases below: do the full scan → spec → parameterize → run flow.
+
+The second is **scan-only**, used when `/lc-from-paper`'s ORIENT Stage 4 invokes this skill against a cloned reference repo at `work/reference/code/`. The invocation prompt will tell you explicitly to *do only Phase 1's scan*, write the inventory to a path it specifies (typically `work/reference/code-index.md`), and **stop** — do not touch `astra.yaml` at the project root, do not parameterize any code, do not run anything, do not modify the cloned repo. Reach for an Explore sub-agent (or parallel Explore spawns when the repo is large enough that one survey misses the breadth) — that's the cost-effective tool for inventorying a real codebase, and there's no longer any nested-context concern that would forbid it. Trust the invocation prompt's instructions over the fresh-migration defaults below; if the prompt says scan-only, the scan-only contract holds (stop after writing the inventory file).
 
 ## Phase 1: Scan & Spec
 
-First, read the Decisions section of [ASTRA Reference](../../guides/astra-reference.md), then spawn an Explore subagent to scan the project. Include the decision criteria in the prompt so the subagent can classify candidates:
+First, invoke `/astra` and read its Decisions section, then decide which mode applies:
+
+- **Fresh migration:** no meaningful `astra.yaml` exists yet. Use the code scan to draft `astra.yaml` and `universes/baseline.yaml`.
+- **Augment existing ASTRA:** `astra.yaml` already exists from a paper, user interview, or prior ASTRA work. Use the code scan to add to the current spec — recipes, dependencies, containers, code-backed decision options, baseline selections, implementation notes, and missing inputs / outputs where they naturally belong. Do not create a second `astra.yaml`, do not replace the existing structure wholesale, and surface major structure conflicts to the user before reshaping the spec.
+
+### Scanning the project
+
+In both modes, spawn an Explore sub-agent to scan the project. Include the decision criteria in the prompt so the sub-agent can classify candidates:
 
 ```
 Agent(subagent_type="Explore", prompt="""
@@ -40,16 +48,20 @@ Return the results as a markdown table:
 
 And a separate list of ALL candidate decisions with file:line references.
 Err on the side of completeness — include anything that could plausibly
-be an analytical choice. The orchestrator will filter down later.
+be an analytical choice. The caller will filter down later.
 
 For reference, here are the decision criteria for classifying candidates:
 <decision-criteria>
-{paste Decisions section from astra-reference.md here}
+{paste Decisions section from `/astra` here}
 </decision-criteria>
 """)
 ```
 
-Write the scan results to `CLAUDE.md` under `## Project Notes` as a script inventory, then draft `astra.yaml` from the scan results following the spec structure documented in `.claude/guides/astra-reference.md`. Use the decision criteria from [ASTRA Reference](../../guides/astra-reference.md) to filter the subagent's candidate decisions down to only true analytical choices — most hardcoded values are implementation details, not decisions. Use current hardcoded values as defaults.
+When the codebase is large enough that one Explore pass risks missing depth (a multi-project monorepo, a workflow folder plus a notebooks tree plus a `src/` package), spawn Explores in parallel against the named subtrees — one Explore per coherent region. Aggregate their inventories into the final scan output.
+
+Write the scan results as a script inventory to `CLAUDE.md` under `## Project Notes` (fresh migration) or to the path the invocation prompt specifies (scan-only, typically `work/reference/code-index.md`). In scan-only mode, stop once the inventory file lands; do not touch `astra.yaml`. Otherwise, draft or add to `astra.yaml` from the scan results, following the spec structure documented in `/astra`. Use the decision criteria from `/astra` (Decisions section) to filter candidate decisions down to true analytical choices; most hardcoded values are implementation details, not decisions. Use current hardcoded values as defaults.
+
+In augment mode, preserve the existing paper-derived or user-derived `inputs`, `outputs`, `decisions`, `findings`, and `narrative` unless the code scan shows a real conflict. Attach code evidence to the nearest existing home first. Create new ASTRA structure only when the code reveals a real analysis object that has no suitable home in the current spec.
 
 For each output, list the upstream artifacts it depends on under `Output.inputs: [...]` and the decisions it consumes under `Output.decisions: [...]`. Then add a `recipe.command` template that references each via `{inputs.<id>}` / `{decisions.<id>}` and writes to `{output}`. Example:
 
@@ -68,7 +80,7 @@ outputs:
         --output {output}
 ```
 
-Also generate `universes/baseline.yaml` with all defaults matching the current hardcoded values (so the first run reproduces existing behavior).
+Also generate or update `universes/baseline.yaml` with all defaults matching the current hardcoded values (so the first run reproduces existing behavior).
 
 Write to `astra.yaml` and `universes/baseline.yaml`, then validate: `astra validate astra.yaml`. Fix any errors.
 
@@ -76,7 +88,7 @@ Use `AskUserQuestion` to ask the user to review the spec — they can open `astr
 
 ## Phase 2: Implement
 
-Parameterize the code so decisions can be varied across universes. The goal is minimal changes to user code. Use your best judgement for the approach — the options below are not exhaustive:
+Parameterize the code from ASTRA decisions so the baseline run reproduces the existing behavior. The goal is minimal changes to user code. Use your best judgement for the approach — the options below are not exhaustive:
 
 **For scripts with hardcoded values:** Add argparse (or extend existing argument parsing) and replace hardcoded values with the parsed args. This is the simplest case.
 
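
The argparse approach for scripts with hardcoded values can be sketched as follows. This is a hedged illustration only: the script, the `--snr-cut` and `--binning` flags, and the `select` helper are hypothetical names, not taken from any real project. The point it shows is that each default equals the value that used to be hardcoded, so running without the new flags reproduces the existing behavior.

```python
import argparse
from typing import List, Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse CLI arguments; defaults mirror the formerly hardcoded values."""
    parser = argparse.ArgumentParser(description="Hypothetical analysis step.")
    parser.add_argument("--input", required=True, help="upstream artifact path")
    parser.add_argument("--output", required=True, help="where to write results")
    # Was `SNR_CUT = 5.0` at module level (hypothetical); same value as default.
    parser.add_argument("--snr-cut", type=float, default=5.0)
    # Was a hardcoded "log" string (hypothetical); options become decision options.
    parser.add_argument("--binning", choices=["linear", "log"], default="log")
    return parser.parse_args(argv)


def select(values: Sequence[float], snr_cut: float) -> List[float]:
    """Existing logic stays intact; only the threshold's source changed."""
    return [v for v in values if v >= snr_cut]
```

Under these assumptions, a recipe command could pass the decision through as `--snr-cut {decisions.snr_cut}`, where `snr_cut` is a hypothetical decision id, and the baseline universe would select `5.0`.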
diff --git a/claude/lightcone/skills/lc-from-paper/SKILL.md b/claude/lightcone/skills/lc-from-paper/SKILL.md
new file mode 100644
index 00000000..78e79db8
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/SKILL.md
@@ -0,0 +1,164 @@
+---
+name: lc-from-paper
+description: >
+  This skill should be used when the user wants to reproduce a published
+  scientific paper in ASTRA — has a DOI, arXiv ID, or PDF — or asks to
+  "reproduce <paper>", "set up reproduction", or "import a paper". Also
+  use when continuing or resuming an existing reproduction workdir. The
+  skill instructs Claude to run ORIENT in the user's main session
+  (paper-extraction + interview + code scan, all grounded), then hand
+  the reproduction off to a ralph loop whose iterations carry the
+  remaining phases (ARCHITECT → SPECIFY → LITERATURE → IMPLEMENT → RUN
+  → COMPARE) until the constitution closes, at which point REVIEW
+  close-out runs back in the user's main session.
+---
+
+# lc-from-paper
+
+You are helping the user reproduce a published scientific paper as a complete ASTRA project. This is a long, complex task that won't fit in a single context window — it spans discrete phases: orient (figure out what the user wants, acquire paper + code), architect the spec, specify decisions and findings, resolve cited literature, implement, run, compare, review.
+
+The architecture is two-piece:
+
+1. **Interactive bookends in the user's main session.** ORIENT and REVIEW are conversations with the user. ORIENT runs in stages: ask for the paper, run `/paper-extraction` inline, interview the user (grounded in the paper), clone the code and run `/lc-from-code` scan-only (if a repo exists), possibly ask follow-up questions, then draft `constitution.md` + `CLAUDE.md` from the full paper-plus-code context for user review.
+
+2. **A ralph loop for the long middle.** Once ORIENT lands — `constitution.md` + `CLAUDE.md` drafted, paper and code substrate on disk — you launch a ralph loop against the constitution. Each iteration starts a fresh session with the constitution loaded into its system prompt, surveys the workdir, picks the next valuable move (typically one phase's worth of work), does it, commits, and exits. Iteration N+1 reads N's work cold, so per-phase review collapses into "the next iteration is the review."
+
+The whole thing is driven by **the per-paper `constitution.md`** at the reproduction workdir root, plus the auto-loading `CLAUDE.md` walk-up. The split is intentional: the constitution is *task-bound* (what this reproduction is trying to achieve — Goal, fidelity intent, scope, quality bar, Open dimensions) and can be archived once the reproduction lands. CLAUDE.md is *durable* (rules, paper-vs-code disagreements, Open opportunities, pointers to substrate) — it stays useful when the user comes back to do follow-on work in this directory. Every iteration picks up both on launch.
+
+## Setup: git-tracked workdir
+
+The reproduction's directory should be a git repo — if not already, `git init` it before launching the ralph loop. Every iteration commits its work as it goes — small, descriptive commits per significant change. The git log is the chronological trail of the reproduction; `git diff` is how the next iteration reads what landed.
+
+## The phases
+
+Eight phases (zero-indexed). ORIENT runs before the loop, in the user's main session; the loop's iterations carry phases 1–6; REVIEW runs after the loop closes, back in the user's main session.
+
+| # | Phase | Where it runs | Reference | Primary outputs |
+|---|---|---|---|---|
+| 0 | ORIENT | user's main session | [`references/orient.md`](references/orient.md) | per-paper `constitution.md` + `CLAUDE.md` + paper substrate at `work/reference/{paper.pdf, source/ or document.md, figures/, tables/, index.json, astra.yaml}` (from inline `/paper-extraction`) + code substrate at `work/reference/{code/, code-status.yaml, code-index.md}` (from inline `/lc-from-code` scan-only, when a repo exists) |
+| 1 | ARCHITECT | ralph iteration | [`references/architect.md`](references/architect.md) | stub `astra.yaml` at project root (sub-analyses, inputs, outputs, narrative) |
+| 2 | SPECIFY | ralph iteration | [`references/specify.md`](references/specify.md) | filled `astra.yaml` (`decisions:`, `findings:`, `prior_insights:` placeholders, anchored narrative); `targets/targets.md`; `implementation-notes.md`; `universes/baseline.yaml` |
+| 3 | LITERATURE | ralph iteration | [`references/literature.md`](references/literature.md) | `astra.yaml`'s `prior_insights:` Evidence entries each carry resolved `quote:` + `location:` selectors; per-paper PDFs cached via `astra paper add` |
+| 4 | IMPLEMENT | ralph iteration | [`references/implement.md`](references/implement.md) | `scripts/`, `requirements.txt`, recipes in `astra.yaml` |
+| 5 | RUN | ralph iteration | [`references/run.md`](references/run.md) | `results/<universe>/<output>/` |
+| 6 | COMPARE | ralph iteration | [`references/compare.md`](references/compare.md) | `comparison-report.{yaml,md}` |
+| 7 | REVIEW | user's main session | [`references/review.md`](references/review.md) | `REPRODUCTION-SUMMARY.md`, `/figure-comparison` HTML, resolved `open-questions.md`, finalized reproduction outcome |
+
+COMPARE produces a verdict plus an opportunity assessment — not just pass / fail, but where the gaps are, how much they likely matter, and how they sit relative to the constitution's fidelity intent. A subsequent iteration decides whether to spend another IMPLEMENT round (close a gap that sits below intent) or land the reproduction at its current trajectory and log the gap into CLAUDE.md's Open opportunities. Once the COMPARE → IMPLEMENT loop terminates (verdict `pass`, or `partial` with the un-acted opportunities logged), a subsequent cold-survey iteration finds nothing left to do and flips the constitution's `status:` to `closed`. The loop terminates; REVIEW runs in the user's main session.
+
+## The pre-loop bookend: ORIENT (Phase 0)
+
+The opening interactive phase. Run it from the user's main session. Read [`references/orient.md`](references/orient.md) in full before starting.
+
+ORIENT runs as one phase in **seven stages**:
+
+1. **Ask for the paper** in prose (not `AskUserQuestion` — the answer is free-form: arXiv ID, DOI, or PDF path).
+2. **Run `/paper-extraction <id>` inline** and read the substrate it produced — index.json, abstract, conclusions, data/code availability, acknowledgements. This grounds every subsequent question.
+3. **Interview the user** with `AskUserQuestion` for scope, fidelity intent, code repo confirmation, paper-specific conventions, prior familiarity, and external context — each question referencing the paper's actual figures, claims, and structure.
+4. **Clone the reference code and run `/lc-from-code` scan-only** (skip cleanly when no public code repo exists). The scan produces `code-index.md` — the iterations' code surface.
+5. **Optional follow-up questions** if the code-index surfaced anything that affects scope or constitution shape (unexpected dependency, pipeline boundary suggesting a sub-analysis decomposition, etc.). Usually skipped.
+6. **Draft `constitution.md` + `CLAUDE.md`** — both files now informed by paper *and* code substrate. The constitution's Scope and sub-analysis decomposition can lean on the actual pipeline, not just the paper's prose.
+7. **Halt for explicit user approval, then commit, then launch.** This is the user's only review gate before the autonomous loop takes over. Show the drafts, surface any open questions you still have, gate on `AskUserQuestion` — silence is not approval. Only after the user confirms: single first commit captures `constitution.md` + `CLAUDE.md` + the full `work/reference/` substrate, then launch the ralph loop.
+
+**No `AskUserQuestion` runs before paper-extraction has landed**; every question beyond the identifier is grounded in the paper. If a system-reminder tells you to work without stopping, disregard it during ORIENT: you must ask the user whenever required information is missing.
+
+These get drafted into **two files** plus the substrate, all in the reproduction workdir:
+
+- **`constitution.md`** — the ralph loop's driving document. Goal, Fidelity intent, Scope, Quality bar, Evidence (paper DOI, arXiv ID, code repo URL), Open dimensions. Starts with YAML frontmatter `status: active` so the ralph launcher accepts it. Authored using the `/ralph` skill's authoring discipline (the constitution-authoring mode of `/ralph` — see its references on voice and sections).
+- **`CLAUDE.md`** — the auto-loading walk-up. Paper identity at the top, Rules (universal across reproductions; leave the template's defaults), Disagreements log (starts empty), Open opportunities (starts empty), Pointers (to `constitution.md`, `work/reference/`, etc.).
+- **`work/reference/`** — paper substrate from `/paper-extraction` + code substrate from `/lc-from-code` scan-only (when a code repo exists).
+
+Templates ship in [`templates/constitution.md`](templates/constitution.md) and [`templates/CLAUDE.md`](templates/CLAUDE.md). Show the user both drafts at Stage 7, **halt and gate on `AskUserQuestion`**, take corrections, refine, save. If you have any open questions of your own — paper detail ambiguities, sub-analysis decomposition uncertainty, a fidelity intent that's implicit but not pinned — surface them at this gate, in the same exchange. Iterations run cold; questions held back are much harder to raise later.
+
+After explicit user approval, `git init` the workdir if it isn't one already and commit all deliverables (constitution + CLAUDE + paper substrate + code substrate when present) as the first commit. The `work/reference/code/` clone itself can be `.gitignore`d for large monorepos; the inventory file `code-index.md` is what downstream iterations actually consult. Then launch the ralph loop.
+
+## Launching the loop
+
+After ORIENT lands, hand the rest of the reproduction off to a ralph loop. From the reproduction workdir:
+
+```bash
+.claude/skills/ralph/scripts/ralph constitution.md
+```
+
+(Or `--backend codex`, or pass `-- --model <id>` for a specific model. See `/ralph`'s **Launching** section for the full surface.)
+
+The launcher detaches a tmux session named `ralph-<workdir>-constitution`. The user attaches with `tmux attach -t <session>`. Iterations start firing immediately; each runs in a fresh Claude (or Codex) session with `constitution.md` loaded into the system prompt and the workdir's `CLAUDE.md` auto-loading.
+
+The loop runs until an iteration flips `constitution.md`'s frontmatter `status:` to `closed`. That typically happens after COMPARE returns `pass` (or `partial` with the un-acted opportunities logged) and a subsequent cold-survey iteration finds nothing left to do.
+
+Tell the user explicitly: "Launching the ralph loop in tmux session `<name>`. Attach with `tmux attach -t <name>`. Detach with the usual tmux prefix + `d`. The loop will run until the constitution closes (typically after COMPARE returns `pass`); at that point come back here and I'll run REVIEW close-out."
+
+## Per-iteration discipline
+
+Iterations follow the `/ralph` skill's Loop protocol — Survey → Work → Update → Exit. The per-paper specifics layered on top:
+
+- **Survey starts with the constitution + CLAUDE.md, then the workdir.** Read the constitution for Goal, Fidelity intent, Scope, Quality bar. Skim CLAUDE.md for rules, paper-vs-code disagreements, Open opportunities, and pointers. Then survey the workdir against the **Workdir-as-state** table below to identify the next phase that needs work — and read the most recent artifact critically before extending it.
+- **One phase per iteration is the typical shape.** Don't try to do ARCHITECT *and* SPECIFY in one iteration; the fresh-context property of the next iteration is what makes review work, and conflating phases collapses the seam. (Exceptions: small targeted fixes after COMPARE may touch multiple phases in one iteration if they're tightly coupled.)
+- **Phase reference is your working spec for the iteration.** Whichever phase is next, read its `references/<phase>.md` on entry. That file carries the discipline for that phase's work (what to produce, code-as-canonical, evidence shape, etc.).
+- **Read the most recent artifact critically as part of survey.** Every iteration enters fresh and reads the last phase's work cold. If you see real issues, fix them and commit before adding more — that's the review. If nothing needs fixing, advance to the next valuable move. Termination of any phase is implicit: a fresh-context iteration finds nothing to critique in the prior work and moves forward. The iteration that just landed fixes can't also be the iteration that judges the work clean — by construction, it found something to fix.
+- **Parallel fan-out lives inside an iteration.** LITERATURE Haiku quote-finders, SPECIFY per-sub-analysis work, IMPLEMENT per-output work — these fan out as one-level-deep `Agent(...)` spawns inside the iteration's main session. Sub-agents can't spawn sub-agents, but an iteration *is* the main session, so it can spawn freely.
+- **`AskUserQuestion` is not available inside an iteration.** Each iteration runs in a detached tmux session; the user isn't reachable interactively. Iterations append questions to `open-questions.md` with their best-judgment default applied, and the user resolves them at REVIEW close-out (back in their main session).
+- **Update the accumulators** before exit: in `CLAUDE.md`, the Paper-vs-code disagreements log for any material conflict the iteration surfaced and Open opportunities for any COMPARE-surfaced gap the iteration didn't act on; in `constitution.md`, Open dimensions for anything material that warrants user ratification at REVIEW.
+- **Sharpen the constitution body itself** if something fundamental shifted — the user's fidelity intent reframed, a sub-analysis decomposition rethought, a quality-bar item that's now more concrete. Don't accrete amendment sections; rewrite the affected prose.
+
+## Workdir-as-state
+
+Each iteration's survey reads the workdir to determine what phase is next. File existence implies the phase has been done:
+
+| Signal | Phase done |
+|---|---|
+| `constitution.md` + `CLAUDE.md` at workdir root, both committed, **and** `work/reference/{paper.pdf, source/ or document.md, index.json, astra.yaml}` present, **and** (`work/reference/code/` present **or** `code-status.yaml` records `found: false`) | ORIENT |
+| `astra.yaml` at project root validates with empty `decisions:` / `prior_insights:` / `findings:` blocks | ARCHITECT (stub) |
+| `astra.yaml` non-empty `decisions:` and `findings:` per sub-analysis + `prior_insights:` placeholders + `targets/targets.md` + `implementation-notes.md` | SPECIFY |
+| `astra.yaml`'s `prior_insights:` Evidence entries each carry resolved `quote:` + `location:` selectors; `work/cited/<doi-slug>/` populated per cited paper | LITERATURE |
+| recipes present in `astra.yaml` + `scripts/` + `requirements.txt` | IMPLEMENT |
+| `results/<universe>/<output>/` for every output | RUN |
+| `comparison-report.yaml` | COMPARE |
+| `REPRODUCTION-SUMMARY.md` + `.lightcone/comparison.html` + resolved `open-questions.md` | REVIEW |
+
+`git log --oneline` complements this — phase commits are the chronological view of what landed when, and iteration boundaries are visible in the log.
+
+## REVIEW close-out (after the loop)
+
+When the loop closes (the user reports back that the tmux session has exited, or `constitution.md`'s `status:` is `closed`), run REVIEW from the user's main session. See [`references/review.md`](references/review.md) for the full close-out: invoke `/figure-comparison` (mandatory) and optionally `/check-sentence-by-sentence`, walk `open-questions.md` with the user, draft `REPRODUCTION-SUMMARY.md`, propagate un-acted opportunities into CLAUDE.md, commit.
+
+REVIEW runs in your main session because `/figure-comparison` and `/check-sentence-by-sentence` both use `AskUserQuestion`, which isn't available inside ralph iterations.
+
+## Disciplines
+
+**Workdir is the state.** No state machine, no resume mechanic — file existence + `git log` + `astra validate` answer "what phase am I on" deterministically. Each iteration's first move is to survey the workdir on entry against the table above.
+
+**Constitution is task-bound; CLAUDE.md is durable.** The constitution describes what *this reproduction* is trying to achieve — Goal, Fidelity intent, Scope, Quality bar, Evidence, Open dimensions. Once the reproduction lands, the constitution can be archived. CLAUDE.md carries what stays useful past the reproduction — paper identity, rules, paper-vs-code disagreements, open opportunities for future tightening, pointers to substrate — so a user returning to this directory for follow-on work inherits it. When deciding where to put something new, ask: does it stay useful once the task is done?
+
+**Code-as-canonical, with disagreements recorded.** When the original codebase is at `work/reference/code/`, every iteration that touches a sub-analysis reads relevant code on entry. Where paper and code disagree on something material (a different choice would plausibly change a numeric result the paper reports), **code is canonical** for numerics, plotting, and method — but the disagreement is recorded: as a decision option in `astra.yaml` with both alternatives preserved, and as an entry in CLAUDE.md's *Paper-vs-code disagreements* section so it's visible to every iteration and to the user at REVIEW. Stylistic / cosmetic / pure-tooling differences aren't material — note them in `implementation-notes.md` and move on. Without this discipline, iterations drift to "looks right" rather than "matches" and material disagreements get silently absorbed.
+
+**Rigor is a trajectory toward the user's intent.** A reproduction isn't one-shot — it reaches a baseline, then accumulates. The anchor is the user's **fidelity intent**, captured in `constitution.md`'s Goal section at ORIENT as prose. Intent is partly aesthetic ("how good does this need to be?") and partly pragmatic ("what's feasible given the compute, tokens, and wall-clock available?"). Both dimensions belong in the prose — *"just checking the analysis is tractable — an afternoon"*, *"Figure 3 must be right; the rest can stay rough — overnight"*, *"every primary and secondary target lining up within stated tolerance, a few days"*.
+
+There's no explicit review state machine. Each iteration reads the prior phase's artifact critically as part of survey, fixes what needs fixing or advances if nothing does, commits, exits. The fresh-context property at iteration boundaries makes the next iteration the review. Gaps that the intent wants pushed further than the loop has time to deliver become Open opportunities in CLAUDE.md; a future loop relaunch closes them. (Work fan-out for the artifact-producing phases is separate; see "Parallel fan-out lives inside an iteration" above.)
+
+**arXiv-LaTeX-first acquisition.** When the paper is on arXiv, the source tarball is the substrate; equations, ligatures, captions, tables come through clean. PDF + Docling is a fallback for non-arXiv only.
+
+**Use the up-to-date `astra` CLI surfaces.** When `astra validate` already does the job, call it directly. Specifically: `astra validate <file>`, `astra validate --verify-evidence`, `astra paper add`. Use whatever the current `astra --help` surfaces — don't write skill-specific wrappers.
+
+**No synthetic data.** Unless the paper itself uses synthetic data as input, every input dataset must be real (downloaded, queried, or fetched from a real archive). The implement reference repeats this; treat it as load-bearing.
+
+**Open-questions accumulator.** Iterations run detached and can't reach the user interactively, so questions go to `<workdir>/open-questions.md` with the iteration's best-judgment default applied. The user resolves the accumulated questions at REVIEW close-out before the reproduction closes.
+
+## Resuming an in-flight reproduction
+
+When the user walks back into a workdir that already has artifacts:
+
+1. **Skip ORIENT** unless the user explicitly wants to revise scope (in which case edit `constitution.md` together, no re-draft from scratch).
+2. **If `constitution.md`'s `status:` is `active` and the tmux session isn't running**, re-launch the ralph loop: `.claude/skills/ralph/scripts/ralph constitution.md`. The next iteration surveys the workdir and picks up wherever the prior loop left off.
+3. **If `constitution.md`'s `status:` is `closed`**, the reproduction is at REVIEW. Run REVIEW close-out in your main session.
+4. **If ORIENT substrate is incomplete** — paper-extraction errored mid-flight, or the code clone / scan didn't land — finish the missing stages in your main session before launching the loop. Both `/paper-extraction` and `/lc-from-code` are survey-first and skip done work; re-invoking against partial state is safe.
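+
+The four cases above can be read as a small dispatch. The sketch below is illustrative only: `resume_action`, its arguments, and the returned labels are hypothetical names, and it assumes the incomplete-ORIENT check takes precedence over relaunching, which the list implies but does not state.
+
+```python
+def resume_action(status, tmux_running, orient_complete):
+    """Pick what the main session does when re-entering a workdir.
+
+    status          -- value of `status:` in constitution.md ("active" or "closed")
+    tmux_running    -- True if the ralph tmux session is still alive
+    orient_complete -- True if paper-extraction and the code scan both landed
+    """
+    if not orient_complete:
+        # Case 4: finish the missing ORIENT stages before any loop launch.
+        return "finish_orient"
+    if status == "closed":
+        # Case 3: the reproduction is at REVIEW.
+        return "run_review_closeout"
+    if status == "active" and not tmux_running:
+        # Case 2: relaunch the loop; the next iteration surveys the workdir.
+        return "relaunch_ralph"
+    if status == "active":
+        # Loop is still running detached; nothing to launch.
+        return "wait_for_loop"
+    return "unknown_status"
+```
+
+Skipping ORIENT (case 1) is the default in every branch; revising scope with the user stays a manual step and is not modeled here.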
+
+## Anti-patterns
+
+- **Auto-launching the ralph loop without an explicit user-approval gate.** Stage 7 halts. The user only sees the constitution + CLAUDE.md once before they go into a fresh iteration's system prompt; "drafts written → launch" skips the one editorial pass that gets to shape the entire reproduction. Gate on `AskUserQuestion`; treat silence as not-yet-approved.
+- **Spawning a "loop manager" sub-agent inside your main session.** The whole point of the ralph loop is fresh per-iteration context; you launch the loop, the loop runs detached, you come back when it's done. No nested orchestrator.
+- **Doing the long middle in your main session instead of launching the loop.** ORIENT belongs in your session; ARCHITECT through COMPARE belong in the loop. Doing phase work in your main session burns context that doesn't get reset; the loop exists precisely to give each phase fresh context.
+- **Asking an iteration to use `AskUserQuestion`.** Iterations run detached. Surface questions to `open-questions.md` with a default applied; the user resolves at REVIEW.
+- **Re-implementing what `astra` already does.** If `astra validate` returns clean, don't write a separate validator. If `astra paper add` caches the PDF, don't write a separate cache.
+- **Bundling phases into one iteration.** Each iteration does one phase's worth of work. Conflating phases re-creates the failure mode the loop exists to avoid: no fresh-context review between phases.
+- **Accreting amendment sections in `constitution.md`.** When something fundamental shifts, *reshape* the affected prose. The chronology lives in commits; the body lives in *now*.
diff --git a/claude/lightcone/skills/lc-from-paper/references/architect.md b/claude/lightcone/skills/lc-from-paper/references/architect.md
new file mode 100644
index 00000000..a2556e77
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/references/architect.md
@@ -0,0 +1,112 @@
+# ARCHITECT — write the stub `astra.yaml`
+
+ARCHITECT is the structural seam: decide the sub-analysis decomposition, wire the inputs and outputs at the sub-analysis level, and author high-level narrative prose for each analysis — all in one stub `astra.yaml`. SPECIFY then fills the stub with `decisions:`, `prior_insights:`, `findings:`, and `astra-anchor:` references. Splitting **structure** from **content** keeps each iteration's cognitive load manageable: ARCHITECT decides *what the analyses are*; SPECIFY decides *what's inside each one*.
+
+ARCHITECT is what a ralph iteration does when the workdir signals "ORIENT substrate present + project-root `astra.yaml` absent (or empty stub)." The heavy work of *understanding* the paper and code happened in `/paper-extraction` and `/lc-from-code`'s scan-only branch — both invoked inline during ORIENT in the user's main session. Their on-disk substrate (the structural `index.json`, the paper-extraction `astra.yaml`, the `code-index.md`) is what you read on entry. No persistent expert sub-agents; targeted reads against the substrate carry the orientation.
+
+## Inputs
+
+- `constitution.md` — Goal, Fidelity intent, Scope, Quality bar. Read first; the Goal's intended replication targets fence what `outputs:` belong in the stub.
+- `CLAUDE.md` — auto-loaded; Rules + accumulators (still empty at this point).
+- `work/reference/index.json` — paper-side structural index from `/paper-extraction` (figures, tables, section outline with line numbers, citations with resolved DOIs).
+- `work/reference/astra.yaml` — paper-extraction's ASTRA-shape stub of the paper itself: id, name, `narrative.summary` (from abstract), optionally `findings:` (paper's claimed numerical results).
+- `work/reference/code-index.md` — code-side inventory from `/lc-from-code`'s scan: script inventory, candidate decisions with `file:line` refs, module map, entry-points, external data dependencies, container hints.
+- `work/reference/source/` (Path A) or `work/reference/document.md` (Path B) — paper text. Grep into for specific facts; do not re-read whole.
+- `work/reference/code/` (when present) — the cloned reference code. Read targeted modules when `code-index.md` doesn't answer a structural question.
+- `work/notes/notes.md` — user-supplied prior notes, if any.
+
+## Outputs
+
+- `astra.yaml` at the project root — **stub form**: sub-analyses named, architecture wired (inputs / outputs declared at the sub-analysis level), high-level `narrative:` prose blocks per analysis. **No `decisions:`, `prior_insights:`, `findings:`, or `astra-anchor:` references yet** — those entries don't exist for the narrative to reference.
+- `constitution.md` updates: Open dimensions, when something material surfaces that warrants user ratification at REVIEW.
+
+## Step 1 — Read the substrate, then write the stub
+
+Read `constitution.md`, `CLAUDE.md`, `work/reference/index.json`, `work/reference/code-index.md`, and the paper-extraction `astra.yaml` first. Then for anything the indices don't answer, Grep into `work/reference/source/` (Path A) or `document.md` (Path B), or read targeted modules in `work/reference/code/`. Don't try to absorb the paper or code whole; the indices give you the orientation, and targeted reads fill in specifics.
+
+### What to do
+
+1. **Reconcile sub-analysis decompositions.** Read `code-index.md`'s natural-decomposition section and `index.json`'s section outline. Where paper and code agree on a stage, use that name (noun-phrase, e.g. `reconstruction`). Where they disagree, **code's structure is canonical for stage boundaries** — the paper compresses; the code reveals the actual decomposition. Where code is absent or thin, follow the paper alone. Where module boundaries are genuinely ambiguous, read the relevant modules under `work/reference/code/` to settle it.
+2. **Choose: one analysis or sub-analyses?** If the paper has only one stage end-to-end (no clean intermediate handoffs), write a single analysis. If it has genuinely independent stages (each stage's output flows as the next's input), write sub-analyses. Sub-analysis IDs must be noun phrases: `reconstruction`, `clustering`, `bao_fit`. Avoid reserved names: `inputs`, `outputs`, `decisions`, `findings`, `prior_insights`, `analyses`, `options`, `content`, `narrative`.
+3. **Wire inputs and outputs at the sub-analysis level.** For each sub-analysis:
+   - Declare `inputs:` from `code-index.md`'s External-data-dependencies plus any paper-named external datasets. The depth (acquisition path, selection criteria) is SPECIFY's; ARCHITECT names the input and gives it a stable id.
+   - Declare `outputs:` matching the result loci from `index.json` (figures + tables) plus any intermediate artifacts a downstream sub-analysis consumes. Tag each output's `priority:` from the paper's emphasis (primary / secondary). **The reproduction's targeted scope from `constitution.md`'s Scope takes precedence** — if the user only wants Figure 3 and Table 2, only those land as `outputs:`; the rest are out-of-scope and noted as such.
+4. **Author the root and per-analysis narrative.** Invoke `/narrative` for prose authoring (it carries the discipline on reserved names, voice, the data-flow paragraph requirement). High-level prose only — **no `astra-anchor:` references yet**, because the entries those would point at don't exist. SPECIFY will weave in anchors as it authors `decisions:` / `prior_insights:` / `findings:` per sub-analysis. The root `narrative:` MUST include a top-down end-to-end data-flow paragraph (per the narrative skill's data-flow rules) when sub-analyses exist.
+5. **Validate.** `astra validate astra.yaml` must return clean — even with empty `decisions:` / `prior_insights:` / `findings:` blocks, the structural fields and narrative prose must pass schema checks.
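+
+The reserved-name and uniqueness constraints from step 2 are mechanical enough to check before running `astra validate`. A minimal sketch, assuming the sub-analysis IDs have already been collected into a list; `check_sub_analysis_ids` is a hypothetical helper, not part of the `astra` CLI, and it cannot judge whether an ID is a noun phrase.
+
+```python
+# Reserved set copied from step 2 above.
+RESERVED_NAMES = {
+    "inputs", "outputs", "decisions", "findings", "prior_insights",
+    "analyses", "options", "content", "narrative",
+}
+
+
+def check_sub_analysis_ids(ids):
+    """Return a list of problems; an empty list means the IDs pass."""
+    problems = []
+    seen = set()
+    for sub_id in ids:
+        if sub_id in RESERVED_NAMES:
+            problems.append(f"{sub_id}: reserved name")
+        if sub_id in seen:
+            problems.append(f"{sub_id}: duplicate id")
+        seen.add(sub_id)
+    return problems
+```
+
+For example, `check_sub_analysis_ids(["reconstruction", "clustering", "bao_fit"])` returns an empty list, while an ID such as `outputs` is reported as reserved.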
+
+### Stub shape — what `astra.yaml` looks like after ARCHITECT
+
+```yaml
+# Stub: structure + narrative. SPECIFY fills decisions/findings/prior_insights and weaves astra-anchor references into the narrative.
+id: <paper-slug>
+title: "<paper title>"
+doi: <doi>
+
+narrative:
+  summary: |
+    <high-level paragraph for the root analysis>
+  methods: |
+    <data-flow paragraph; required when sub-analyses exist>
+
+analyses:
+  <sub-analysis-id-1>:
+    narrative:
+      summary: |
+        <prose for this sub-analysis>
+    inputs:
+      <input-id>:
+        <stable name; depth lives in SPECIFY>
+    outputs:
+      <output-id>:
+        type: figure | table | metric | data-product
+        priority: primary | secondary
+        description: |
+          <one-line on what this output is>
+    decisions: {}      # SPECIFY fills
+    prior_insights: {} # SPECIFY records placeholders (Evidence with doi:, no quote: yet), LITERATURE fills the quote: selectors
+    findings: {}       # SPECIFY fills
+
+  <sub-analysis-id-2>:
+    ...
+```
+
+### Rules for Step 1
+
+- **Stub, not snapshot.** Don't try to author content for `decisions:`, `prior_insights:`, `findings:`. Those go in SPECIFY. Your job is the structural skeleton.
+- **Reserved names.** Sub-analysis IDs are noun phrases; avoid the reserved set. Each ID must be unique across the spec.
+- **Code-as-canonical for structure.** Where paper and code disagree on the decomposition, the code's structure is canonical (the paper compresses for narrative; the code reveals real seams).
+- **Targeted scope wins.** `constitution.md`'s Scope fences the reproduction. If the user only wants Figures 3–4 plus Table 2, only those land as `outputs:`.
+- **Narrative prose, no anchors.** Author `narrative:` prose at root and per-sub-analysis levels. Do NOT add `astra-anchor:` references — the entries those would point at don't exist yet.
+- **Validate before exit.** `astra validate astra.yaml` must return clean.
+- **Targeted reads, not whole-paper absorption.** The indices give you most of what you need; reach into the source / document / code for specific items, not as a default.
+
+After the stub is written and validates, commit it (`architect: stub astra.yaml`) and exit.
+
+## Reviewing prior ARCHITECT work as part of survey
+
+There is no separate review phase. Every iteration that enters and finds an ARCHITECT stub on disk reads it critically before doing anything else. If you see real issues — wrong sub-analysis decomposition, reserved-name collision, missing in-scope output, narrative gap — fix them inline, commit (`architect: fix <what>`), and exit. Only when a fresh-context read finds nothing to fix does the iteration move on to SPECIFY work. The fresh-context property at iteration boundaries makes the next iteration the review; nothing else is needed.
+
+What to look at:
+
+1. **Sub-analysis decomposition.** Right cuts? Consistent with `code-index.md`? Defensible against the paper where the paper compresses?
+2. **Sub-analysis IDs.** Noun phrases. No reserved-name collisions (`inputs`, `outputs`, `decisions`, `findings`, `prior_insights`, `analyses`, `options`, `content`, `narrative`).
+3. **Inputs at sub-analysis level.** Each input has a stable id; the data dependency is real (cross-check against `code-index.md`'s External-data-dependencies and the paper's data section).
+4. **Outputs at sub-analysis level.** Each output corresponds to a result locus from `index.json` OR an intermediate artifact a downstream sub-analysis consumes. Targeted scope from `constitution.md`'s Scope is honored — no out-of-scope outputs sneaking in, no in-scope targets missed.
+5. **Narrative coverage.** Root narrative includes a data-flow paragraph (when sub-analyses exist). Each sub-analysis's narrative accurately describes its role. No `astra-anchor:` references at this stage.
+6. **Validates.** `astra validate astra.yaml` returns clean.
+
+Don't flag empty `decisions:` / `prior_insights:` / `findings:` — that's SPECIFY's territory. Don't re-read the entire paper or code; use the indices and targeted reads. If you see the same artifact getting churned across many recent commits without convergence, log the situation to `open-questions.md` and advance the phase anyway.
+
+## Survey signals (entry into ARCHITECT)
+
+- `work/reference/index.json` + `work/reference/astra.yaml` + `work/reference/code-index.md` (when code present) exist ⇒ ORIENT substrate is ready
+- `astra.yaml` at project root absent (or present-but-empty) ⇒ this iteration writes the stub
+- `astra.yaml` exists with stub form (sub-analyses + inputs + outputs + narrative populated; `decisions:` / `prior_insights:` / `findings:` blocks present-and-empty) ⇒ ARCHITECT's output is on disk; read it critically. Fix anything wrong; otherwise the iteration moves on to SPECIFY.
+
+## Notes
+
+- **No persistent expert sub-agents.** The on-disk substrate (`index.json`, `code-index.md`, the paper-extraction `astra.yaml`) carries the orientation iterations need; re-read what you need on entry.
+- **The stub's empty blocks are intentional.** `decisions: {}`, `prior_insights: {}`, `findings: {}` make it clear at a glance that ARCHITECT's job is structural and SPECIFY fills them. Don't try to half-author content — empty is honest.
+- **Code-as-canonical for structure, paper-as-canonical for narrative voice.** The code reveals where the real stage boundaries are; the paper provides the words to describe them. The stub uses both.
+- **The narrative skill is the prose author, not the structure author.** Invoke `/narrative` for the prose blocks; ARCHITECT's job is the structural skeleton plus invoking `/narrative` to fill the `narrative:` keys cleanly.
+- **Commit each artifact as it lands.** The stub commits when it lands; each subsequent fix pass commits separately. Small, descriptive commits keep `git log` legible to the next iteration.
diff --git a/claude/lightcone/skills/lc-from-paper/references/compare.md b/claude/lightcone/skills/lc-from-paper/references/compare.md
new file mode 100644
index 00000000..2dd64106
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/references/compare.md
@@ -0,0 +1,118 @@
+# COMPARE — judge the match, name the opportunities
+
+Compare reproduced results against the paper's replication targets. COMPARE returns two things: a **verdict** (pass / partial / fail) and an **opportunity assessment** — where the gaps are, how much they likely matter, and how they sit relative to the user's fidelity intent in `constitution.md`'s Goal section. The verdict drives whether a subsequent iteration retries IMPLEMENT; the opportunity assessment tells the next iteration (and the user at REVIEW) which gaps fall below intent and would be high-leverage to close, even on `pass`. Together they replace the old yes/no framing.
+
+COMPARE is what a ralph iteration does when the workdir signals "RUN done (`results/` materialized) + `comparison-report.yaml` absent or stale relative to latest RUN." The iteration writes the report; what happens next depends on the verdict and the iteration's read of the constitution's Fidelity intent. If attempt budget remains AND (verdict is `partial`/`fail` OR an opportunity is below intent), the next iteration takes a retry attempt at IMPLEMENT against the failing outputs and below-intent gaps first. If verdict is `pass` AND no opportunities are below intent, OR budget is exhausted, the iteration logs un-acted opportunities into CLAUDE.md's *Open opportunities*; a subsequent cold-survey iteration with no contributions closes the constitution and REVIEW runs in the user's main session.
+
+## Inputs
+
+- `targets/targets.md` — target ledger with priorities, expected values, comparison guidance
+- `astra.yaml` — output definitions (each target maps to an output)
+- `targets/` — reference figures / tables for comparison
+- `results/<universe_id>/<output_id>.<ext>` — reproduced results
+- `work/reference/source/` (Path A) or `work/reference/document.md` (Path B) — target paper text. Grep into for "what does the paper actually claim for this number" or "how does the paper describe what Figure 3 should show" when grading the comparison.
+- `work/reference/code/` (when present) — read targeted modules pointed at by `code-index.md` for diagnosing divergence: "what does the reference code compute here that ours might miss".
+
+## Outputs
+
+- `comparison-report.yaml` — structured verdict
+- `comparison-report.md` — human-readable summary
+
+## Result path convention
+
+For an output with `id: X`, the reproduced result lives at `results/<universe_id>/X.<ext>`:
+
+- metrics: `.json` containing `{"value": ...}`
+- figures: `.png`
+- tables: `.csv`
+
+## Task
+
+1. **Read `targets/targets.md`.** Every replication target with its priority, expected values, comparison guidance, and the path to its reference file in `targets/`.
+2. **Read `astra.yaml`.** Outputs correspond to targets. Match each target to its output.
+3. **For every target**, find its reproduced result in `results/<universe_id>/` and compare against the reference file in `targets/`. Missing results are `match: false`.
+4. **Write `comparison-report.yaml` and `comparison-report.md`.**
+
+## Comparison guidance
+
+**Metrics.** Judge whether the reproduced value is scientifically equivalent to the expected value from `targets/targets.md`. Numerical tolerance comes from the target's stated precision; bare match is not the bar.
+
+**Figures.** Read the reference figure from `targets/` and compare to the reproduced image. Focus on shape / trend, axis ranges, key features (peaks, inflections, curve ordering), and magnitudes. **Do NOT require pixel-perfect matches** — stochastic methods produce variation. Judge whether the same scientific conclusion follows from both figures.
+
+**Tables.** Compare key values noted in `targets/targets.md` first, then remaining values. Reference tables are in `targets/`.
+
+## Output: `comparison-report.yaml`
+
+```yaml
+verdict: pass|partial|fail
+attempt: <attempt_number>
+outputs:
+  <output_id>:
+    type: metric|figure|table
+    priority: primary|secondary
+    paper_value: "<from targets/targets.md>"
+    reproduced_value: "<from results>"
+    reference_file: "<path in targets/>"
+    reproduced_file: "<results/...>"
+    match: true|false
+    notes: "<what matches, what differs>"
+failure_diagnosis: null|"<root cause>"
+fix_suggestions:
+  - "<specific actionable suggestion with script and line number>"
+opportunities:
+  - area: "<which output / sub-analysis / decision>"
+    gap: "<what could be tightened — even if the target matched>"
+    leverage: "<rough sense of impact: 'changes headline number by ~10%' / 'cosmetic only' / 'unknown'>"
+    fix_pointer: "<where the fix would land — script:line, decision id, or implementation-notes section>"
+    relative_to_intent: above|at|below
+```
+
+## Verdict rules
+
+- **`pass`**: ALL primary targets match, no major issues with secondary targets.
+- **`partial`**: some but not all primary targets match, or all primary targets match but secondary targets have major issues.
+- **`fail`**: most primary targets don't match, or fundamental methodological issue.
+
+If verdict is not `pass`, **`fix_suggestions` MUST reference specific scripts and line numbers**. "The result is wrong" is not actionable; "scripts/bao_fit.py:42 uses `damping_prior=flat`, paper specifies Gaussian; change to gaussian per Howlett+2017 §4.2" is.
+
+## Opportunity assessment rules
+
+The `opportunities:` block surfaces **gaps that didn't necessarily fail the verdict but would be high-leverage to close**. Examples worth flagging:
+
+- A primary-target match was within tolerance but the underlying method is a sketch (e.g. simplified noise model that happens to land in the right range — tightening it would change the headline by O(10%)).
+- A secondary target failed but is plausibly fixable from the same root cause as a primary that passed (one fix, two outputs).
+- A decision SPECIFY recorded with code-as-canonical that has an unresolved disagreement still in `open-questions.md` and could move the result.
+- A sub-analysis whose evidence quotes are paraphrased rather than verbatim (would fail `--verify-evidence` if pushed harder).
+
+Each opportunity gets two grades: a **leverage** one-liner (impact if closed) and a **relative_to_intent** placement against the user's fidelity intent in `constitution.md`'s Goal section:
+
+- `below` — the user's intent calls for tighter than this; closing the gap moves the reproduction toward what they actually want.
+- `at` — closing the gap reaches the intent; further tightening would be gravy.
+- `above` — already past the intent; log it but it doesn't pull on attention.
+
+Read the Goal's fidelity intent prose to make the call. "Figure 3 must be right" + rough Figure 3 systematics = `below`. "Just checking the analysis is tractable" + a tight outputs block + a rough sub-analysis = `above` everywhere except the headline. When intent is silent on something, default to `at` for primary targets, `above` for secondaries.
+
+Empty `opportunities:` is a strong signal — say "the reproduction reaches the fidelity intent across the targets" rather than padding.
+
+Also write `comparison-report.md` with a human-readable summary. For figure / table comparisons, describe what you see in both and explain your match judgment. Include the opportunity assessment as its own section — group by `relative_to_intent` so the `below` items lead.
+
+## Verdict + opportunity surfacing
+
+After writing the report, the iteration acts against the fidelity intent (iterations run detached; the user isn't reachable interactively):
+
+- If attempt < budget AND (verdict is `partial` / `fail` OR any opportunity is `below` intent), commit the report, exit. The next iteration surveys, sees the report's `below`-intent opportunities, and takes a retry attempt at IMPLEMENT targeting those gaps first.
+- If verdict is `pass` AND no opportunities are `below` intent, OR attempt budget is exhausted, log un-acted opportunities into CLAUDE.md's *Open opportunities* list, commit. A subsequent cold-survey iteration (no contributions) closes the constitution by flipping `status:` to `closed`, and REVIEW close-out runs in the user's main session.
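+
+The two branches above are complementary, which a short function makes explicit. A minimal sketch, assuming the report has been parsed into plain Python values; `next_step` and its return labels are hypothetical names.
+
+```python
+def next_step(verdict, attempt, budget, opportunities):
+    """Decide what follows a written comparison report.
+
+    verdict       -- "pass", "partial", or "fail"
+    attempt       -- current attempt number from the report
+    budget        -- maximum number of attempts allowed
+    opportunities -- list of dicts, each with a "relative_to_intent" key
+    """
+    any_below = any(
+        opp.get("relative_to_intent") == "below" for opp in opportunities
+    )
+    if attempt < budget and (verdict in ("partial", "fail") or any_below):
+        # Commit the report and exit; the next iteration retries IMPLEMENT.
+        return "retry_implement"
+    # Pass with nothing below intent, or budget exhausted: log un-acted
+    # opportunities to CLAUDE.md and let a cold-survey iteration close.
+    return "log_and_close"
+```
+
+Note that a `pass` verdict can still lead to a retry when an opportunity sits `below` intent and budget remains.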
+
+The verdict is the iteration's judgment from the data; the **decision to keep iterating or close** happens by iteration boundary — one iteration writes the report and the take, the next surveys and decides whether to retry or accept. The opportunity assessment — graded against the user's fidelity intent — is the bridge that turns a binary verdict into a picture the next iteration (and REVIEW) can navigate.
+
+## Survey signals (entry into COMPARE)
+
+- All outputs in `lc status --universe baseline` are `ok` ⇒ ready to compare
+- `comparison-report.yaml` exists with current `attempt` ⇒ COMPARE done for this attempt
+- `comparison-report.yaml` verdict is `pass` (or `partial` with un-acted opportunities logged into CLAUDE.md's Open opportunities) ⇒ COMPARE → IMPLEMENT loop terminated; the next cold-survey iteration closes the constitution and REVIEW runs in the user's main session
+
+## Notes
+
+- **One COMPARE per IMPLEMENT.** Each IMPLEMENT retry produces a fresh COMPARE; the report's `attempt` field increments. Do not overwrite prior reports — keep them at `comparison-report-attempt-<N>.yaml` if useful, or commit each between attempts so `git log` carries the history.
+- **The verdict is the iteration's judgment from the data; the keep-iterating decision happens at iteration boundary.** One iteration writes the report and the take on what should happen next; the next iteration surveys, reads the take, and either retries or accepts. The user's voice enters at REVIEW close-out, not mid-loop.
+- **The opportunity assessment stays accessible past close-out.** Un-acted-on opportunities sit in CLAUDE.md's *Open opportunities* list — durable, auto-loaded on any future Claude Code session in this workdir. Tightening any becomes a future IMPLEMENT pass against a clearer target.
diff --git a/claude/lightcone/skills/lc-from-paper/references/implement.md b/claude/lightcone/skills/lc-from-paper/references/implement.md
new file mode 100644
index 00000000..7822e346
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/references/implement.md
@@ -0,0 +1,109 @@
+# IMPLEMENT — write scripts and recipes
+
+Read `astra.yaml` (the filled spec) and `implementation-notes.md` (practical guidance). Write scripts in `scripts/` that produce each output, then add recipes to `astra.yaml` so the asset graph is wired end to end. After the first-pass implementation lands, the next fresh-context iteration reads it critically against paper + code; if it sees issues it fixes them and exits, otherwise it advances to RUN. Same shape ARCHITECT and SPECIFY use.
+
+IMPLEMENT is what a ralph iteration does when the workdir signals "SPECIFY done + `scripts/` absent (first pass) or `comparison-report.yaml` shows `partial`/`fail` (retry pass)." Most implementation is mechanical (translate spec → script). Where parallelization is feasible (multiple independent outputs from different scripts), the iteration fans out to one-level-deep sub-agents per output (inside its own main session) and merges.
+
+## Inputs
+
+- `astra.yaml` — the filled spec (sub-analyses, decisions, prior_insights, findings, narrative — all populated by SPECIFY)
+- `implementation-notes.md` — tricky algorithms, numerical gotchas, data-format quirks
+- `work/reference/index.json` — paper-side structural index (figures, tables, outline, citations); useful when the spec compresses or you need to find where in the paper a behavior is described.
+- `work/reference/code-index.md` (when code present) — code inventory: module map, candidate decisions with file:line, entry-points, data dependencies, gotchas (the canonical map of where each sub-analysis's logic lives in `work/reference/code/`).
+- `work/reference/code/` (if present) — **canonical reference. Read it when implementing each output.** Where paper and code disagree, code wins for numerics, plotting, and method.
+- `constitution.md` — Fidelity intent.
+- `CLAUDE.md` — **Paper-vs-code disagreements** for prior conflicts already logged.
+
+## Outputs
+
+- `scripts/<output>.py` (or `.sh`, or whatever fits) — one script per output (or shared scripts for tightly-coupled outputs)
+- `requirements.txt` — Python dependencies
+- Recipes in `astra.yaml` — each output gets a `recipe:` block with `command:` and `inputs:`
+- `CLAUDE.md` updates — append to **Paper-vs-code disagreements** for any new conflict surfaced during implementation
+
+## Step 1: write recipes + scripts
+
+Read `astra.yaml` and `implementation-notes.md`. For each output, write a script in `scripts/` that produces it, and add a `recipe:` block to the output's entry in `astra.yaml` with `command:` and `inputs:`.
+
+### With a code reference (`work/reference/code/` exists)
+
+**Read the relevant code when implementing each output** — not just to resolve ambiguities but as the canonical source of truth for numerics + method. Write clean scripts following ASTRA conventions (not verbatim copies), but treat the code's behavior as authoritative when it disagrees with the paper. When you encounter a paper-vs-code disagreement that SPECIFY's code pass missed: continue with the code's behavior (per the canonical-resolution default; the iteration runs detached, no interactive ratification), append the disagreement to CLAUDE.md's **Paper-vs-code disagreements** AND `open-questions.md`, and note it in `implementation-notes.md` so REVIEW close-out can ratify or override.
+
+Without this discipline, the implementation drifts to "looks right" rather than "matches" — the failure mode the first-paper test surfaced.
+
+When the reference code is substantial enough that implementation is really a migration of an existing codebase, follow `/lc-from-code`'s migration workflow in **augment existing ASTRA** mode. Use its code scan, minimal parameter-plumbing, dependency/container, and baseline-preservation strategies, but apply them to this reproduction's existing `astra.yaml`. Do not create a second ASTRA project or duplicate the spec; add recipes, code-backed options, implementation notes, and missing structure to the current reproduction artifact.
+
+### Without a code reference (`work/reference/code/` is absent)
+
+When `code-status.yaml` records `found: false` or the cloned repo turned out to be unusable, there is no canonical code substrate to anchor against. **Write the implementation fresh from the spec** — `astra.yaml`'s decisions, findings, and prior_insights are now the only source of method-level truth, and the paper's prose (Grep into `work/reference/source/` or `document.md` for specific facts) is the source of numerics-level truth. Don't pretend a code reference exists; don't try to find a similar paper's code as a stand-in. Implement what the spec describes, read targeted paper sections when the spec compresses something you need clarified, and rely on COMPARE to surface anywhere the implementation has drifted from the paper's claims.
+
+The code-as-canonical rule does not apply here — there is no code to be canonical. The paper is the only anchor. This is the harder path; reproductions on it converge slower and have more open questions for REVIEW close-out. Surface that honestly to the user as you go; don't dress up paper-only implementations as if they had a code anchor.
+
+### Parallelize where feasible
+
+When outputs are produced by independent scripts (no shared expensive computation), the iteration spawns one-level-deep sub-agents per output (inside its own main session). Each sub-agent gets:
+
+- The output's spec entry from `astra.yaml` (including its sub-analysis's `decisions:` / `findings:` for context)
+- The relevant section of `implementation-notes.md`
+- The matching entry in `work/reference/code-index.md`'s natural-decomposition / entry-points block — that's the pointer back to the canonical code location for the sub-analysis the output lives in
+- The relevant code path(s) under `work/reference/code/`
+
+The iteration merges scripts and recipes after the per-output sub-agents finish. Tightly-coupled outputs (e.g. an MCMC producing both a chain and a summary statistic) stay in one sub-agent and one script.
+
+### Rules for the first pass
+
+1. **One script per output** (or a shared script for tightly-coupled outputs).
+2. **Parameterize by decisions.** Each decision is a CLI argument; scripts also receive `--universe <universe_id>`. See lightcone-cli's `CLAUDE.md` for the full convention.
+3. **Add recipes** to each output in `astra.yaml` with `command:` and `inputs:` (dependencies). Recipe inputs use the same `<analysis>.<output>` form the narrative skill's data-flow rules require.
+4. **Create `requirements.txt`** with needed packages. Do not install them — the RUN phase manages environments.
+5. **Do not execute scripts** — the RUN phase handles execution via `lc run`.
+6. **Validate** with `astra validate astra.yaml` after adding recipes.
+
+## Step 2: reviewing prior IMPLEMENT work as part of survey
+
+There is no separate review phase. Every iteration that enters and finds `scripts/` + recipes on disk reads them critically against paper + code before doing anything else. If you see real issues — wrong constant, missing recipe, paper-vs-code drift, synthetic-data shortcut — fix them inline, commit (`implement: fix <what>`), exit. When a fresh-context read finds nothing to fix, the iteration advances to RUN.
+
+The cross-check question on entry: is the implementation consistent with the paper and the code?
+
+### What to look at
+
+1. **Recipe coverage.** Every output in `astra.yaml` has a recipe; every recipe runs a script that exists in `scripts/`.
+2. **Method fidelity.** For each output, the script implements the method described by the relevant sub-analysis's `decisions:` and `findings:` in `astra.yaml`. Where SPECIFY's code pass surfaced a material disagreement, the script follows the code's method (canonical-resolution rule), unless the spec recorded a different override in `decisions:` and `universes/baseline.yaml`.
+3. **Numerical correctness.** Constants, hyperparameters, and threshold values match the paper (or the code, where the canonical-resolution rule applied). Flag mismatches with the script's `path:line`, the paper §/eq, and the relevant `astra.yaml#analyses.<sub-id>.decisions.<key>` entry.
+4. **Data acquisition.** Scripts that fetch data use the real acquisition path from `astra.yaml`'s inputs — no synthetic / mock substitutes.
+5. **Determinism.** Scripts set random seeds where the paper's method is stochastic. Library versions in `requirements.txt` are pinned where reproducibility requires it.
+6. **Recipe wiring.** Recipe `inputs:` references match the data-flow the scripts actually consume; no orphan dependencies, no missing dependencies.
+
+Apply fixes inline as you find them — `scripts/`, `astra.yaml` recipes, `requirements.txt`, `implementation-notes.md`, the disagreements log in CLAUDE.md when a new material conflict surfaces. After any change to `astra.yaml`, run `astra validate astra.yaml`. Commit the diff and exit.
+
+Don't re-read the entire paper; grep into `work/reference/index.json`, `work/reference/code-index.md`, and `work/reference/source/` (or `document.md`) for specific items. Don't declare the implementation done in the same iteration where you landed fixes — the next fresh-context iteration reads it cold; if nothing needs fixing, it advances to RUN, which is the "done" signal.
+
+The post-RUN COMPARE → IMPLEMENT retry loop is separate from this critical-read pattern — that loop handles result-matching after the pipeline executes, not spec/implementation alignment before it.
+
+## Data: REAL DATA ONLY
+
+**NEVER generate synthetic, mock, or fake data.** Every input dataset must be downloaded or queried from its real source (archive URL, database query, API, etc.). The methodology notes and `astra.yaml` inputs describe where each dataset comes from — write scripts that fetch the actual data.
+
+The only exception is if the paper itself uses synthetic / simulated data as its input (e.g., N-body simulations, Monte Carlo samples). In that case, reproduce the paper's data generation procedure exactly as described; that is reproducing the paper's methodology, not replacing real data with fakes.
+
+If a dataset is behind a paywall, requires registration, or is "available upon request," write the download script with a clear error message explaining what the user needs to do manually. **Do NOT substitute synthetic data as a workaround.**
+
+## Retry attempts (post-COMPARE)
+
+If `comparison-report.yaml` exists from a prior COMPARE that returned `partial` or `fail`, a subsequent iteration may take on a **retry attempt**. Read `comparison-report.yaml` to understand what went wrong; focus on the outputs marked as non-matching. The default attempt budget is 5; the iteration's first move is to check whether `attempt` in the report has reached that budget. If it has, accept the partial result, log the failure as an Open opportunity in CLAUDE.md (so REVIEW close-out can decide whether to push further or accept the trajectory), and exit. Subsequent iterations either accept the verdict via a cold close or pivot scope based on REVIEW's input.
+
+A retry attempt restarts the critical-read pattern on the changed scripts before the next iteration advances to RUN.
+
+## Survey signals (entry into IMPLEMENT)
+
+- `astra.yaml` validates and `implementation-notes.md` exists ⇒ ready to implement first pass
+- `scripts/` has one entry per output id; `requirements.txt` exists; recipes appear in `astra.yaml` ⇒ IMPLEMENT's output is on disk; read it critically. Fix anything wrong; otherwise the iteration advances to RUN.
+- `comparison-report.yaml` returns `pass` ⇒ COMPARE → IMPLEMENT loop terminated; the constitution can close after a cold survey, and REVIEW close-out runs in the user's main session
+
+## Notes
+
+- **`lc run` is the canonical execution surface.** Scripts assume they will be invoked via the lightcone-cli runner. Do not hard-code working directories or assume environment activation.
+- **Determinism where possible.** Set random seeds, fix library versions, prefer reproducible installations. The IMPLEMENT goal is not just "produces output once" but "reproducibly produces output across runs."
+- **Tight coupling earns shared scripts.** When two outputs come from the same expensive computation (e.g. an MCMC produces both a parameter chain and a summary statistic), one script with multiple output paths is cleaner than two scripts that each re-do the work.
+- **The iteration that fixed the artifact can't also be the iteration that judges it clean.** That's the fresh-context-no-bias property at iteration boundaries; conflating fix-iteration with done-judgment defeats it.
+- **Commit as you go.** One commit per script + recipe wiring; one commit per fix. The next iteration reads `git log` to track progress.
diff --git a/claude/lightcone/skills/lc-from-paper/references/literature.md b/claude/lightcone/skills/lc-from-paper/references/literature.md
new file mode 100644
index 00000000..5c49f61f
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/references/literature.md
@@ -0,0 +1,199 @@
+# LITERATURE — resolve `prior_insights:` placeholders against the cited papers
+
+After SPECIFY records each citation marker as a `prior_insights:` *placeholder* — a syntactically-complete `Insight` (`id`, `claim`, `created_at`, `evidence: [{id, doi}]`) whose Evidence entry carries the cited paper's DOI but **no `quote:` selector yet** — LITERATURE stands up each cited paper's reading materials, finds the verbatim quote in the cited paper that justifies the placeholder's claim, and writes the resolved `quote: {exact, prefix, suffix}` (+ `location: {page: N}`) onto that Evidence entry. The decision↔insight linkage already lives on the option side (`Option.insights: [<insight_id>, ...]`); LITERATURE doesn't touch it — only the Evidence's `quote:` / `location:`. After LITERATURE, every `prior_insights:` Evidence entry has a verified quote; `astra validate astra.yaml --verify-evidence` returns clean.
+
+The quote-finding direction is: **target paper's claim → quote inside the cited paper**. The target paper says "we follow Smith+20's magnitude cut of i<24"; LITERATURE goes to Smith+20 and finds the verbatim quote there that justifies that statement ("we adopt a magnitude cut of i<24 as our fiducial selection"). The point is to verify that the target paper's claims about its predecessors are real, not paraphrased or misremembered.
+
+LITERATURE runs **after SPECIFY**, not before — relevant `prior_insights:` are defined by the decisions and findings they justify. Fetching cited papers speculatively before SPECIFY would do work for citations that may never end up needed.
+
+LITERATURE is what a ralph iteration does when the workdir signals "SPECIFY done + `prior_insights:` placeholders present whose Evidence entries carry `doi:` but no `quote:` selector yet." Its internal architecture is **two simple stages plus a merge**: mechanical fetch (paper-extraction's deterministic script, batched-parallel via shell, no agent fan-out), then quote-finding (the iteration does it itself for small placeholder counts; spawns a small number of Haiku sub-agents inside its own main session for large counts), followed by a single-writer merge of the resolutions back into `astra.yaml`. The agentic work is the quote-matching; the fetch is plumbing.
+
+## Inputs
+
+- `astra.yaml` — filled by SPECIFY's paper (and code) passes; each sub-analysis has `prior_insights:` entries shaped as syntactically-complete `Insight` blocks (`id`, `claim`, `created_at`, `evidence: [{id, doi}]`) where each Evidence carries a `doi:` but no `quote:` selector. These are the placeholders LITERATURE resolves by writing `quote: {exact, prefix, suffix}` and `location: {page}` onto each Evidence entry. The option↔insight linkage already lives on the option side (`Option.insights`); LITERATURE does not touch it.
+- `work/reference/index.json#citations` — paper-extraction's cite-key → `{locations, citation, doi}` mapping for every entry in the target paper's bibliography. Used as the canonical cite-key → DOI lookup when cross-checking placeholder DOIs and surfacing unresolved-DOI cases.
+- `work/reference/source/` (Path A) or `work/reference/document.md` (Path B) — target paper text. Grep into for context on how the cited paper is invoked, when a placeholder's claim is ambiguous.
+- `constitution.md` — Fidelity intent.
+
+## Outputs
+
+- `astra.yaml` — `prior_insights:` placeholders **resolved**: each placeholder's Evidence entries now carry `quote: {exact, prefix, suffix}` (TextQuoteSelector) plus `location: {page: N}` (FragmentSelector, 1-indexed page) pointing at the cited paper. `astra validate astra.yaml --verify-evidence` returns clean.
+- `work/cited/<doi-slug>/` — one directory per cited paper, holding that paper's substrate from paper-extraction under `work/cited/<doi-slug>/work/reference/` (`paper.pdf`, `source/` or `document.md`, `index.json`, `astra.yaml` stub, figures, tables). Resume-by-existence: re-running LITERATURE skips fetching any DOI whose `work/cited/<doi-slug>/work/reference/index.json` already exists.
+- `work/notes/literature/resolutions.yaml` — consolidated per-placeholder evidence resolutions before merge (when Haiku fan-out is used, each Haiku's output lands in `work/notes/literature/haiku-<N>.yaml` and is merged into this single file). Intermediate; survives for audit.
+
+## How it runs
+
+### Stage 1 — Mechanical fetch (batched, no agent fan-out)
+
+Collect every unresolved `prior_insights:` placeholder — its Evidence carries `doi:` but no `quote:` selector yet. Group those DOIs uniquely; each unique DOI becomes one fetch.
+
+Run paper-extraction's substrate script for each unique DOI **in batches of 5** via shell parallelism. paper-extraction's `extract-paper-substrate.py` is deterministic — no agent involvement needed. Each invocation writes to `work/cited/<doi-slug>/work/reference/`:
+
+```bash
+# Sketch of the batched fetch loop an iteration runs. dois.txt is a hypothetical
+# helper file with one "<doi-slug> <id-or-doi>" pair per line, one per unique DOI.
+n=0
+while read -r slug id; do
+  (mkdir -p "work/cited/$slug" && cd "work/cited/$slug" && python3 /path/to/paper-extraction/scripts/extract-paper-substrate.py --arxiv-id "$id" < /dev/null) &
+  n=$((n + 1)); [ $((n % 5)) -eq 0 ] && wait  # pause after every 5 launches to bound disk + network
+done < dois.txt; wait
+```
+
+Skip paper-extraction's Step 5 (findings): LITERATURE only needs substrate, not the cited paper's claimed findings. Skip the agent's Step 4 (fix structural gaps) too, since cited papers don't need warning-resolution to be quote-grep-able. Cited-paper bibliographies don't need DOI resolution either (we don't care about their citations' DOIs); if paper-extraction supports suppressing that, use it; if not, the cache amortizes the cost across cited papers and the overhead is tolerable.
+
+Wall time: tens of seconds for 20 cited papers; bottlenecked by the slowest single fetch in each batch.
+
+After each fetch lands, **register the PDF with the validator's cache** so `astra validate --verify-evidence` can find it later:
+
+```bash
+astra paper add "<DOI>" --pdf work/cited/<doi-slug>/work/reference/paper.pdf
+```
+
+For arXiv DOIs (`10.48550/arXiv.<id>`) the `--pdf` argument is optional (`astra paper add` can fetch directly), but pointing at the already-fetched PDF avoids a redundant network hit. For journal DOIs that return a 403 on Unpaywall, `--pdf` is required.
+
+Resume: if `work/cited/<doi-slug>/work/reference/index.json` already exists, skip that DOI's fetch. If `astra paper get <DOI>` returns a cached entry, skip the registration too.
+
+### Stage 2 — Quote-finding (the iteration does it, or Haiku fan-out)
+
+Once all substrate is in place, count placeholders:
+
+- **≤10 placeholders:** the iteration does the quote-finding itself. It walks the placeholders one at a time, greps into the relevant cited paper's substrate for terms from the claim, identifies the verbatim quote, and writes `{exact, prefix, suffix, page}` to `work/notes/literature/resolutions.yaml`. Single agent, low context overhead per placeholder (grep + targeted read, not whole-paper-absorption).
+
+- **>10 placeholders:** the iteration partitions placeholders across **a small number of Haiku sub-agents** (rough rule: aim for 5–8 placeholders per Haiku, so 11–15 placeholders → 2 Haikus, 30 placeholders → 4 Haikus). Each Haiku gets its subset of placeholders + the substrate paths for the cited papers those placeholders reference. Haikus are cheap and fast and the work is well-bounded (grep + format YAML), so this is the right model. Each Haiku writes to `work/notes/literature/haiku-<N>.yaml`; the iteration reads them all, merges into `resolutions.yaml`, then writes back to `astra.yaml`.
+
+The exact Haiku threshold and partition size are heuristic — they trade off context-budget per Haiku vs. orchestration overhead. The iteration has discretion; the rule of thumb is "few enough to track easily, each one small enough to finish in a single fast turn."
+
+### Stage 3 — Merge into astra.yaml
+
+The iteration reads `work/notes/literature/resolutions.yaml` and writes the resolutions back into `astra.yaml`:
+
+- For each resolved placeholder, locate `prior_insights[<id>]` in `astra.yaml` (the placeholder already lives in its sub-analysis with `evidence: [{id, doi}]`; the merge augments each Evidence entry with the newly-authored `quote:` + `location:` selectors — `id` and `doi` were already there).
+- For each unresolved placeholder, append a line to `open-questions.md` describing it; the user resolves it at REVIEW close-out by supplying a different citation, weakening the claim, or removing the placeholder entirely.
+- Run `astra validate astra.yaml --verify-evidence` after the merge to catch structural breakage early.
+
+Single writer (the iteration), no merge conflicts even when Haikus produced the inputs in parallel.
+
+## Quote-finding contract (used by both the iteration itself and any Haiku sub-agents the iteration spawns)
+
+The agent doing the quote-finding (the iteration itself, or each Haiku) follows the same contract. The Haiku prompt is just this contract with concrete placeholders + paths spliced in.
+
+```
+You are an ASTRA evidence-resolution agent. Your task is to find the
+verbatim quotes in cited papers that justify a set of prior_insights:
+placeholders authored by SPECIFY.
+
+Inputs:
+  - A list of placeholders. Each carries:
+      id:             the placeholder's unique id within astra.yaml
+      claim:          what the cited paper supports about a decision
+                      in the target paper (target paper's framing)
+      doi:            DOI of the cited paper (lives on the placeholder's
+                      Evidence entry; quote: needs to be filled in)
+      backed_options: a derived list of "<decision_id>.<option_id>" pairs
+                      that reference this placeholder via Option.insights
+                      — surface from astra.yaml when assembling the
+                      placeholder set so the resolver knows which
+                      decision-options this evidence has to support
+  - Substrate path per cited paper at work/cited/<doi-slug>/work/reference/:
+      paper.pdf, source/*.tex (Path A) or document.md (Path B),
+      index.json (structural index for that cited paper).
+  - Target paper at work/reference/source/ or work/reference/document.md
+    (for context on how the cited paper is invoked, if you need it).
+
+For each placeholder:
+
+  1. Grep into the cited paper's substrate for terms from the claim.
+     Path A: grep across work/cited/<doi-slug>/work/reference/source/*.tex.
+     Path B: grep work/cited/<doi-slug>/work/reference/document.md.
+
+  2. Read targeted spans (offset/limit) around the matches. Find a
+     verbatim passage that supports the claim. Focus on:
+       - Empirical comparisons between the approaches the placeholder's
+         backed_options reference.
+       - Performance benchmarks or validation results relevant to the
+         choices.
+       - Recommendations or caveats about specific methods/parameters.
+
+  3. Build a TextQuoteSelector (exact + prefix + suffix) and
+     FragmentSelector (page).
+       - exact: copied VERBATIM from the source. Don't paraphrase or
+         normalize whitespace. Don't quote math-heavy passages (the PDF
+         text extractor collapses them); quote the surrounding English
+         narrative instead.
+       - prefix / suffix: 20–100 chars of REAL surrounding text, NOT
+         editorial parentheticals. The validator concatenates them with
+         the quote and matches against the PDF page at score ≥ 80.
+       - page: page number from the rendered PDF where the quote
+         appears.
+
+  4. If no quote in the cited paper supports the claim, record the
+     placeholder under unresolved: with a brief reason. The citation
+     was loose, or the paper was paraphrased beyond what the source
+     says, or the wrong paper was cited. Don't fabricate evidence.
+
+Output (YAML, written to the path you were assigned):
+
+resolutions:
+  <insight_id>:
+    id: <insight_id>
+    evidence:
+      - id: ev1
+        doi: "<DOI>"
+        quote:
+          type: TextQuoteSelector
+          exact: "<verbatim quote>"
+          prefix: "<~20-100 chars REAL surrounding text BEFORE>"
+          suffix: "<~20-100 chars REAL surrounding text AFTER>"
+        location:
+          type: FragmentSelector
+          page: <int>
+
+unresolved:
+  <insight_id>:
+    reason: "<one-line>"
+
+Rules:
+  - Keys under resolutions: / unresolved: are placeholder ids from
+    astra.yaml; preserve them exactly. Merge uses these as the join key.
+  - One placeholder lands in either resolutions: or unresolved:, never both.
+  - Quotes are EXACT — verbatim, no paraphrasing, no whitespace normalization.
+  - prefix: and suffix: are REQUIRED.
+  - Avoid YAML | block-literal style for these strings; use single-line or > folded style instead.
+  - Do NOT edit astra.yaml. The merge step does that.
+```
+
+When the iteration fans out to Haikus, each Haiku is spawned with `model="haiku"` and gets this contract plus its assigned subset of placeholders and substrate paths.
+
+## Reviewing prior LITERATURE work as part of survey
+
+There is no separate review phase. Every iteration that enters and finds `prior_insights:` placeholders resolved on disk reads them critically — running `astra validate --verify-evidence` for the deterministic check, plus a semantic re-read of each insight. If you see real issues — tangential quote, wrong cited paper, broken `Option.insights` linkage — fix them inline, commit (`literature: fix <what>`), exit. When a fresh-context read finds nothing to fix, the iteration advances to IMPLEMENT.
+
+The cross-check questions on entry:
+
+1. **Evidence integrity.** `astra validate --verify-evidence` handles the deterministic check; do the semantic check yourself.
+2. **Evidence justifies claim.** Does the quote actually support the claim, or is it tangential?
+3. **Claim supports the decision.** Does the placeholder's claim justify the decision option that references it via `Option.insights`?
+4. **Cited paper is the right paper.** Does the target paper actually invoke this DOI for this claim?
+5. **Unresolved entries are honest.** For entries in `open-questions.md` flagged unresolved, does a closer read of the cited paper find supporting evidence the resolver missed?
+
+Apply fixes inline as you find them, directly in `astra.yaml`'s `prior_insights:` entries. When the gap is mechanical rather than semantic, that can include re-running Haiku quote-finding for entries that need a different quote. Commit the diff and exit.
+
+If an entry genuinely has no supporting quote in the cited paper, log it to `open-questions.md` with a "no support found" note and leave the entry as-is for the user to resolve at REVIEW. Don't fabricate evidence.
+
+## Survey signals (entry into LITERATURE)
+
+- `astra.yaml` has `prior_insights:` placeholders — entries with `claim:` plus Evidence carrying `doi:` but no `quote:` selector ⇒ ready to resolve
+- `work/cited/<doi-slug>/work/reference/index.json` exists for each unique cited DOI ⇒ fetches done
+- `work/notes/literature/resolutions.yaml` exists with non-empty resolutions / unresolved sections ⇒ quote-finding done
+- `astra.yaml`'s `prior_insights:` entries each have a resolved `quote:` (+ `location:`) selector on their Evidence ⇒ merge done
+- `astra validate astra.yaml --verify-evidence` returns clean ⇒ structural validation done; read the resolutions critically. Fix anything wrong; otherwise the iteration advances to IMPLEMENT.
+
+## Notes
+
+- **Mechanical fetch is the substrate; quote-finding is the agentic work.** Don't conflate them. paper-extraction's deterministic script handles the fetch — batched-parallel via shell, no agent fan-out. Quote-finding is the semantic match between target-paper-claim and cited-paper-quote; that's the agent's job.
+- **paper-extraction is the canonical fetch mechanism.** Using `astra paper add` would give only the cached PDF; paper-extraction gives substrate (LaTeX source where available, structural index, figures, citations) which is much better material for verbatim quote-finding. The cost is small and parallelizable.
+- **Haiku is the right model for fan-out quote-finding.** Cheap, fast, well-suited to bounded grep-and-format work. Use Sonnet/Opus only when the placeholder count is small enough that the iteration does the quote-finding itself anyway.
+- **Resume is automatic.** If `work/cited/<doi-slug>/work/reference/index.json` exists, skip that DOI's fetch. If `work/notes/literature/resolutions.yaml` has an entry for a placeholder, skip that placeholder's quote-finding.
+- **Unresolved is not failure.** A placeholder that no quote in the cited paper supports is a real signal — the target paper cited loosely or paraphrased beyond what the source actually says. Surface to `open-questions.md`; don't fabricate evidence.
+- **`astra validate --verify-evidence` runs after the merge**, not after each Haiku's per-placeholder output. Haikus write to disjoint files; the deterministic check happens once `astra.yaml` is updated.
+- **Commit per stage.** Fetches commit together once Stage 1 completes (one commit for all cited-paper substrates). Quote-finding commits together once Stage 2 completes (`resolutions.yaml` + Haiku files). The merge into `astra.yaml` is its own commit. Subsequent fix passes commit separately. The next iteration reads `git log` to see progress.
diff --git a/claude/lightcone/skills/lc-from-paper/references/orient.md b/claude/lightcone/skills/lc-from-paper/references/orient.md
new file mode 100644
index 00000000..0c6db49c
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/references/orient.md
@@ -0,0 +1,235 @@
+# ORIENT — Phase 0
+
+The opening pre-loop phase. Runs in the user's main session, before the ralph loop launches. Its job is to figure out what the user wants to reproduce, stand up the reference substrate (paper + code), and write the per-paper `constitution.md` + `CLAUDE.md` the ralph loop's iterations will walk up to.
+
+One phase, executed in stages so each later decision is grounded in what was acquired earlier. The paper is read before the interview questions land (so questions reference actual figures and claims); the code is scanned before the constitution is drafted (so the constitution's Scope and sub-analysis decomposition lean on the actual pipeline). The user reviews the drafts before anything commits.
+
+ORIENT is the only pre-loop bookend. REVIEW is the post-loop one. Everything else lives inside the ralph loop.
+
+---
+
+## What ORIENT produces
+
+Three things in the reproduction workdir, all committed together at the end:
+
+- **`constitution.md`** — drafted from [`../templates/constitution.md`](../templates/constitution.md). YAML frontmatter `status: active`, then Goal, Fidelity intent, Scope (in / out), Quality bar, Evidence (paper DOI, arXiv ID, code repo URL, where the substrate lives), Open dimensions. The ralph loop's driving document; each iteration reads it on entry. The body sharpens slowly; Open dimensions is updated each iteration as decisions worth user ratification surface. Task-bound — archivable once the reproduction closes.
+- **`CLAUDE.md`** — drafted from [`../templates/CLAUDE.md`](../templates/CLAUDE.md). Paper identity at the top (DOI, title, one-line subject), Rules (universal across reproductions; leave the template's defaults), Disagreements log (starts empty; iterations append), Open opportunities (starts empty; iterations append), Pointers (to `constitution.md`, `work/reference/`, etc.). The auto-loading walk-up; every Claude Code session in the workdir picks it up. Durable — stays useful for any follow-on work in this directory once the reproduction lands.
+- **`work/reference/` substrate** — paper substrate from `/paper-extraction` (`paper.pdf`, `source/` or `document.md`, `index.json`, `astra.yaml`, `figures/`, `tables/`, `bibliography-source.{bib,bbl}`) + code substrate from `/lc-from-code` scan-only (`code/`, `code-status.yaml`, `code-index.md`) when there's a reference code repo.
+
+There is no separate "constitution skill" invocation. You are following `/ralph`'s Authoring mode (Study → Draft → Refine → Launch), which is where the constitution authoring discipline and reference materials live. Apply that discipline as you draft; the deliverable is these two markdown files (plus the substrate produced by the inline skill invocations).
+
+After the user approves both drafts, save them, `git init` the workdir if it isn't one already, commit `constitution.md` + `CLAUDE.md` + the full `work/reference/` substrate as the first commit, then launch the ralph loop (per SKILL.md's *Launching the loop* section).
+
+---
+
+## The stages
+
+### Stage 1 — Ask for the paper
+
+Ask the user for the paper identifier in **prose** — not `AskUserQuestion`. The answer is inherently free-form (an arXiv ID, a DOI, or a path to a PDF on disk), and a multiple-choice modal is the wrong shape for it.
+
+Wording is up to you, but cover the three forms cleanly. Something like:
+
+> *"What paper would you like to reproduce? An arXiv ID, a DOI, or a path to a PDF on disk all work — arXiv ID gives the cleanest acquisition because the LaTeX source comes through."*
+
+If the user supplied the identifier on the `/lc-from-paper` invocation, skip the ask. **No `AskUserQuestion` runs before paper-extraction has landed** — anything beyond the identifier is either inferable from the paper or belongs in a later stage where you can ground the question.
+
+### Stage 2 — Run `/paper-extraction` inline; read the substrate
+
+With the paper identifier in hand, invoke the paper-extraction skill directly:
+
+```
+/paper-extraction <doi-or-arxiv-id-or-pdf-path>
+```
+
+This produces the paper substrate under `work/reference/`. When it returns, the substrate is on disk. **Read it before continuing to Stage 3** so the next questions are grounded:
+
+- **`work/reference/index.json`** — title, abstract, figure/table inventory with captions, section outline, citations with resolved DOIs. The structural surface.
+- **The abstract and the conclusions section of the paper** — give you the claimed headline results, with actual numbers.
+- **The "Data availability" / "Code availability" sections of the paper** — usually the canonical place for repo URLs and dataset locations. If neither section exists, grep across `work/reference/source/*.tex` (Path A) or `work/reference/document.md` (Path B) for `github.com`, `gitlab`, `zenodo`, `softwarex`, `\url{}` patterns.
+- **The acknowledgements section** — sometimes carries software repos, dataset attributions, cluster acknowledgements that hint at the execution environment.
+
+You do *not* need to read the paper end-to-end. The goal is to ground Stage 3's questions — abstract for claims, conclusions for what the paper says it found, data/code availability for substrate hints. Iterations will read the rest as they need it.
+
+If `/paper-extraction` fails or returns partial substrate (network issue, ambiguous arXiv ID, etc.), surface the failure to the user before continuing.
+
+### Stage 3 — Interview the user, grounded in the paper
+
+Now `AskUserQuestion` is the right tool — each remaining question is a constrained choice with structured options, and the user has paper context loaded from your summary or from the substrate they can browse. Ask in whatever order reads naturally; batching related questions in a single `AskUserQuestion` call (up to 4) is fine.
+
+#### Scope
+
+Present the paper's actual primary outputs as a menu:
+
+> *"The paper claims [N] figures + [M] tables + [headline numerical results]. What's in scope for this reproduction?"*
+>
+> - Full — every primary result the paper reports
+> - Targeted — specific figures / tables / numbers (you'll list which)
+> - Use the paper's natural primary-result set (default)
+
+When the user picks "targeted," follow up with the list of the paper's figures/tables (from `index.json`) so they can pick the subset directly rather than recalling from memory.
+
+If the paper has sub-analyses with genuinely independent stages (e.g. reconstruction → clustering → BAO fit), ask about decomposition; if the paper is monolithic, one analysis suffices.
+
+These answers go into `constitution.md`'s **Scope** section (in / out) and inform ARCHITECT's structural decomposition.
+
+#### Fidelity intent
+
+A reproduction can land anywhere from a quick "does this even run" sanity check to a full match across every primary and secondary target. The user owns where they want this one to land — but where it *can* land in this stretch depends on the compute, tokens, time, and attention available. The honest meta-conversation is the point: what does the user want out of this first stretch, given what's spendable on it?
+
+Don't ask the abstract question "what would you like to get out of this"; it is too literal and lands as a wish list. Pivot on what's actually being weighed. With the paper's actual headline numbers in hand from the abstract/conclusions, name them in the prompt so the answer can lock onto something concrete:
+
+> *"The paper's headline is `S_8 = 0.795 ± 0.014`. What's the right shape for this stretch — a quick check that the analysis is tractable, getting that one number right within stated uncertainty, or a full match across every primary target? How much compute and wall-clock do you have to spend on it?"*
+
+Offer the prose options as `AskUserQuestion` options the user can pick from or replace via "Other":
+
+- *"Just checking the analysis is tractable — quick sanity that some headline number comes out close. An afternoon."*
+- *"The headline matches within stated uncertainty; secondary results can stay rough. Overnight."*
+- *"One specific figure / result fully matches; rest stay rough — a day or two."* — follow up: which one?
+- *"Every primary and secondary target lining up within stated tolerance; every paper-vs-code conflict adjudicated. No hard deadline."*
+
+Record the answer verbatim or in close paraphrase under **Fidelity intent** in `constitution.md`'s Goal section. Time/compute bounds are part of the intent — the user's spendable budget shapes what "good enough" can mean for this stretch. Each iteration reads the intent when sizing its next move; COMPARE grades opportunities against it.
+
+If the user genuinely doesn't know yet, write that — *"Not sure yet; let's get something running and revisit"* is itself useful intent, and they can sharpen it at any future REVIEW.
+
+#### Code repository
+
+Use what `/paper-extraction` surfaced. If there's a single candidate URL from the data/code availability or acknowledgements section, lead with that confirmation:
+
+> *"The paper's Data availability section points at `https://github.com/...`. Should we clone that as the reference code? Or is there a different/private repo?"*
+
+If paper-extraction found nothing, ask plainly:
+
+> *"I didn't find a code repo URL in the paper. Is there a private / unpublished repo we should clone? Or proceed paper-only?"*
+
+When the user provides a URL, capture it. When the paper has no code repo and the user doesn't supply one, note *"no public code; paper prose is the only methodological anchor"*; there is then no code substrate to acquire, so Stage 4 reduces to writing `code-status.yaml` with `found: false` and Stage 5 is skipped. When the code is available, every iteration that touches a sub-analysis reads from `work/reference/code/` and treats code as canonical for numerics + method; this is recorded in `CLAUDE.md`'s Rules.
+
+#### Paper-specific conventions or warnings
+
+By now you have read enough of the paper to *propose* one-line conventions / warnings rather than asking the user to volunteer them cold. Surface candidates from your post-extraction read:
+
+> *"From the paper I noticed: (a) Paper II of a 5-paper series; siblings in prep with no DOI. (b) Uses non-standard convention for X. (c) Four-way catalog comparison drives every figure. Want any of those as iteration-level pointers in `CLAUDE.md`?"*
+
+Let the user toggle the ones to keep, edit them, add more, or skip cleanly if none apply. The selected items land in `CLAUDE.md`'s **Pointers** section as one-line notes — context every iteration sees on entry.
+
+#### Prior familiarity
+
+A single question:
+
+> *"How familiar are you with this paper?"*
+>
+> - Haven't read it / barely skimmed
+> - Skimmed it / general sense of the claims
+> - Read carefully / know the methodology
+> - Author / worked closely with the authors
+
+This affects how confidently iterations should defer to the user when adjudicating paper-vs-code disagreements, and how heavy the first-iteration review should be.
+
+#### External context
+
+The real probe is: *"is there context outside the paper substrate + codebase that should inform the spec?"* — co-author feedback, sibling-paper drafts (common for papers in a series), internal blinding documentation, decision-history docs, referee responses, a relevant talk or slide deck. The artifact form varies; what matters is whether such context exists and whether you should point ARCHITECT at it.
+
+Ask in those terms:
+
+> *"Beyond the paper and any code repo, is there context an iteration should know about — co-author / referee feedback, internal notes, a sibling paper still in prep, decisions documented elsewhere? If yes, point at the path(s). Otherwise the paper substrate + code are the source of truth."*
+
+Capture paths into `CLAUDE.md`'s **Pointers** section. Don't proactively read them in ORIENT — that's ARCHITECT's job when it scopes the sub-analyses.
+
+### Stage 4 — Clone the code (if any) and run `/lc-from-code` scan-only
+
+When Stage 3's code-repo answer was "no public code," skip the clone and the scan; only the `code-status.yaml` write (with `found: false`) applies. Otherwise:
+
+1. **Clone the repo:**
+   ```bash
+   git clone --depth 1 <url> work/reference/code
+   ```
+   For multi-project monorepos where the user pointed at specific subpaths (e.g. GitHub `tree/<branch>/<path>` URLs), clone the whole repo on the named branch — don't sparse-checkout — and capture the primary subpaths in `code-status.yaml` so `/lc-from-code` knows where to focus.
+
+2. **Write `work/reference/code-status.yaml`:**
+   ```yaml
+   found: true        # or false
+   url: "https://..."  # null if not found
+   branch: "main"     # or whichever branch was cloned; null if not found
+   cloned: true       # false if found but clone failed
+   primary_subpaths:  # optional; for multi-project monorepos
+     - "notebooks/..."
+   notes: "..."
+   ```
+
+3. **Invoke `/lc-from-code` in scan-only mode:**
+   ```
+   /lc-from-code scan-only against work/reference/code/. From inside /lc-from-paper's ORIENT phase. Produce work/reference/code-index.md only — do not touch the project-root astra.yaml, do not parameterize any code, do not run anything, do not modify the cloned repo. Primary subpaths (per code-status.yaml): <list>.
+   ```
+
+   The scan-only branch of `/lc-from-code` does the inventory pass and writes to `work/reference/code-index.md`. Its prompt-context surface carries the "stop at scan" contract.
+
+When no public code repo exists, write `code-status.yaml` with `found: false` and skip `/lc-from-code` entirely. The code-as-canonical rule self-disables in that case.
+
+### Stage 5 — Follow-up questions if the code surfaced anything new
+
+If the code-index reveals something the user should weigh in on — an unexpected dependency, a clear pipeline boundary that suggests a sub-analysis decomposition different from the paper's, an unusual container requirement, an explicit data-availability gate not visible in the paper — ask before drafting the constitution.
+
+Usually this is light or skipped entirely. The code-index is the iterations' surface, not the user's; most of what it reveals doesn't need user adjudication at ORIENT. But when something genuinely affects scope or constitution shape, surface it now rather than waiting for an iteration to file an open question.
+
+### Stage 6 — Draft `constitution.md` + `CLAUDE.md`
+
+Open both templates side-by-side:
+
+- [`../templates/constitution.md`](../templates/constitution.md) — fill in the header, Goal (with fidelity intent), Scope (in / out), Quality bar, Evidence (paper DOI, arXiv ID, code repo URL — these are the user-supplied identifiers; the substrate-path bullets in the template stay as boilerplate, naming where each substrate lives on disk), Open dimensions. Leave the YAML frontmatter `status: active` intact. Both paper and code substrate are on disk by now — the constitution can lean on the actual pipeline decomposition, named figures/tables, and concrete file paths.
+- [`../templates/CLAUDE.md`](../templates/CLAUDE.md) — fill in the header (paper title + arXiv ID + DOI + one-line subject), any paper-specific Pointers from Stage 3. Leave Rules in the template state (universal across reproductions). Leave the Disagreements log and Open opportunities sections empty — iterations populate them.
+
+### Stage 7 — User review, refine, commit, launch
+
+**Halt here for explicit user approval.** This is the user's only review point before the autonomous loop takes over; treat it as the final author-mode editorial pass. Do not commit or launch the ralph loop until the user explicitly confirms — silence is not approval.
+
+1. **Show the drafts.** Point the user at `constitution.md` and `CLAUDE.md` (file paths plus a brief inline summary of what each carries: Goal / Fidelity intent / Scope / Quality bar / Evidence for the constitution; paper header + Pointers for `CLAUDE.md`). The user reads the actual files; don't paste the full bodies inline.
+
+2. **Surface any open questions you have at this gate.** If a paper detail is ambiguous, a scope choice didn't fully resolve in Stages 3–5, a sub-analysis decomposition is uncertain, or a fidelity intent is implicit but not pinned — ask now, in this same exchange, *before* the loop launches. Each ralph iteration runs cold from `constitution.md` + `CLAUDE.md`; an open question held back here is much harder to raise later.
+
+3. **Gate on `AskUserQuestion`.** Offer options like "Looks good — commit and launch", "I want to edit first" (point them at the file paths), "I have feedback" (collect, refine, re-show, gate again). The launch decision waits on this answer.
+
+4. **When the user approves:**
+   - `git init` the workdir if it isn't one already (per SKILL.md's *Setup: git-tracked workdir* discipline).
+   - Commit `constitution.md` + `CLAUDE.md` + the full `work/reference/` substrate (paper + code, when code present) as the first commit. A single commit captures the full ORIENT deliverable.
+   - The `work/reference/code/` clone itself can be `.gitignore`d for large monorepos; `code-index.md` is what downstream iterations actually consult. The clone is reproducible from `code-status.yaml`'s URL.
+   - Launch the ralph loop per SKILL.md's *Launching the loop* section.
+
+Tell the user the tmux session name and the attach command, and that you'll be ready for REVIEW close-out when the loop terminates.
+
+---
+
+## Discipline
+
+- **No `AskUserQuestion` before paper-extraction has run.** Stage 1 collects the identifier in prose; everything else waits until Stage 3, after the paper is on disk and you can ground the questions in actual content.
+- **The paper-identifier question is prose.** It's the one question that doesn't fit `AskUserQuestion`'s multiple-choice shape; the free-form answer (arXiv ID / DOI / PDF path) belongs in a prose ask.
+- **Three to six `AskUserQuestion` rounds total across Stages 3 + 5** — scope, fidelity, code repo, conventions, familiarity, external context, plus any Stage 5 follow-ups. Some can batch into a single multi-question call when they're independent.
+- **One commit at the end, with everything.** `constitution.md` + `CLAUDE.md` + paper substrate + code substrate are committed together. No intermediate commits for "paper-extraction landed but the user hasn't approved yet" or "code cloned but constitution not drafted yet."
+- **Defaults are the path.** When the user says "you choose," take the defaults — full reproduction, the paper's natural sub-analysis structure if any. The defaults reflect what the architecture has learned about which seams matter.
+- **One paper at a time.** A single `constitution.md` + `CLAUDE.md` pair covers one paper. If the user wants two, run ORIENT twice — two reproduction directories, two pairs.
+- **No code repo is still a valid ORIENT outcome.** When `code-status.yaml` records `found: false`, iterations operate in paper-only mode — methodology lives in the paper's prose; no code-as-canonical adjudication is needed. CLAUDE.md's code-as-canonical Rule self-disables.
+
+---
+
+## When ORIENT gets stuck
+
+Most failure modes resolve into "the user has not yet decided what 'reproduce' means for them." If the conversation is circling, ask one of these directly:
+
+- *"If we ran this and it produced figure 3 plus the headline number in Table 2, would you be done?"* — pins targeted vs full.
+- *"Is there a specific decision in the paper you want to vary, or are we trying to match the paper exactly?"* — pins whether universes need to span alternatives.
+- *"What's the moment you'd call this useful — any number coming out, a specific figure matching in shape, the headline matching within stated uncertainty, or every target lining up?"* — pins fidelity intent.
+- *"Are you trying to verify the paper, build on it, or critique it?"* — shifts where the fidelity bar naturally sits.
+- *"Is there anything weird about this paper you want every iteration to know up front?"* — pins paper-specific conventions.
+
+When these answer cleanly, both files draft themselves.
+
+---
+
+## Survey signals (entry into ORIENT)
+
+If the user is walking into a workdir mid-flow, check what's already on disk before re-running stages:
+
+- `constitution.md` + `CLAUDE.md` at workdir root, committed → ORIENT already produced its files. If the loop didn't launch (or has exited), skip ahead to launching.
+- `work/reference/{paper.pdf, source/ or document.md, index.json, astra.yaml}` present → paper substrate from Stage 2 exists. `/paper-extraction` is idempotent — re-invoke if anything looks partial; it skips done work.
+- `work/reference/code/` **and** `code-index.md` present, **or** `code-status.yaml` records `found: false` → code substrate from Stage 4 exists.
+
+When all three are committed, ORIENT is done. Otherwise, identify the earliest missing piece and resume from there.
diff --git a/claude/lightcone/skills/lc-from-paper/references/review.md b/claude/lightcone/skills/lc-from-paper/references/review.md
new file mode 100644
index 00000000..5a080bf7
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/references/review.md
@@ -0,0 +1,108 @@
+# REVIEW — close-out in the user's main session
+
+The reproduction has converged: the constitution's `status:` is `closed` (after COMPARE returned `pass`, or `partial` with the un-acted opportunities logged, and the next cold-survey iteration found nothing left to do). The ralph loop's tmux session has exited. REVIEW is the second of two interactive bookends, the first being ORIENT. It runs back in the user's main session (not as an iteration) because both `/figure-comparison` and `/check-sentence-by-sentence` use `AskUserQuestion`, which isn't available inside detached ralph iterations.
+
+Its job is to render the validation surfaces, walk the user through the accumulated open questions, land the resolutions, and draft the final report — in one interactive arc. The Open opportunities list in CLAUDE.md already carries un-acted-on opportunities from the latest COMPARE (those iterations logged them directly); REVIEW just reads them.
+
+The phase name **REVIEW** is free for this close-out because the old pre-implement REVIEW phase has been folded into ARCHITECT, SPECIFY, LITERATURE, and IMPLEMENT as their per-iteration self-review passes. This close-out is what the previous shape called SUMMARIZE_RUN.
+
+## Inputs
+
+- `astra.yaml` — final spec (validates with `--verify-evidence` once LITERATURE has resolved every `prior_insights:` placeholder's Evidence `quote:` selector)
+- `comparison-report.yaml`, `comparison-report.md` — final verdict + opportunity assessment
+- `targets/targets.md` — what was being matched against; reference figures / tables in `targets/`
+- `results/<universe>/<output_id>/` — reproduced figures / tables / metrics
+- `open-questions.md` at the workdir root — running report from the iteration-phases (paper-vs-code conflicts, ambiguities, anything iterations flagged for user resolution)
+- `work/reference/index.json` and `work/reference/code-index.md` — for context
+- `work/reference/source/` (Path A) or `work/reference/document.md` (Path B), and `work/reference/code/`: directly available for follow-up questions the user asks during REVIEW that the report and CLAUDE.md don't answer ("remind me what the paper says about X", "did the original code do Y"). Grep into them for specifics; read targeted spans by offset/limit.
+- `CLAUDE.md` at the workdir root — paper identity, Rules, Paper-vs-code disagreements, Open opportunities (the durable surface, accumulated across iterations)
+- `constitution.md` at the workdir root — Goal, Fidelity intent, Scope, Quality bar, Evidence, Open dimensions (the driving document the loop has been working against)
+
+## Outputs
+
+- `.lightcone/comparison.html` — `/figure-comparison`'s portable side-by-side report (paper artifacts vs reproduced)
+- (Optional) `.lightcone/check-sentence-by-sentence.md` — `/check-sentence-by-sentence`'s claim audit (file:line or NOT FOUND per sentence)
+- `open-questions.md` — same file, but with `## Resolutions` section appended capturing what the user said for each entry
+- Edits to `astra.yaml` / `implementation-notes.md` / `universes/baseline.yaml` if any open-question resolution warrants a spec change
+- `REPRODUCTION-SUMMARY.md` — final report; concise (~1–2 pages); the canonical record of what the reproduction landed on
+- CLAUDE.md updates — **Paper-vs-code disagreements** entries reconciled with their resolutions (Open opportunities already there from COMPARE iterations)
+- A commit closing out the reproduction
+
+## Step 1: render the validation surfaces
+
+### `/figure-comparison` (mandatory)
+
+Invoke the `/figure-comparison` skill from the user's main session. It builds a portable HTML side-by-side comparing paper artifacts (from `targets/`) to reproduced artifacts (from `results/<universe>/`). The skill uses `AskUserQuestion` for any inputs it can't infer from the workdir; that works because REVIEW runs back in the user's main session — the prompts land here, not in a detached iteration.
+
+Output lands at `.lightcone/comparison.html`. Show the user the path and offer to open it (`open` on macOS, `xdg-open` on Linux, or just print the path so they click in their terminal).
+
+**Do not spawn `/figure-comparison` under the `Task` tool or inside a ralph iteration.** It has `AskUserQuestion` in its `allowed-tools`; sub-agents and detached iterations have no user-reach, so the prompt fires into nothing.
+
+### `/check-sentence-by-sentence` (opt-in)
+
+Ask the user via `AskUserQuestion` whether they want the claim audit. It's optional because for many reproductions the figure-comparison already settles "did it match?"; the sentence-by-sentence audit earns its keep when the paper makes many specific quantitative claims and the user wants each one anchored to a code location.
+
+If yes, invoke `/check-sentence-by-sentence`. Same discipline as `/figure-comparison` — it can prompt the user; do not spawn under `Task` or inside a ralph iteration.
+
+Output lands at `.lightcone/check-sentence-by-sentence.md` (or wherever the skill writes it). Show the user the path.
+
+## Step 2: walk `open-questions.md` with the user
+
+Read `open-questions.md` at the workdir root. For each unresolved entry, surface it via `AskUserQuestion` with:
+
+- **The question** (verbatim from the file)
+- **Origin** — which phase flagged it
+- **The default the phase applied** (if any — e.g. "code as canonical")
+- **Three options**: ratify the default, override (user spells out their choice), or defer (leave as a known limitation in the final report)
+
+Append a `## Resolutions` section to `open-questions.md` capturing what the user said for each entry. This makes the resolution durable — re-runs and future sessions see it. Cross-reference with CLAUDE.md's **Paper-vs-code disagreements** section: every entry there should now have its resolution recorded, either inline (if the user picked the canonical default) or in `open-questions.md`.
+
+If a resolution warrants a spec change (the user picks an override), edit `astra.yaml` / `implementation-notes.md` / `universes/baseline.yaml` accordingly and re-run `astra validate astra.yaml`. If the change would invalidate the comparison report (e.g. flips the canonical method for a primary output), surface that to the user — in most cases the reproduction is "done" and the override is a known limitation, but the user may choose to re-open the loop for another IMPLEMENT pass.
+
+## Step 3: write `REPRODUCTION-SUMMARY.md`
+
+A single markdown file at the project root, ~1–2 pages. The canonical record of what this reproduction landed on. Sections:
+
+1. **What was reproduced** — the paper, the scope, the targets.
+2. **Verdict** — pass / partial. If partial, what failed and why we accepted it.
+3. **Material decisions** — the paper-vs-code conflicts SPECIFY's code pass (and any IMPLEMENT pass) surfaced, what the user chose (in prose ratification or by canonical-resolution default), and why.
+4. **Outputs** — pointers to the figures / tables / metrics produced. One bullet per primary target with the path to the reproduced result and a one-line match note from the comparison report.
+5. **Open opportunities**: pull from CLAUDE.md's *Open opportunities* list (already carries un-acted-on opportunities from the latest COMPARE), plus anything fresh in `comparison-report.yaml`'s `opportunities:` block not yet reflected there. One bullet each with the leverage assessment. This is what a future session (or the user revisiting later) would tighten next.
+6. **What was learned** — anything the reproduction surfaced that wasn't visible from the paper alone (a parameter the code uses but the paper doesn't mention, a data cut stricter than stated, etc.). The reproduction's value to the broader literature.
+7. **Resolved open questions** — pull from `open-questions.md`'s `## Resolutions` section. One bullet per question + its resolution.
+8. **Re-running** — one paragraph: how to re-run from this workdir (`lc run --universe baseline`, the relevant `astra.yaml`, where CLAUDE.md lives so future Claude Code sessions auto-load it on walk-up).
+
+Brief, not exhaustive. The depth lives in `astra.yaml` and the workdir's notes; the summary is the door into them.
+
+## Step 4: reconcile the Open opportunities list
+
+COMPARE iterations have been logging un-acted-on opportunities into CLAUDE.md's *Open opportunities* list as they run, so the list is already populated. REVIEW's job here is reconciliation: cross-check that every opportunity in `comparison-report.yaml`'s `opportunities:` block that the user did NOT act on is present in CLAUDE.md's list, and remove any that the user acted on at REVIEW (e.g. authorized one more IMPLEMENT round to close).
+
+## Step 5: commit
+
+Stage `REPRODUCTION-SUMMARY.md`, `open-questions.md` (with resolutions), the updated CLAUDE.md, the final `astra.yaml`, the comparison artifacts, and any housekeeping changes. Commit with a message that names the verdict and the close-out:
+
+```
+review: <paper-short-name> verdict <verdict>, summary at REPRODUCTION-SUMMARY.md
+```
+
+This commit is the durable mark that the reproduction has reached close-out. Future walk-ups read CLAUDE.md and `git log` to know where the reproduction stands; the close-out commit + REPRODUCTION-SUMMARY.md together stand in for the old constitution `outcome:` field.
+
+## Survey signals (entry into REVIEW)
+
+- `comparison-report.yaml` verdict is `pass` (or `partial` with un-acted opportunities logged) ⇒ ready to close out
+- `.lightcone/comparison.html` exists ⇒ `/figure-comparison` rendered
+- `open-questions.md` has a `## Resolutions` section covering every entry ⇒ open-questions walkthrough done
+- `REPRODUCTION-SUMMARY.md` exists ⇒ final report written
+- CLAUDE.md's *Open opportunities* list reflects the un-acted-on opportunities from the latest COMPARE ⇒ reconciliation done
+- A `review:` commit lands ⇒ REVIEW done; reproduction complete
+
+## Notes
+
+- **This phase runs in the user's main session.** Do not invoke it from inside a ralph iteration. The whole point of REVIEW is that the user is reachable: the interactive steps use `AskUserQuestion` (directly, or via the sibling skills they invoke), and iterations are detached.
+- **`/figure-comparison` and `/check-sentence-by-sentence` use `AskUserQuestion`.** That's why REVIEW runs in the user's main session and they live here, not in any iteration. Invoking either inside an iteration fires prompts into nothing.
+- **The user owns the verdict-acceptance decision.** REVIEW's purpose is to let the user see what the loop's iterations did and decide whether they accept it. The skill renders surfaces and asks; it does not unilaterally close.
+- **Don't confuse with the per-phase reviews inside the loop.** ARCHITECT, SPECIFY, LITERATURE, and IMPLEMENT each have their own fresh-context review discipline that happens at the iteration boundary. Those are unrelated to this close-out: same word, different jobs. The phase boundary keeps them unambiguous. Per-phase reviews live inside their host phase's reference; this one is the post-loop close-out in the user's main session.
+- **Open-question resolutions are durable.** Append to `open-questions.md`'s `## Resolutions` section so the next re-run / future session sees what was decided. Do not delete the original questions.
+- **Keep the report short.** Long reports get skimmed; short reports get read. Two pages is generous.
+- **Do not invent further work.** If the user has accepted the verdict and the opportunities are propagated, the reproduction is done. The next session, the user, or a future revisit can decide whether tightening any open opportunity still serves them.
diff --git a/claude/lightcone/skills/lc-from-paper/references/run.md b/claude/lightcone/skills/lc-from-paper/references/run.md
new file mode 100644
index 00000000..bbc1bba0
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/references/run.md
@@ -0,0 +1,57 @@
+# RUN — execute the recipes
+
+Materialize every output in `astra.yaml` for the requested universe. RUN is mostly mechanical — `lc run --universe <id>` does the heavy lifting. The phase exists as a discrete step so failures get diagnosed and re-run before COMPARE.
+
+RUN is what a ralph iteration does when the workdir signals "recipes present in `astra.yaml` + `scripts/` committed + `results/<universe>/<output>/` absent for any output." The iteration runs the recipes, diagnoses failures, attempts targeted fixes, and exits. Universe defaults to `baseline`.
+
+## Inputs
+
+- `astra.yaml` with recipes (from IMPLEMENT)
+- `universes/<universe_id>.yaml` — defaults to `baseline`
+
+## Outputs
+
+- `results/<universe_id>/<output_id>/` for every output declared in `astra.yaml`
+
+## Task
+
+Execute all recipes:
+
+```bash
+lc run --universe baseline
+```
+
+(Universe defaults to `baseline`; iterations override if the constitution scopes a different universe.)
+
+Check status:
+
+```bash
+lc status --universe baseline
+```
+
+Status states are `ok` (materialized), `pending` (has recipe, not run), `no_recipe` (declared, no recipe — bug). Every output declared in `astra.yaml` must reach `ok`.
+
+If outputs fail:
+
+1. **Read the script's error.** `results/<universe>/<output>/.log` (or wherever the runner emits stderr) usually has the message.
+2. **Diagnose.** Common failures: missing data dependency (a referenced URL changed; the data archive moved), missing Python package (`requirements.txt` was incomplete), spec / script mismatch (the recipe's `inputs:` does not match what the script reads).
+3. **Fix.** Edit the script or `requirements.txt` or the spec, whichever applies.
+4. **Re-run.** `lc run --universe baseline` resumes from where things failed; it does not re-execute already-materialized outputs.
+5. **Repeat** until all outputs are `ok`.
+
+## Rules
+
+- **Always use `lc run`** — do not run scripts directly. The runner manages dependencies, environments, and artifact paths; bypassing it produces inconsistent results.
+- **Re-runs are idempotent.** `lc run` skips outputs that are already materialized. To force re-execution, check `lc run --help` for the relevant flag.
+- **Failures stay failures until fixed.** Do not "move on" past a failed output by editing it out of `astra.yaml`. Either fix the script, ask the user in prose if reachable, or log the failure to `open-questions.md` and stop.
+
+## Survey signals (entry into RUN)
+
+- `astra.yaml` has recipes and validates ⇒ ready to run
+- `lc status --universe baseline` returns all `ok` ⇒ RUN done; the next iteration surveys and advances to COMPARE
+
+## Notes
+
+- The runner backend (Docker / local / SLURM) comes from the project's target configuration — `~/.lightcone/config.yaml` and `.lightcone/lightcone.yaml`. RUN does not need to choose; the runner picks based on config.
+- For long-running computations, the script's stdout / stderr stream into the result directory's log file. The iteration should use the Monitor tool on the log file to stream events (each stdout line surfaces as a notification), not poll `lc status` repeatedly. For one-shot waits, Bash with `run_in_background` notifies on completion.
+- **Commit the materialized results' state when RUN settles.** The actual `results/` artifacts are gitignored heavy data, but the run-level outcome (which outputs reached `ok`, any failures logged) is worth a commit so the next iteration can read `git log` to know RUN landed.
diff --git a/claude/lightcone/skills/lc-from-paper/references/specify.md b/claude/lightcone/skills/lc-from-paper/references/specify.md
new file mode 100644
index 00000000..eba368dc
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/references/specify.md
@@ -0,0 +1,198 @@
+# SPECIFY — fill the stub `astra.yaml`, two passes per sub-analysis
+
+Read the stub `astra.yaml` from ARCHITECT and fill in `decisions:`, `prior_insights:`, `findings:` per sub-analysis, weaving the existing narrative with `astra-anchor:` references as entries land. SPECIFY is the **first material-disagreement seam** — paper-vs-code conflicts surface here, and they're often the highest-value moments for the user to weigh in on at REVIEW.
+
+SPECIFY is what a ralph iteration does when the workdir signals "stub `astra.yaml` present + sub-analyses' `decisions:` / `prior_insights:` / `findings:` blocks still empty." Iterations run detached in tmux; the user isn't reachable interactively, so the canonical-resolution default (code wins where paper and code disagree on a material choice) applies and disagreements are logged to CLAUDE.md's **Paper-vs-code disagreements** section plus `open-questions.md` for REVIEW close-out.
+
+The structure runs **two passes per sub-analysis** (paper, then code, when code exists), then iteration-boundary review. The two passes are the cross-check: the paper pass authors what the paper says; the code pass surfaces where the code says something different; the difference is gold (it's where the reproduction has to make a decision).
+
+Per-sub-analysis work is parallelizable when sub-analyses are independent. Each sub-analysis's two passes (paper, then code) run sequentially within that sub-analysis; across sub-analyses the iteration can fan out parallel work as one-level-deep sub-agents from inside its main session. When SPECIFY needs paper- or code-side context, Grep into `work/reference/source/` / `document.md` for paper text or read targeted modules under `work/reference/code/`; the structural index at `work/reference/index.json` and the code inventory at `work/reference/code-index.md` give you the orientation to know where to look. Don't try to absorb the paper or code whole.
+
+## Inputs
+
+- `astra.yaml` — the stub from ARCHITECT (sub-analyses, inputs, outputs, narrative; empty `decisions:` / `prior_insights:` / `findings:` blocks)
+- `constitution.md` — Goal (scope), Fidelity intent, Quality bar
+- `CLAUDE.md` — Rules; **Paper-vs-code disagreements** for prior-iteration entries
+- `work/reference/index.json` — paper-extraction's structural index: figures, tables, section outline, citations. The `citations:` block maps each cited paper's BibTeX key (Path A) or synthetic `<lastname>_<year>` key (Path B) to `{locations, citation, doi}`. SPECIFY uses this to write each `prior_insights:` placeholder's `doi:` so LITERATURE knows which paper to fetch.
+- `work/reference/code-index.md` (when code present) — code inventory: module map, candidate decisions with file:line, entry-points, data dependencies, gotchas.
+- `work/reference/source/` (Path A) or `work/reference/document.md` (Path B): paper text. Grep into it for specific facts; read targeted spans by offset/limit when you need more context. Don't re-read it whole.
+- `work/reference/figures/`, `work/reference/tables/`, `work/reference/metadata.json` — extracted artifacts (Path B only)
+- `work/reference/code/` (if present) — original code, canonical reference for numerics + method. Read the modules that `code-index.md` points at for the sub-analysis you're filling.
+- `work/notes/notes.md` — user-supplied context (read by every iteration if present)
+
+## Outputs
+
+- `astra.yaml` — **filled form**: each sub-analysis's `decisions:` populated with decision-level `rationale:` prose plus options (the paper's choice is identified by `default:`); `findings:` populated as full `Insight` blocks with paper-anchored `evidence:` (the target paper's DOI + `quote: {exact, prefix, suffix}` + `location: {page: N}`); `prior_insights:` populated as citation **placeholders** — each a syntactically-complete `Insight` (`id`, `claim`, `created_at`, `evidence: [{id, doi}]`) whose placeholder Evidence carries the cited paper's DOI looked up from `work/reference/index.json#citations[<cite-key>].doi` **but no `quote:` selector yet** — LITERATURE fills those in. Each option that draws on a placeholder cites it via `Option.insights: [<insight_id>, ...]` (the back-reference that links options to prior_insights in the ASTRA grammar). `narrative:` keys updated to weave `astra-anchor:` references into prose as entries land. `astra validate astra.yaml` returns clean (Evidence with `doi:` and no `quote:` is structurally valid at this stage); `astra validate astra.yaml --verify-evidence` runs after LITERATURE has authored the quotes.
+- `universes/baseline.yaml` — selects the paper's choices, or the code's where paper and code disagree materially (per the canonical-resolution rule; see "Code-as-canonical material disagreements" under Pass B below)
+- `implementation-notes.md` — concise practical guidance for the IMPLEMENT phase: tricky algorithms, numerical gotchas, data-format quirks, things the spec can't capture. Bullets, not essays.
+- `targets/targets.md` — small target ledger COMPARE consumes: per output (already declared by ARCHITECT), a brief entry with type, priority, paper value, expected match criteria, and the path to the reference figure / table / metric (when applicable, copy the reference file into `targets/` so the directory is self-contained)
+- `CLAUDE.md` updates — append entries to **Paper-vs-code disagreements** for each material conflict surfaced
+- `constitution.md` updates — Open dimensions when something material warrants user ratification at REVIEW
+
+## Substrate skills to invoke
+
+- **`/narrative`** — narrative authoring (any of the five `narrative.{summary,inputs,methods,findings,outputs}` keys, plus decision `rationale:` fields) is owned by the narrative skill. Invoke it during the **paper pass** when authoring or extending narrative prose. The narrative skill teaches reserved entity names, the tree-path anchor grammar, the conditional-narrative requirement (which keys are required when), the five-key authoring order, paper-reproduction fidelity discipline, and the new downstream-consumer discipline (lightcone-cli#108). Do not duplicate that content.
+
+Your responsibility in this phase is the **content**: build out the `decisions:` / `prior_insights:` / `findings:` for each sub-analysis (each with its own evidence shape — detailed below), and weave `astra-anchor:` references back into the narrative as entries land. ARCHITECT already settled the structure.
+
+## The two-pass-per-sub-analysis structure
+
+For each sub-analysis (parallelizable across independent sub-analyses):
+
+### Pass A — paper pass
+
+Read the paper's section(s) covering this sub-analysis. Author:
+
+1. **`decisions:`** — every choice in this sub-analysis where a different defensible option could plausibly shift a numerical result: algorithmic methods, thresholds, statistical approaches, data selection criteria, calibration choices. Use `when`, `incompatible_with`, and `requires` constraints for non-independent decisions.
+
+   For each decision, the paper-pass authors:
+   - **Decision-level fields:** `label:` (short human-readable name), `rationale:` (the paper's stated reasoning — use `/narrative` for the prose), `default:` (the option the paper actually selects), and `options:` (the map of option entries below).
+   - **Options:** the chosen option plus any sibling alternatives the paper discusses. Each option carries `label:` (required) and an optional `description:`. Per the 0.0.10 grammar, options do **not** carry their own `rationale:` or `evidence:` block — the decision's `rationale:` covers the reasoning; paper-text evidence flows through `findings:` (for the paper's own quantitative claims) or via `Option.insights` back-references into `prior_insights:` (for citation-backed support).
+   - **Option ↔ prior_insights linkage:** when the option's support derives from cited literature, list the relevant `prior_insights:` ids in `Option.insights: [<insight_id>, ...]`. The placeholder block under `prior_insights:` (authored in step 2 below) is the back-end of this link — LITERATURE fills in the verbatim cited-paper quote later. **Scope rules** (astra-tools ≥ 0.2.9): bare ids resolve **node-locally only** — the prior_insight must be declared in the same sub-analysis as the option. For a citation declared at an ancestor scope, use explicit upward refs: `[../id]` for the parent, `[../../id]` for the grandparent, etc. (same `../` grammar as `Input.from` and `Decision.from`). The natural shape — declare each cited paper at the sub-analysis that uses it, reference with a bare id from same-scope options — keeps everything node-local and needs no `../`.
+
+   Read `.claude/guides/decision-guide.md` (in lightcone-cli's plugin bundle) for the full definition of what counts as a decision. **Only exclude pure tooling choices** (language, library, file format) and fixed constraints. A typical sub-analysis has 2–6 decisions; if a sub-analysis has fewer than 2, revisit `work/reference/index.json` and reconsider.
+
+   ```yaml
+   decisions:
+     <decision_id>:
+       label: "<short human-readable name>"
+       rationale: "<the paper's stated reasoning, weaving astra-anchors into prose>"
+       default: <chosen_option_id>
+       options:
+         <option_id>:
+           label: "<short name>"
+           description: "<optional longer description>"
+           insights: [<prior_insight_id>, ...]   # back-refs to prior_insights this option draws on
+   ```
+
+2. **`prior_insights:`** — for every `\cite{<key>}` (Path A) or rendered citation (Path B) in the paper that bears on a decision in this sub-analysis, record a **placeholder**. The placeholder is a syntactically-complete `Insight` (`id`, `claim`, `created_at`, `evidence`) whose `evidence` array contains a single Evidence entry carrying the cited paper's `doi` but **no `quote:` selector** — LITERATURE fetches the cited paper, finds the supporting quote, and writes the resolved `quote: {exact, prefix, suffix}` (+ `location: {page: N}`) onto that Evidence entry. The decision↔insight linkage is the back-reference on the option (`Option.insights`, step 1 above), not a forward link on the insight. The placeholder shape:
+
+   ```yaml
+   prior_insights:
+     <insight_id>:
+       id: <insight_id>
+       claim: "<what the cited paper supports about the decision>"
+       created_at: "<SPECIFY-iteration ISO-8601 timestamp, e.g. 2026-05-11T09:00:00Z>"
+       evidence:
+         - id: <evidence_id>
+           doi: "<DOI from work/reference/index.json#citations[<cite-key>].doi>"
+           # quote: omitted at SPECIFY time — LITERATURE fills the TextQuoteSelector in
+   ```
+
+   Evidence with `doi:` and no `quote:` is structurally valid in 0.0.10 (`quote:` is optional on Evidence); the placeholder passes `astra validate` and waits for LITERATURE to fill the quote. `astra validate --verify-evidence` should only be run after LITERATURE has resolved every placeholder.
+
+   When the citation's DOI is unresolved (`citations[<key>].doi: null` — flagged in `extraction_warnings`), a DOI-only placeholder Evidence entry cannot be written (Evidence requires exactly one of `doi` or `artifact`). In that case, omit the Evidence entry entirely, or fall back to an artifact reference if the gap will be resolved internally, and log the unresolved citation to `open-questions.md` so the user can supply the DOI at REVIEW close-out. Don't pre-emptively fetch the cited paper or guess its content; LITERATURE does that with fresh context per paper.
+
+3. **`findings:`** — paper-level claims and quantitative results scoped to this sub-analysis. Each is a full `Insight` (`id`, `claim`, `created_at`, `evidence`) with at least one paper-anchored Evidence entry: `doi:` of the target paper itself + a verbatim `quote: {exact, prefix, suffix}` (TextQuoteSelector) + a `location: {page: N}` (FragmentSelector, page from the rendered PDF). For findings tied to a specific declared output, the Evidence may use `artifact: <output_id>` instead of (or in addition to) the DOI-based quote. Pull the verbatim claims for each output's expected value from the paper text + the result loci in `work/reference/index.json`.
+
+   ```yaml
+   findings:
+     <finding_id>:
+       id: <finding_id>
+       claim: "<the paper's quantitative claim, 1–2 sentences>"
+       created_at: "<ISO-8601 timestamp>"
+       evidence:
+         - id: <evidence_id>
+           doi: "<target paper's DOI>"
+           quote:
+             exact: "<verbatim quote from the paper>"
+             prefix: "<~20–100 chars BEFORE the quote, real surrounding text>"
+             suffix: "<~20–100 chars AFTER the quote, real surrounding text>"
+           location: { page: <N> }
+   ```
+
+4. **Weave `astra-anchor:` references into the existing narrative.** ARCHITECT wrote `narrative:` prose without anchors because the entries didn't exist. Now they do — extend the narrative to point at the new `decisions:` / `prior_insights:` / `findings:` entries via the tree-path anchor grammar. Use `/narrative` for this pass; it carries the discipline.
+
+5. **Verify finding quotes against the paper source by Grep.** For each `findings:` Evidence entry with a `quote:`, Grep the paper source to confirm the `exact:` text is verbatim and the `prefix:` / `suffix:` are real surrounding text. `astra validate --verify-evidence` will run the deterministic check across every quote later (after LITERATURE resolves the `prior_insights:` placeholders); a manual Grep now catches typos and paraphrases before the code pass.
+
+### Pass B — code pass (when `work/reference/code/` exists)
+
+Read the code that implements this sub-analysis (`work/reference/code-index.md`'s natural-decomposition rows point at the relevant modules / scripts). Augment / amend:
+
+1. **Code-as-canonical material disagreements.** For each decision authored in the paper pass, locate its implementation in the code. Where paper and code disagree:
+   - **Material** = a different choice would plausibly change a numeric result the paper reports.
+   - **Stylistic / cosmetic / pure-tooling** = not material; record in `implementation-notes.md` and move on.
+
+   For **material** disagreements: take **code as canonical** per the canonical-resolution rule (the iteration runs detached; the user isn't reachable interactively). Append the conflict to CLAUDE.md's **Paper-vs-code disagreements** section AND to `open-questions.md` so the user sees it at REVIEW close-out, with the verbatim paper quote + the `path:line` code anchor + a plausible-impact one-liner ("changes the BAO peak amplitude by ~5%"). Let `universes/baseline.yaml` select the code's method. Preserve both options in the `astra.yaml` `decisions:` entry; the user can flip the baseline at REVIEW close-out.
+
+2. **Code-revealed insights and findings.** Things the code does that the paper doesn't describe (a calibration version, a cut stricter than stated, a hyperparameter the paper compressed). These earn `findings:` entries with Evidence using `artifact: <output_id>` (referencing a declared output) plus an optional `source_commit:` (the git SHA that produced it). When the insight isn't tied to a formal output, drop it into `implementation-notes.md` as a bullet rather than synthesizing a degenerate finding.
+
+3. **Decision-option augmentation.** Where the code reveals an option the paper didn't mention but is defensible (a sibling implementation alternative used in the codebase or referenced in a comment), add it as a sibling option to the relevant `decisions:` entry. Do not pre-emptively author every code variant; only the ones that bear on a real choice.
+
+### Reviewing prior SPECIFY work as part of survey
+
+There is no separate review phase. Every iteration that enters and finds a SPECIFY-filled sub-analysis on disk reads it critically before doing anything else. If you see real issues — missing decision, paraphrased quote, dropped disagreement, broken anchor — fix them inline, commit (`specify: fix <sub-analysis-id> <what>`), and exit. When a fresh-context read finds nothing to fix in a sub-analysis, the iteration moves on (next sub-analysis, or next phase if every sub-analysis is clean).
+
+The cross-check questions on entry: are the decisions covering everything material? Are the evidence quotes verbatim? Are the findings actually traceable to the paper or code? Did any material disagreement get silently dropped?
+
+#### What to check
+
+1. **Decision coverage.** Does this sub-analysis's `decisions:` block cover every choice in the paper-side index's decision clusters? Cosmetic / pure-tooling choices should NOT be decisions; anything material that's missing should be added.
+2. **Decision options.** Each decision has the option the paper selects (named in `default:`) plus any sibling alternatives the paper discusses or the code reveals. The decision-level `rationale:` is grounded in the paper's stated reasoning (or the code's, where canonical-resolution applied). Per the 0.0.10 grammar, options do not carry per-option `rationale:` or `evidence:`; cited support is back-referenced via `Option.insights` into a `prior_insights:` entry.
+3. **Evidence verification.** Every `findings:` Evidence entry uses `TextQuoteSelector` with a verbatim `exact:` quote, real surrounding-text `prefix:` / `suffix:`, and a `location: {page: N}` (1-indexed). Quotes that are paraphrased or whose `prefix:` / `suffix:` are editorial parentheticals will fail `--verify-evidence`. `prior_insights:` placeholders intentionally have `evidence: [{id, doi}]` without a `quote:` at this stage — LITERATURE authors the quotes — so do not flag a missing quote on placeholder entries. After LITERATURE resolves the placeholders, run `astra validate astra.yaml --verify-evidence`.
+4. **Findings traceability.** Each `findings:` Insight's `evidence:` resolves either to a real paper claim (target-paper DOI + verbatim `quote:` + page) or to a real declared output via `artifact: <output_id>` (with optional `source_commit:` and `snapshot:`).
+5. **Material-disagreement surfacing.** Where paper and code disagree on a material choice, the spec records both options under the relevant `decisions:` entry, `universes/baseline.yaml` selects the code's option (canonical-resolution default), and the conflict is appended to CLAUDE.md's *Paper-vs-code disagreements* section plus `open-questions.md` for the user to resolve at REVIEW close-out. Flag any material disagreement that got silently dropped, that didn't make it into the disagreements log, or where the baseline picked the paper without the canonical-resolution rule applying.
+6. **Narrative anchors.** The sub-analysis's `narrative:` weaves `astra-anchor:` references to the new `decisions:` / `prior_insights:` / `findings:` entries — the tree-path grammar must be valid, and entries actually exist at the referenced paths.
+7. **`narrative:` voice fidelity.** Hedges and qualifiers from the paper survive (per the narrative skill's discipline). Editorial commentary added beyond what the paper supports gets flagged.
+8. **No synthetic data.** Unless the paper itself uses synthetic data, every input has a real acquisition source — no mock / synthetic substitutes anywhere in the sub-analysis's inputs, decisions, or implementation-notes.
+
+Apply fixes inline as you find them — `astra.yaml`, `universes/baseline.yaml`, `implementation-notes.md`, the disagreements log in CLAUDE.md as needed. The diff against the prior commit is the record of what changed. After any change to `astra.yaml`:
+
+```bash
+astra validate astra.yaml
+astra validate astra.yaml --verify-evidence  # after LITERATURE has resolved the prior_insights placeholders
+```
+
+Commit the diff (`specify: fix <sub-analysis-id> <what>`) and exit.
+
+#### What NOT to do
+
+- **Don't flag missing `recipes:`.** Recipes are IMPLEMENT's, not SPECIFY's.
+- **Don't re-read the entire paper.** Use Grep on `work/reference/source/` (or `document.md`) for the specific claims you want to verify; lean on `work/reference/index.json`.
+- **Don't declare the sub-analysis done in the iteration where you landed fixes.** The next fresh-context iteration reads it cold; if nothing needs fixing, it moves on, which is the "done" signal.
+
+Once every sub-analysis is clean, SPECIFY produces its final artifacts (target ledger, baseline universe, implementation-notes):
+
+## Target-ledger output
+
+After every sub-analysis is filled and self-reviewed, write `targets/targets.md` as a small ledger COMPARE consumes. It is only an index, not a restatement of the spec; the depth lives in `astra.yaml`. For each `outputs:` entry across all sub-analyses (already declared by ARCHITECT), write a brief entry:
+
+- What it is (one line); the reference file's path (relative to `targets/` when the file is copied into `targets/`, or pointing at `work/reference/figures/...` when not)
+- Type: `metric` | `figure` | `table`
+- Priority: `primary` | `secondary` (from ARCHITECT's tagging)
+- Expected value / trend (paper-side); how to judge a match (numerical tolerance for metrics; shape / axis ranges / key features for figures; specific values for tables)
+- Spec home: which `analyses.<sub-id>.outputs.<output-id>` entry in `astra.yaml` this target maps to, so COMPARE can find the reproduced result at `results/<universe>/<output_id>/`
+
+Copy reference figure / table files from `work/reference/` into `targets/` so COMPARE has a self-contained reference set. For Path A, files are in `work/reference/source/` (extract by `\includegraphics{}` filename); for Path B, in `work/reference/figures/` / `work/reference/tables/`.
+
+Out-of-scope targets stay in `targets/targets.md` with an explicit reason and should not be forced into the spec.
+
+---
+
+## Other rules
+
+- **Do NOT add executable implementation code or invented run commands.** Do add concise provenance / recipe descriptions where ASTRA fields support them, especially for paper-derived calculations, figure generation, imported constants, and values that IMPLEMENT will need to regenerate.
+- **Equation and section numbers must match the rendered paper / PDF**, not a naïve count of TeX blocks or markdown headings. When citing "eq. N" or "§N", find the equation or heading by content in the rendered paper and use the printed number.
+- **Validate** with `astra validate astra.yaml` after each pass.
+- **Targeted reads, not whole-paper absorption.** Use `work/reference/index.json` and `work/reference/code-index.md` for structural lookups; Grep into `work/reference/source/` (Path A) or `work/reference/document.md` (Path B) for specific verbatim quotes; read targeted code modules under `work/reference/code/` for canonical method details. Don't re-read the whole paper or whole code base.
+- **The narrative skill is the prose author, not the structure author.** SPECIFY weaves anchors into the prose ARCHITECT wrote — the structural surface is fixed, the anchored references are SPECIFY's contribution.
+
+## Survey signals (entry into SPECIFY)
+
+- `astra.yaml` exists with stub form (sub-analyses + inputs + outputs + narrative; empty decisions / prior_insights / findings) ⇒ ready to specify
+- For each sub-analysis: `decisions:` populated with decision-level `rationale:` + options (paper's choice at `default:`); `findings:` populated as full Insight blocks with paper-anchored Evidence (DOI + `quote: {exact, prefix, suffix}` + `location: {page}`); `prior_insights:` populated as citation placeholders (`id`, `claim`, `created_at`, `evidence: [{id, doi}]` with `quote:` omitted — LITERATURE fills the quotes next); `Option.insights` back-references wired up where options draw on placeholders ⇒ paper pass done
+- For each sub-analysis: when `work/reference/code/` exists, code-pass material-disagreement entries land in `decisions:` (with both options) and `universes/baseline.yaml` selects the canonical-resolution choice; `implementation-notes.md` carries non-material gotchas ⇒ code pass done
+- For each sub-analysis: a fresh-context iteration reads the slice and finds nothing to fix ⇒ that sub-analysis is done; the next iteration moves on
+- `astra validate astra.yaml` returns clean (placeholders whose Evidence carries `doi:` without `quote:` are valid at this stage) ⇒ structural side validated; `--verify-evidence` waits until LITERATURE has authored the `quote:` + `location:` selectors
+- `targets/targets.md` exists with each entry mapped to a spec home ⇒ target-ledger done
+- `implementation-notes.md` exists ⇒ practical-guidance side done
+- All of the above ⇒ SPECIFY complete; proceed to IMPLEMENT
+
+## Notes
+
+- **Material disagreements** are appended to CLAUDE.md's **Paper-vs-code disagreements** section AND `open-questions.md`. CLAUDE.md is the at-a-glance summary every iteration sees; `open-questions.md` is the user-resolution accumulator. Both lead to the same place: the user resolves at REVIEW close-out.
+- **The narrative skill is the prose author, not the structure author.** SPECIFY's job is content correctness; `/narrative` invocation comes during the paper pass when authoring or extending the narrative prose to weave in anchor references.
+- **The target ledger is a derivation, not a separate phase's output.** Treat `targets/targets.md` as a small index produced alongside the filled `astra.yaml`, not a heavyweight artifact. The depth lives in `astra.yaml`'s `outputs:` / `findings:` / `decisions:`.
+- **Two-pass discipline is the cross-check.** Skipping the code pass (when code exists) loses the canonical-resolution surface and lets paper-vs-code material disagreements slip through. The fresh-context review can recover *some* of these but not all — the disciplined sequence (paper → code → review) catches more.
+- **Per-sub-analysis parallelism is opt-in.** When sub-analyses are independent (no shared decision blocks, no cross-sub-analysis findings), the iteration can fan out one-level-deep sub-agents (one per sub-analysis from inside its main session) to run their passes in parallel. When they share material decisions or findings (rare), serialize across iterations.
+- **Commit per sub-analysis as it lands.** Each sub-analysis's filled-in `astra.yaml` slice + its targets/implementation-notes/baseline updates earn one commit; subsequent fix passes commit separately. The next iteration reads `git log` to track progress; small commits keep the trail readable.
diff --git a/claude/lightcone/skills/lc-from-paper/templates/CLAUDE.md b/claude/lightcone/skills/lc-from-paper/templates/CLAUDE.md
new file mode 100644
index 00000000..c8ecaa25
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/templates/CLAUDE.md
@@ -0,0 +1,36 @@
+# <paper-slug>
+
+Reproduction of **<paper title>** (<arXiv ID>). DOI: <doi>. One-line subject: <e.g. "BAO scale measurement from DESI DR1">.
+
+The driving document for this reproduction is [`constitution.md`](constitution.md) — Goal, Fidelity intent, Scope, Quality bar, Evidence, Open dimensions. Every ralph iteration reads it on entry. This file (`CLAUDE.md`) is the auto-loading walk-up: rules + durable findings that stay useful past the reproduction (Open opportunities for future tightening, Paper-vs-code disagreements, pointers).
+
+## Rules
+
+- **Code-as-canonical when `work/reference/code/` exists.** Every iteration that touches a sub-analysis reads the relevant code first. Where paper and code disagree, code is canonical for numerics, plotting, and method. When `work/reference/code/` is absent, paper is the only anchor — implement fresh from the spec, expect slower convergence, surface gaps honestly to the user rather than dressing them up.
+- **Never block on `AskUserQuestion` mid-iteration.** Each ralph iteration runs in a fresh detached session; the user isn't reachable interactively. Append questions to `open-questions.md` and continue with the best-judgment default. The user resolves accumulated questions at REVIEW close-out (which runs in the user's main session).
+- **arXiv-LaTeX-first acquisition.** PDF + Docling is a fallback for non-arXiv only.
+- **`astra validate --verify-evidence`** is the fidelity gate; evidence quotes must match source PDFs.
+- **No synthetic data.** Unless the paper itself uses synthetic data as input, every input dataset must be downloaded or queried from its real source.
+- **Commit as you go.** Small, descriptive commits per significant change. The git log is the chronological trail of the reproduction; the next iteration reads it to know what landed.
+- **Updates go in code, files, and the accumulators in `constitution.md` and below — not progress notes scattered in the body.** Keep updates discoverable: the next iteration finds what changed by inspecting the system.
+
+## Paper-vs-code disagreements
+
+Material disagreements between paper and code, logged here as iterations find them. Code is canonical for numerics, plotting, and method (per the rule above); both options are preserved in `astra.yaml` as decision alternatives. Each entry summarizes the disagreement and points to the corresponding decision so any iteration can see them at a glance. Surfaced to the user at REVIEW close-out (or earlier if they're around).
+
+- (none yet)
+
+## Open opportunities
+
+Gaps that could be tightened in a future pass, surfaced by COMPARE iterations and persisted past close-out. Each carries a sense of leverage. Format: `<area> — <what could be tightened> — <leverage>`. A future Claude Code session walking into this directory reads this list and knows where another loop would have the most return. Empty until a COMPARE iteration surfaces one:
+
+- (none yet)
+
+## Pointers
+
+- [`constitution.md`](constitution.md) — Goal, Fidelity intent, Scope, Quality bar, Evidence, Open dimensions. The ralph loop's driving document.
+- `open-questions.md` — accumulated questions from iterations, resolved in REVIEW.
+- `work/reference/index.json` — paper structural index (figures, tables, outline, citations with DOIs); the starting surface for any "where in the paper does X happen" lookup.
+- `work/reference/code-index.md` — code inventory (when code present): module map, candidate decisions with file:line, entry-points, gotchas.
+- `work/cited/<doi-slug>/` — per-cited-paper substrate produced by LITERATURE for `prior_insights:` resolution.
+- <any paper-specific conventions or warnings the user surfaced during the interview>
diff --git a/claude/lightcone/skills/lc-from-paper/templates/constitution.md b/claude/lightcone/skills/lc-from-paper/templates/constitution.md
new file mode 100644
index 00000000..ef278951
--- /dev/null
+++ b/claude/lightcone/skills/lc-from-paper/templates/constitution.md
@@ -0,0 +1,45 @@
+---
+status: active
+---
+
+# <paper-slug> — reproduction constitution
+
+The driving document for the ralph loop reproducing <paper title> (<arXiv ID>, DOI <doi>). Every iteration reads this on entry to know what "done" looks like. The body **sharpens slowly** — only when something fundamental shifts (target moves, scope opens or fences, a material disagreement makes us re-think a sub-analysis); Open dimensions is updated each iteration as decisions worth user ratification surface. Durable findings that stay useful past the reproduction — paper-vs-code disagreements, open opportunities for future tightening, pointers to substrate — live in `CLAUDE.md`.
+
+## Goal
+
+<What "done" looks like for this reproduction. Concrete: which targets, what verdict against them, what validation passes. E.g.: "A complete `astra.yaml` with recipes that produce reproduced versions of <list of targets>, validated by `astra validate astra.yaml --verify-evidence`, with `comparison-report.yaml` verdict `pass` against the targets in `targets/targets.md`.">
+
+**Fidelity intent.** <The user's prose answer from ORIENT to "what do you want out of this stretch, given what you have to spend on it" — captured verbatim or in close paraphrase. Carries both the aesthetic dimension (what "good enough" looks like) and the pragmatic dimension (compute, tokens, wall-clock budget). E.g.: "just checking if the analysis is tractable — an afternoon of compute", "Figure 3 must be right; the rest can stay rough — overnight", "full fidelity on the BAO fit, baseline elsewhere — a few days", "every primary and secondary target lining up within stated tolerance, no hard deadline". Each iteration reads this when sizing its next move; COMPARE grades opportunities against it. Static once approved at ORIENT; the user can sharpen at any REVIEW.>
+
+## Scope
+
+**In scope:** <targeted figures / tables / numbers, methodological span being reproduced.>
+
+**Out of scope:** <explicit exclusions, fenced from drift.>
+
+## Quality bar
+
+What the quality bar looks like for *this* paper. The level primary-target outputs aim for when the fidelity intent calls for it:
+
+- <e.g. "BAO fit posteriors match the paper's Figure 4 within 1σ across the full damping prior range">
+- <e.g. "magnitude cuts and selection match the code's defaults exactly; any deviation is recorded as a paper-vs-code disagreement with both options preserved">
+- <e.g. "every prior insight cites a real verbatim quote from the cited paper">
+
+This is the ceiling; the fidelity intent determines which outputs need to actually reach it.
+
+## Evidence
+
+The substrate this reproduction is built against — the canonical sources iterations consult:
+
+- **Paper:** `work/reference/{paper.pdf, source/ or document.md, index.json, astra.yaml}` (from `/paper-extraction` during ORIENT). The `index.json#citations` block carries each cited paper's resolved DOI for LITERATURE.
+- **Code:** `work/reference/code/` (cloned during ORIENT; scan inventory at `work/reference/code-index.md`).
+- **Paper DOI:** <doi>
+- **arXiv ID:** <id> (if applicable)
+- **Code repo URL:** <url>
+
+## Open dimensions
+
+Decisions worth surfacing to the user — places the reproduction could go differently and the call benefits from human ratification. Iterations append here when something material comes up that isn't itself a paper-vs-code disagreement (those go to `CLAUDE.md`'s disagreements log instead). The user resolves these at REVIEW close-out, or earlier if they're around.
+
+- (none yet)
diff --git a/claude/lightcone/skills/lc-new/SKILL.md b/claude/lightcone/skills/lc-new/SKILL.md
index 78073ad9..c2db4c63 100644
--- a/claude/lightcone/skills/lc-new/SKILL.md
+++ b/claude/lightcone/skills/lc-new/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: lc-new
-description: Create a new ASTRA analysis project with integrated literature support. Scope the research question through conversation, structure outputs and decisions, search for and extract evidence from scientific papers, and build a complete astra.yaml specification. Use when starting a new analysis, when the user says "new project", "new analysis", or "scope". Triggers on "new", "scope", "research question", "start analysis".
+description: Use this skill whenever the user starts a new ASTRA analysis from a research question — scoping the question, structuring inputs and outputs, identifying decisions through literature, and landing astra.yaml + project CLAUDE.md. Triggers on verbs (`new`, `start`, `scope`) combined with nouns (`analysis`, `project`, `question`, `research`) — e.g. "new analysis", "start project", "scope research question" — even if the user doesn't say "project" explicitly. Don't use this for working inside an existing ASTRA project; this is for fresh scoping only.
 allowed-tools: Read, Write(astra.yaml), Write(universes/*), Write(CLAUDE.md), Edit(astra.yaml), Edit(universes/*), Edit(CLAUDE.md), Glob, Grep, Bash(astra:*), Bash(lc:*), WebSearch, WebFetch, AskUserQuestion, Agent
 ---
 
@@ -8,11 +8,6 @@ allowed-tools: Read, Write(astra.yaml), Write(universes/*), Write(CLAUDE.md), Ed
 
 Create a new ASTRA analysis project through conversation. Build the spec iteratively -- write to `astra.yaml` after each phase so the user sees progress. Literature search and decision identification happen in distinct phases -- talk first, then extract papers, then identify decisions informed by both conversation and literature.
 
-## References
-
-- [ASTRA Reference](../../guides/astra-reference.md) -- spec structure, decision identification, recipes, universes
-- [lightcone-cli Reference](../../guides/lightcone-cli-reference.md) -- `lc` workflow for the implementation phase that follows scoping
-
 ## Setup
 
 1. Read `astra.yaml` if it exists (to understand context or avoid overwriting)
@@ -40,7 +35,7 @@ Stage banner: ANALYSIS STRUCTURE
 
 > "Walk me through your analysis step by step. What goes in, what comes out at the end?"
 
-**Guidance on sub-analyses:** Analyses should only be split into multiple sub-analyses if each sub analysis genuinely has materially different inputs and outputs, and if the scope may be too broad if there is just one analysis; we overall want a sub-analysis to feel like it should genuinely be a self-contained product. For example, training + evaluation would typically be one analysis, because the product would be the trained and validated neural network estimator. When in doubt, opt for a single analysis at this stage. If it does need to be multi-stage, ask the user for confirmation and how to split it. For multi-stage analyses, make sure you confirm stage boundaries. See `.claude/guides/astra-reference.md` for YAML structure and sub-analysis guidance.
+**Guidance on sub-analyses:** Split an analysis into multiple sub-analyses only if each sub-analysis has materially different inputs and outputs and a single analysis would be too broad in scope; each sub-analysis should feel like a self-contained product. For example, training + evaluation would typically be one analysis, because the product is the trained and validated neural network estimator. When in doubt, opt for a single analysis at this stage. If it does need to be multi-stage, ask the user to confirm the split and the stage boundaries. Invoke `/astra` for YAML structure and sub-analysis guidance.
 
 **One output per output.** Each output should be a single metric, a single plot, or a single artifact. Do not bundle multiple metrics into one output (e.g., "performance_metrics" containing accuracy, F1, and AUC). Each of those is its own output. Same for plots -- one figure per output.
 
@@ -79,7 +74,7 @@ Write extracted prior insights to astra.yaml immediately. Synthesize them by top
 
 ### Decision Identification
 
-Use the conversation and literature to identify decisions. Apply the decision criteria from [astra-reference.md](../../guides/astra-reference.md):
+Use the conversation and literature to identify decisions. Apply the decision criteria from `/astra` (Decisions section):
 
 - What could be done differently and still be defensible?
 - Where did papers disagree or compare alternatives?
@@ -118,6 +113,8 @@ Stage banner: FINALIZING
 astra universe generate -n baseline
 ```
 
+Generate only `baseline` unless the user explicitly asks for additional universes.
+
 ### Populate Narrative
 
 Replace the TODO entries in `astra.yaml`'s `narrative:` block now that structure is stable: `summary` (one-paragraph framing), `methods` (decisions and sub-analyses), `inputs`, `outputs`. Use `#path.to.element` anchors for cross-references. Leave `findings` as TODO until results exist.
diff --git a/claude/lightcone/skills/narrative/SKILL.md b/claude/lightcone/skills/narrative/SKILL.md
new file mode 100644
index 00000000..4cb3c5d7
--- /dev/null
+++ b/claude/lightcone/skills/narrative/SKILL.md
@@ -0,0 +1,228 @@
+---
+name: narrative
+description: >
+  Authors prose throughout an `astra.yaml` — analysis-level
+  `narrative:` blocks (five fixed keys: `summary`, `findings`,
+  `methods`, `inputs`, `outputs`), decision `rationale:` fields, and
+  shorter `description:` / `notes:` prose on individual entities. The
+  five-key narrative is the most substantive case; the same
+  architectural and syntactical frame applies wherever prose appears
+  in the spec.
+  Always written against an existing `astra.yaml`; what differs
+  between modes is the second source paired with the spec — an
+  authoritative text (paper reproduction), project artifacts
+  (retrofit), or dialogue with the user (co-drafting). Triggers on
+  "narrative", "draft the narrative", "narrate this analysis",
+  "rationale for this decision", "write the summary", "describe this
+  input", or any request for reader-facing prose keyed off an
+  `astra.yaml`.
+---
+
+# narrative
+
+This skill covers prose authoring across an `astra.yaml`. The prose surfaces are:
+
+- **Analysis `narrative:` blocks** — five keys (`summary`, `inputs`, `methods`, `findings`, `outputs`) on each analysis and sub-analysis.
+- **Decision `rationale:` fields** — one paragraph per decision.
+- **Per-entity prose** — shorter `description:` / `notes:` on individual inputs, outputs, options, insights.
+
+ASTRA's structural content surfaces alongside the prose in renderers like lightcone-ui. **Prose does not duplicate the structure** — it cites into it. An anchor is a citation; a sentence pointing to a decision is a small argument; prose is the layer where decisions, sub-analyses, findings, and outputs become a connected story.
+
+## Modes
+
+Prose cites the spec's structure (decisions, findings, outputs, sub-analyses) by anchor, so the structure must exist when the prose lands: write the spec first, write both concurrently, or revise narrative after spec changes settle.
+
+There are three modes, distinguished by what's available beyond the spec itself. Every mode draws on the under-construction `astra.yaml`; what differs is the **second source** paired with it.
+
+| Mode | Second source | Status | Reference |
+|---|---|---|---|
+| **Paper reproduction** | An authoritative text source (paper, thesis, technical report, …) | Ready | [`references/paper-reproduction.md`](references/paper-reproduction.md) |
+| **Retrofit** | Project artifacts — code, notebooks, fibers, commit history | Stub | [`references/existing-analysis.md`](references/existing-analysis.md) |
+| **Co-drafting** | The user, in conversation | Stub | [`references/co-drafting.md`](references/co-drafting.md) |
+
+If the second source isn't obvious from context, ask: is there an authoritative text (paper, thesis, technical report) to draw from? If not, are we harvesting from existing artifacts, or working from the user's own framing? Hybrid is allowed — a reproduction with co-drafted extensions, a retrofit with co-drafted gap-filling.
+
+The rest of this file is the mode-independent substrate every reference relies on. Read it through, then open the matching reference.
+
+---
+
+## The five keys
+
+| Key | What it carries | Required when |
+|---|---|---|
+| `summary` | Question, scope, headline shape — the only key without a structural peer. | optional in the schema, but should always exist |
+| `inputs` | Provenance — the data the analysis rests on. | `Analysis.inputs` is non-empty |
+| `methods` | Pipeline walk; cite each decision and sub-analysis by anchor. | `Analysis.decisions` or `Analysis.analyses` is non-empty |
+| `findings` | Synthesis of declared findings; each cited by anchor. | `Analysis.findings` is non-empty |
+| `outputs` | Which artifacts were promoted, and where they go downstream. | `Analysis.outputs` is non-empty |
+
+`astra validate` enforces the right column. **Narrate what you declare:** if `findings:` is empty, `narrative.findings` should not appear. A stub analysis with only `summary` is valid.
+
+A decision's `rationale:` is its own one-paragraph slot — what was decided, the insight that motivated it (cite by anchor), and what the load-bearing alternative was and why it lost. The alternatives themselves live in the options structure.
+
+## Length
+
+1–3 paragraphs per key, at any level (root, sub-analysis, decision).
+
+Length is the mechanism that keeps analyses modular, not a style preference. **If references don't fit in three paragraphs, the analysis is too big — split it.** The narrative is a compressor; if it won't compress, split the thing being compressed.
+
+## Anchors
+
+Markdown link syntax with `#`-target, **tree-path-first** — same grammar as decision `from:` references.
+
+| Target | Anchor |
+|---|---|
+| Input | `#inputs.<id>` |
+| Output | `#outputs.<id>` |
+| Decision | `#decisions.<id>` |
+| Option within a decision | `#decisions.<id>.options.<opt>` |
+| Finding | `#findings.<id>` |
+| Prior insight | `#prior_insights.<id>` |
+| Sub-analysis (whole node) | `#analyses.<sub>` |
+| Element inside sub-analysis | `#<sub>.<category>.<id>` |
+| Parent scope (from a sub-analysis) | `#../decisions.<id>` |
+
+The sub-analysis form is **sub-analysis first, then category**: `#reconstruction.decisions.algorithm`, not `#decisions.reconstruction.algorithm`. References resolve relative to the hosting analysis; use `../` to escape to parent scope.
+
+Rules:
+
+- Anchor text is **authored prose**, not the raw id.
+- Inline references do the work of a citation; don't footnote or parenthesize.
+- One reference per idea. Stacking three on a sentence means the sentence carries too much.
+- Prior insights motivate decision options via `decisions.<id>.options.<opt>.insights:`. Findings cannot appear there (validator-enforced); if a finding motivates a decision, cite it from the decision's `rationale:` prose.
+
+### Reserved IDs
+
+These names cannot be used as entity IDs (they collide with the anchor grammar): `inputs`, `outputs`, `decisions`, `findings`, `prior_insights`, `analyses`, `options`, `content`, `narrative`. The validator rejects them.
+
+## Data flow
+
+Make the data-flow linkage navigable in the prose itself. Anchors are the trail — a reader follows the flow inline, without leaving the narrative.
+
+1. **`narrative.outputs` says where each output goes next.** A sub-analysis's outputs are usually consumed by other sub-analyses or roll up into root findings. When you write the `outputs` prose, name those downstream destinations by anchor. Example, in the `reconstruction` sub-analysis's `outputs` key:
+
+   > *"`xi_post_recon_lrg1` feeds [the post-reconstruction BAO fit](#analyses.bao_fit.outputs.bao_fit_post_iso_ap_lrg1) and supports the [headline detection finding](#findings.bao_detection_chi2_lrg1)."*
+
+   Anchor downstream consumers where you can. When no anchor is reachable from the current scope (typically a sibling sub-analysis), bare `<analysis>.<output>` text is acceptable.
+
+2. **The root narrative is the end-to-end view.** When the project has sub-analyses, the root analysis's `methods` (or `summary`) traces the pipeline from raw inputs to final outputs — as much overview as fits in a few paragraphs. The root is the place a reader can land cold and get the shape of the work; details telescope into the sub-analyses. A condensed example:
+
+   > *"raw catalogs → [reconstruction](#analyses.reconstruction) → [clustering](#analyses.clustering) → root [BAO fit](#outputs.bao_fit_post_iso_ap_lrg1)."*
+
+## Validation
+
+```sh
+astra validate astra.yaml
+```
+
+- **Broken references** → error. Anchor doesn't resolve to a real id.
+- **Uncited declared elements** → warning. Every declared finding, decision, output, and sub-analysis should be cited somewhere in the narrative tree. If an element genuinely isn't worth a prose mention, consider whether it should be declared at all.
+- **Conditional coverage** → error. A narrative key is required whenever its structural counterpart is non-empty (the "Required when" column of the five-keys table).
+
+## User presence
+
+Multi-turn back-and-forth → user present; use `AskUserQuestion` to clarify mode, scale, and any mode-specific framing before drafting. Single-shot or pipeline invocation → autonomous; make the reasonable default inference and note it inline in the narrative. Ambiguous → assume the user is present and ask.
+
+---
+
+## Craft
+
+- **Economy.** Every sentence introduces a new idea or sharpens an existing one. Release real verbs: `conducted cross-correlation` → `cross-correlated`.
+- **Anchor text is prose, not an id.** `[the post-reconstruction catalogs](#analyses.reconstruction)`, not `[reconstruction](#analyses.reconstruction)`.
+- **One reference per idea.** Three anchors on one sentence means the sentence carries too much; split it or drop one.
+- **Specificity.** Names, numbers, references over generic claims.
+- **Arrive through content.** No "in this analysis we will describe…"; the content is the opening.
+
+### Real subjects, real verbs
+
+"We measure the BAO peak with the LRG sample" reads as agency. "The measurements of the BAO peak reveal a 7σ detection" reads as zombie-noun abstraction. The test: can you picture someone or something physically doing the verb? If not, rewrite.
+
+Valid subjects:
+
+- **We** — for decisions and actions ("we chose the Gaussian damping prior")
+- **The thing itself** — for states and properties ("the covariance is dominated by shot noise")
+- **Passive voice** — when the actor is obvious ("a redshift cut is applied")
+- **Results / data as epistemic subjects** — for what the data shows ("the measurement shows a 7σ peak"; "Figure 2 reveals…")
+- **Physics doing physics** — for physical processes ("lensing distorts shapes"; "higher-order effects produce B-modes")
+
+Anthropomorphized abstractions fail the test: "the methodology validates," "this analysis demonstrates," "the catalogue evolution follows." Rewrite to a real subject doing a real verb.
+
+## Anti-patterns (mode-independent)
+
+- **Wiki-style what-is framing.** "BAO is the baryon acoustic oscillation feature." A wiki summarizes; an ASTRA narrative points into reasoning. Replace with the load-bearing statement and an anchor: "we chose the Gaussian BAO damping prior over flat because flat admitted spurious minima — see [the prior comparison](#decisions.bao_damping_prior)."
+- **Decision-list paragraph.** "We made the following decisions: A, B, C." Cite each decision where it shapes the pipeline, not as recitation. Too many to weave coherently → the spec wants more sub-analyses.
+- **`summary` as primer.** Teaching what the field is. Readers arrive with context.
+- **Drafting `findings` on a sub-analysis with no declared findings.** Skip the key.
+- **Narrative-per-element.** Writing `narrative:` on findings, inputs, outputs, or insights. The five-key analysis narrative is the only home; per-element prose is `description` / `rationale` / `notes`.
+
+Mode-specific anti-patterns live in each mode's reference.
+
+---
+
+## Self-contained example
+
+A minimal (not necessarily valid) sketch showing how the blocks fit together. The point is the *shape*.
+
+```yaml
+id: example_analysis
+version: "0.1.0"
+name: "Example analysis"
+
+narrative:
+  summary: |
+    We measure <quantity> in <sample>.  The feature is
+    [detected at high significance](#findings.headline_detection) and
+    [exceeds prior precision by 1.2×](#findings.precision_improvement),
+    with [an anomalous feature at <location>](#findings.anomaly)
+    motivating follow-up.
+
+  inputs: |
+    Primary data are [the <dataset>](#inputs.primary_data); validation
+    uses [<mocks>](#inputs.validation_mocks).
+
+  methods: |
+    The pipeline runs in two stages.  [Preparation](#analyses.preparation)
+    ingests the raw catalog and produces
+    [cleaned two-point statistics](#preparation.outputs.clean_stats).
+    [Fitting](#analyses.fitting) consumes those statistics and fits model parameters.  Both stages
+    inherit the parent's [fiducial cosmology](#decisions.fiducial_cosmology)
+    so the distance-redshift relation is used end-to-end.
+
+  findings: |
+    Three findings constitute the result: a
+    [headline detection](#findings.headline_detection), a
+    [precision comparison with prior work](#findings.precision_improvement),
+    and [an anomalous feature](#findings.anomaly).  The anomaly is the
+    most-discussed qualitative feature.
+
+  outputs: |
+    Two artifacts are promoted to the top level:
+    [the final measurement table](#outputs.final_table) and
+    [the headline figure](#outputs.headline_figure), both produced by
+    [fitting](#analyses.fitting).
+
+decisions:
+  fiducial_cosmology:
+    label: "Fiducial cosmology"
+    rationale: |
+      Planck 2018-ΛCDM is the community reference; distance-redshift
+      conversion is downstream of this choice, and fixing it lets
+      results be compared directly to prior measurements.  Inherited by
+      [fitting](#analyses.fitting) so the end-to-end chain uses one
+      distance scale.
+    default: planck2018
+    options:
+      planck2018:
+        label: "Planck 2018-ΛCDM"
+      wmap9:
+        label: "WMAP9"
+        excluded_reason: "Superseded; no longer the community reference."
+```
+
+For a canonical reproduction narrative in context, see `Reproductions/DESI/desi-dr1-bao/astra.yaml` in the [LightconeResearch/Reproductions](https://github.com/LightconeResearch/Reproductions) repo.
+
+---
+
+## Now read the mode reference
+
+Open the reference file that matches the user's situation. Each carries the mode's draft order, mode-specific moves, critique pass, and mode-specific anti-patterns.
diff --git a/claude/lightcone/skills/narrative/references/co-drafting.md b/claude/lightcone/skills/narrative/references/co-drafting.md
new file mode 100644
index 00000000..1deee5b6
--- /dev/null
+++ b/claude/lightcone/skills/narrative/references/co-drafting.md
@@ -0,0 +1,79 @@
+# Co-drafting mode (stub)
+
+> **Status: under development.** Use paper reproduction (the default flow when a paper exists) when applicable. This file names what's distinct about co-drafting and the open questions; it isn't yet production guidance.
+
+The narrative is being drafted in dialogue with the user, against an `astra.yaml` whose structure already exists. There's no paper to harvest from and no body of code or fibers to mine; the spec's structure is the only artifact, and the user is the source for everything the structure doesn't already carry.
+
+This mode covers a spectrum:
+
+- **Fresh scoping.** `astra.yaml` was scaffolded by `/lc-new` (or by hand); decisions and outputs are sketched but the analysis hasn't run. Narrative drafted against intent, not results.
+- **Live in-flight research.** Work is happening; data is coming in, decisions are settling, results are landing. Spec moves between conversations, narrative moves with it.
+- **Newly-stable analysis.** Work has finished or paused; the user wants to write a narrative for what they did. No paper, no fibers — they remember it, and that memory is the source.
+
+Pure greenfield (no `astra.yaml` at all) isn't a coherent narrative-skill task — there's nothing to cite into. If a user is at that stage, route them to `/lc-new` to scaffold structure first.
+
+## What's distinct from paper reproduction
+
+- **Source is conversation, not prose.** The paper-reproduction harvest move (paraphrase from a written source) doesn't apply. Draft moves come from dialogue with the user — `AskUserQuestion` when several framing questions land together, prose follow-ups when one question opens the next.
+- **Voice depends on stage.** Reproduction is always declarative ("The pipeline runs in…"). Co-drafting voice tracks where the work is: present tense for live work, past tense for completed steps, provisional markers when content is volatile.
+- **Spec and narrative move together.** In reproduction the spec is fixed (or close to it) and the narrative reconstructs the paper. In co-drafting the spec may shift between drafts; expect to revisit narrative when a decision lands or a sub-analysis splits.
+
+## The ask-first discipline
+
+Co-drafting is the one mode where authoring without asking produces fiction. The user is available; ask. Surface the load-bearing reads before drafting — `AskUserQuestion` when several land together, single questions or prose follow-ups when the conversation wants its own rhythm:
+
+- **Research question.** What are you trying to learn? One sentence.
+- **Current headline finding** (if any). What's been established so far? One sentence; a gesture is fine.
+- **Movement so far.** What pivots, abandoned options, surprises belong in the record?
+- **Implications.** What would you claim today about what this means? Premature strong claims aren't required; honest gestures are.
+
+The user's framing is the substrate. Don't draft around a guess at it.
+
+## Provisional voice
+
+When content is moving, make incompleteness visible. Three moves:
+
+**Phrasing carries confidence.** Not "we constrain X to 3%"; rather "our current best constraint on X is 3%, pending validation of the covariance in [reconstruction](#analyses.reconstruction)." Hedge what's uncertain; claim what's settled.
+
+**Explicit markers.** At the top of `summary` (or any volatile key), an italic note:
+
+```yaml
+summary: |
+  _(Provisional — revisit after bao_fitting. Last updated 2026-04-23.)_
+  We are measuring the BAO scale...
+```
+
+The `_(Provisional)_` prefix is a convention, not a spec field. It reads as expected-to-change without breaking the narrative shape.
+
+**Decision rationales can be open.** "We are currently running with option X, pending validation of Y. See [[fiber-or-sub-analysis]]." A `rationale:` doesn't have to be retrospective.
+
+When work stabilizes (a paper draft lands, results publish), revise into reproduction voice — past tense, declarative, scope clear. Co-drafting was scaffolding; the final narrative reads as a stable artifact.
+
+## Open questions before this is production-ready
+
+- **Provisional markers — convention or schema?** Today they're prose conventions (`_(Provisional)_`); whether they belong as structured metadata is open.
+- **What's a `tempered`-style flag for narrative?** `tempered: true` on fibers signals "solid enough to build on." A narrative-level analog could let renderers display freshness state.
+- **Anchor coverage for elements that don't exist yet.** "Once [reconstruction](#analyses.reconstruction) is run, we expect X." The validator currently requires anchors to resolve — co-drafting may need a "planned" sub-analysis form, or the prose may need to avoid forward-anchoring entirely.
+- **Boundary with `/lc-new`.** `/lc-new` does conversational scoping but defers narrative ("filled in later, once structural pieces have settled"). When does the user finish `/lc-new` and switch to `/narrative` for the prose pass? Unclear today.
+- **Boundary with retrofit.** A user co-drafting a narrative for completed work is reaching for the same artifacts retrofit mines. The line between "harvest from your own memory" (co-drafting) and "harvest from artifacts you produced" (retrofit) is fuzzy when the user is the artifacts' author.
+
+## Pointers when authoring today
+
+The substrate from SKILL.md applies in full: five keys, length cap, anchor grammar, reserved IDs, data flow, validation, craft. What changes is the *source* of content (dialogue) and the *voice* (provisional where moving).
+
+- Use first-person plural and present tense for live work; past tense for completed steps.
+- Hedge when uncertain; claim when confident. Over-hedging is its own failure mode.
+- Mark sub-analyses that don't exist yet with provisional language rather than fake anchors.
+- Inverted draft order can help: write `summary` first as a stub (to fix intent), then draft the rest, then return to `summary` last to revise. This is the opposite of reproduction's compress-last because the substrate is moving.
+
+## Anti-patterns (co-drafting-specific)
+
+- **Solo drafting.** The user is available; ask before guessing motivation, headline finding, or implications.
+- **False completeness.** Writing in reproduction voice ("we measure," "we constrain") when the measurement is in flight. Use "we are measuring" / "our current constraint is X, pending Y."
+- **Provisional everywhere.** If every sentence is hedged, the narrative reads as afraid of itself. Hedge the genuinely uncertain claims; state the settled ones plainly.
+- **Stale markers.** A "revisit after X" comment left in place after X has landed is worse than no marker at all. Revise on each touch.
+- **Over-committing to implications.** Promising what results will mean before they land. A gesture is honest; a claim before evidence is not.
+
+## Report friction
+
+If you hit co-drafting cases this stub doesn't cover, file a fiber or GitHub issue against `lightcone-cli` with `narrative` in the title so the next pass can firm this up.
diff --git a/claude/lightcone/skills/narrative/references/existing-analysis.md b/claude/lightcone/skills/narrative/references/existing-analysis.md
new file mode 100644
index 00000000..a28cba29
--- /dev/null
+++ b/claude/lightcone/skills/narrative/references/existing-analysis.md
@@ -0,0 +1,50 @@
+# Existing-analysis retrofit mode (stub)
+
+> **Status: under development.** Use paper reproduction (the default flow when a paper exists) when applicable. This file names what's distinct about retrofit and the open questions; it isn't yet production guidance.
+
+A project has been running — code, results, partial spec, no published paper — and is being imported into ASTRA. The `astra.yaml` has been built (or is being built); the narrative is reconstructed from the artifacts that produced the work, not from a written source. Retrofit is **harvest from artifacts**; co-drafting is **harvest from conversation**; reproduction is **harvest from a paper**.
+
+## What's distinct from paper reproduction
+
+- **No source narrative.** The five-key shape has to be assembled from
+  what the artifacts carry: README, CLAUDE.md, fibers, notebook cells,
+  code comments, commit messages, meeting notes, old proposals, issues,
+  closed PRs.
+- **Triage comes first.** Sub-analyses and decisions classify as live /
+  superseded / abandoned / unclear. The narrative speaks for live content
+  by default; abandoned and superseded only appear if the user wants
+  history surfaced.
+- **Gaps are explicit.** When a decision's original rationale isn't
+  recoverable from artifacts and the user can't reconstruct it, the
+  honest move is to say so — `_(Reconstructed YYYY-MM: original rationale
+  not recorded.)_` — not to fabricate a plausible justification.
+- **Past tense for what happened.** Present tense only for living
+  structure ("the pipeline runs three stages").
+
+## Open questions before this is production-ready
+
+- **What's the canonical artifact harvest?** README, fibers, notebooks,
+  commits, PR threads — order, depth, when to stop. Real retrofit cases
+  will vary widely; the skill needs a default ordering and the criteria
+  for going deeper.
+- **How aggressive is `AskUserQuestion`?** A retrofit on a year-old
+  project may have a researcher who remembers some decisions but not
+  others. Where's the line between asking and reconstructing?
+- **History sections.** When abandoned options are load-bearing
+  ("we tried X for six months, switched to Y"), they belong in
+  movement-of-learning. Routing: new sub-analysis with `excluded:` /
+  `lifecycle: abandoned`? Inline marker in `methods`? No firm answer.
+- **Voice for reconstructed content.** `_(Reconstructed)_` works
+  inline. Whether reconstructed-vs-original needs structural distinction
+  in the spec, or stays a prose convention, is open.
+
+## When retrofit shifts modes
+
+- **Becomes reproduction.** If the project is reproducing an unacknowledged paper, switch to paper reproduction (the default flow) for the parts that map. Hybrid is fine.
+- **Becomes co-drafting.** If retrofit surfaces that core decisions are still open and the user wants to revisit them now, switch to co-drafting mode for those sections (provisional voice, revisit after decisions land).
+
+## Report friction
+
+If you hit retrofit cases this stub doesn't cover, file a fiber or
+GitHub issue against `lightcone-cli` with `narrative` in the title so
+the next pass can firm this up.
diff --git a/claude/lightcone/skills/narrative/references/paper-reproduction.md b/claude/lightcone/skills/narrative/references/paper-reproduction.md
new file mode 100644
index 00000000..a5fa9d1f
--- /dev/null
+++ b/claude/lightcone/skills/narrative/references/paper-reproduction.md
@@ -0,0 +1,118 @@
+# Paper reproduction mode
+
+An authoritative text source exists — most often a published paper, but also a thesis, technical report, posted preprint, or other canonical account of the work. Reconstruct its narrative into ASTRA's five-key shape, drawing on **the text and the under-construction `astra.yaml` as paired sources**: the text carries the claims and the confidence register; the spec carries the structural decomposition (which decisions are nodes, which findings are nodes, where sub-analyses sit). Neither is sufficient alone.
+
+The spec may be stable, in flux, or both — paper-reproduction often runs concurrently with spec refinement. The narrative tracks both: when a decision is added, write its `rationale:`; when a sub-analysis splits, draft its five keys; when a finding is declared, fold it into the parent's `findings` synthesis.
+
+Read the main SKILL.md first. This file adds what's specific to reproduction.
+
+## Where the source text lives
+
+The skill expects `work/reference/` to exist — the standardized output of [`/paper-extraction`](../../paper-extraction/SKILL.md). If it doesn't, run `/paper-extraction` first. The predictable shape:
+
+- `work/reference/paper.tex` (Path A — symlink to main `.tex`) **or** `work/reference/document.md` (Path B — Docling output)
+- `work/reference/index.json` — section outline with line numbers, figures, tables, citation locations
+- `work/reference/astra.yaml` — the paper as an ASTRA artifact (claimed findings as ASTRA findings)
+- `work/reference/figures/`, `work/reference/tables/`, `work/reference/source/` (Path A only)
+
+If no authoritative text is accessible at all, this isn't reproduction — fall back to `references/existing-analysis.md` or `references/co-drafting.md`.
+
+## Paper-to-ASTRA mapping
+
+Write this down before drafting a sentence.
+
+| Paper element | ASTRA home |
+|---|---|
+| Abstract | `summary` |
+| Introduction (motivation, related work) | `summary` + `findings` intro |
+| Methods section N | the matching sub-analysis's `narrative.methods` |
+| Results | structural `findings.<id>` claims; narrative intro in `findings` |
+| Discussion | `findings` narrative + `summary` implications |
+| Conclusions | reinforces `summary` |
+| Figures / tables | `outputs.<id>` — referenced in `findings` via anchors |
+| "We chose X because Y" sentences | the relevant decision's `rationale:` |
+
+Not every text maps cleanly section-to-sub-analysis. When it doesn't, the sub-analysis DAG in `astra.yaml` is authoritative: narrate according to the DAG, harvesting the source text's prose for content. If the spec deliberately reorganized relative to the text, say so briefly in `methods`.
+
+## Workflow
+
+### 1 · Orient
+
+Read both sources before drafting. The spec carries the structural decomposition; the text carries the claims.
+
+1. **`astra.yaml` at the project root** — whole file. Note `inputs`, `outputs`, `decisions`, `findings`, `analyses`, existing `narrative:`. Notice which of the five keys are present vs. empty.
+2. **Each sub-analysis `astra.yaml`** — skim decisions (inherited vs. local), findings, outputs, existing narrative.
+3. **The source text** — abstract, intro open/close, methods section headers, discussion, conclusions. Read full sections when drafting the corresponding ASTRA piece. Use `work/reference/index.json` to navigate; the parsed `paper.tex` (Path A) or `document.md` (Path B) is the primary source.
+4. **Project `CLAUDE.md` and any working notes** — paper-specific conventions, gotchas, scope decisions.
+
+If the user is present, surface the orienting questions — `AskUserQuestion` is useful when several land together; one question at a time is fine when only one is open:
+
+- **Scale:** top-level, a specific sub-analysis, or a decision's `rationale:`?
+- **Pure reproduction, or with reproducer extensions** (e.g., the reproduction's covariance differs from the posted table)?
+- **Approach:** start with one specific question (a methods subsection, a particular figure's choices, a discussion claim worth tracing into the decisions), or one-shot the whole narrative? The answer sets the session shape.
+
+### 2 · Draft order
+
+Not `summary` first. `summary` compresses the rest; draft it last.
+
+1. **`inputs`** — shortest. Name the data and its provenance. One short paragraph. Let the inputs structure carry the dataset detail.
+2. **`methods`** — walk the pipeline in DAG order. Cite each sub-analysis and decision by anchor as part of the argument, not as an enumeration. If too many to weave coherently, the analysis wants more sub-analyses. Inheritance that propagates across sub-analyses gets called out explicitly because it's load-bearing end-to-end. A pivot the paper narrates ("we initially tried X, but…") is cheap to preserve because of telescoping.
+3. **`findings`** — only if findings are declared structurally. Synthesize how they relate; each cited by anchor, not enumerated.
+4. **`outputs`** — thin. Which artifacts were promoted and why; cite the sub-analysis that produced them; name downstream consumers (see Data flow in SKILL.md).
+5. **`summary`** — last. 1–2 paragraphs. Open with the question and the headline finding; thread motivation, method, and implications. No primer material.
+
+For each decision, write a one-paragraph `rationale:`: what was decided, the prior insight that motivated it (cite by anchor), what the load-bearing alternative was and why it lost.
+
+For sub-analyses, same order, same length target.
+
+**Conditional keys.** Only include keys whose structural counterpart is non-empty. A reconstruction sub-analysis with no findings gets `summary`, `methods`, `inputs`, `outputs` — no `findings`.
+
+### 3 · Reproduction-specific moves
+
+- **Tell the author's story by default.** The narrative reproduces what the paper says, restated within the ASTRA structure — anchored to what's referable in the spec (decisions, findings, prior insights). Decision rationales come from the paper's "we chose X because Y" sentences, not invented post-hoc.
+- **Paraphrase, don't lift.** Restate the paper's claims in your own structuring rather than copying sentences verbatim — verbatim quotation calls authorship into question. Preserve meaning and confidence register; don't sharpen or soften (if the paper says "we detect," don't write "we strongly detect"; if it hedges, preserve the hedge).
+- **Two sources, paired.** The authoritative text carries claims, confidence register, and sequence. The under-construction `astra.yaml` carries the structural decomposition. Draft against both; let the spec's structure shape what each key covers, and let the text shape what's said.
+- **When the reproduction's results differ, adapt — and flag.** Where the reproduction landed on different findings (a covariance that diverges from the posted table, a coefficient with different precision, a null where the paper claimed detection), the narrative needs to report what was actually found, not what was claimed. This wants human input on phrasing; surface the divergence to the user rather than papering over it.
+- **Voice seams.** When reproducer-specific content enters the narrative, mark the transition. *"During reproduction we found the published covariance differs from the posted table"* is a seam; the sentence before it can speak in the paper's voice, the sentences after it speak in the reproducer's. A sentence that silently mixes them confuses both.
+- **Walk the paper's sequence in `methods`.** Traverse sub-analyses in DAG order — and the DAG order should match the paper's section order. If the spec deliberately reorganized (split one section into two sub-analyses, or merged two sections into one), name the deviation briefly in `methods`. Don't reorder silently.
+- **Published = done.** Reproduction narrative is declarative and present-tense, matching the paper's voice ("The analysis is organised as…", "The pipeline runs in…"). Not "we are measuring."
+- **Scope-limited reproductions.** Real-world reproductions often cover a subset of the paper (e.g., DESI BAO reproducing only LRG1+LRG2). Name the scope in `summary` so a reader knows what's in and out.
+
+### 4 · Critique pass
+
+Run all four audits before declaring the narrative done.
+
+**Fidelity audit.**
+
+- Claims match the paper, **except where reproduction results actually differ.** If the reproduction landed on different findings, the narrative reports what was found — and the divergence has been surfaced to the user for phrasing input, not silently softened or sharpened.
+- Voice seams marked where reproducer content enters.
+- Rationales traceable to the paper's justifications or to a prior insight in the spec.
+- No invented citations. Every anchor resolves to a real spec id.
+- Scope (what's reproduced, what isn't) stated in `summary` if narrower than the paper.
+
+**Sequence audit.**
+
+- `methods` walks sub-analyses in DAG order; DAG order matches the paper's narrative sequence (or the deviation is named in prose).
+- `summary` opens with the question, not a field primer.
+
+**Anchor coverage audit.**
+
+- `astra validate` warns on any declared finding / decision / output / sub-analysis not cited in the narrative. Review the warnings; either cite the element or reconsider whether it should be declared at all.
+
+**Structural-peer-redundancy audit.**
+
+- Citations woven into argument, not recited as a list.
+- `findings` narrative synthesizes relationships between findings; `inputs` narrative names provenance. Neither catalogs fields.
+
+## Anti-patterns (reproduction-specific)
+
+- **Lifting verbatim.** Copy-pasting abstract sentences into `summary`. Paraphrase — otherwise the narrative reads as a citation of itself.
+- **Adding implications the paper didn't make.** Fidelity cuts both ways.
+- **Eliding the reproducer's voice entirely.** If the reproduction caught something the paper missed, name it with the seam.
+- **Treating paper sections as sub-analyses.** A paper's Section 3.2 isn't automatically a sub-analysis; the DAG is the authority.
+- **Listing instead of weaving.** Narrate each decision where it shapes the pipeline. Too many to weave coherently → the spec wants more sub-analyses.
+
+## When reproduction shifts modes
+
+- **Hybrid with co-drafting.** If the reproduction adds a sub-analysis the paper didn't have (a reproducer-specific extension), that sub-analysis's narrative is co-drafted, not reproduced. Use the seams.
+- **Hybrid with retrofit.** If the reproduction inherits code or fibers from a prior iteration, those carry rationale that didn't make it into the paper — harvest from artifacts as in retrofit mode for those sections.
diff --git a/claude/lightcone/skills/paper-extraction/SKILL.md b/claude/lightcone/skills/paper-extraction/SKILL.md
new file mode 100644
index 00000000..dd4dce32
--- /dev/null
+++ b/claude/lightcone/skills/paper-extraction/SKILL.md
@@ -0,0 +1,242 @@
+---
+name: paper-extraction
+description: >
+  Turn an arXiv ID or DOI into a standardized `work/reference/` directory:
+  paper substrate (arXiv LaTeX source primary, PDF + Docling fallback),
+  copied figure files, per-table `.tex` files, section outline with line
+  numbers, deduplicated citation keys with every location they appear plus
+  each cited paper's full citation text and resolved DOI, abstract,
+  embedded bibliography (when present in source), and a valid
+  `astra.yaml` representing the paper as an ASTRA artifact (with the
+  paper's claimed numerical findings as ASTRA `findings:`). Emits a
+  top-level `index.json` for the structural surface plus the `astra.yaml`
+  for the semantic surface. Triggers on: "read paper", "prep paper",
+  "ingest paper", "extract paper", "set up paper", "fetch arxiv", "arxiv
+  id", "DOI", "find paper", or `/paper-extraction <id>`.
+---
+
+# paper-extraction
+
+Turn a DOI or arXiv ID into a standardized, indexed `work/reference/` directory. One entry-point, idempotent, self-contained.
+
+The output is a predictable surface anyone can rely on without re-parsing LaTeX. What a consumer does with that surface is their concern — paper-extraction's job ends at the index.
+
+## When to use
+
+- "Read [paper] end-to-end" / "I want to verify a claim in [paper]" — full source plus structured artifacts so you're reading the actual paper, not a flattened PDF
+- "Set up reading materials for [paper]" — when the next thing you'll do involves browsing figures, citations, or section structure and you don't want to grep the tarball every time
+- Any workflow where another skill or process needs a known directory shape per paper
+
+## Outputs
+
+Under `work/reference/` (idempotent — skips work already done):
+
+```
+work/reference/
+├── index.json                # structural index — figures, tables, outline, citations (with DOIs), paths
+├── astra.yaml                # ASTRA-shape representation: the paper as an ASTRA artifact, including findings
+├── paper.pdf                 # always
+├── paper.tex                 # Path A — symlink to the main .tex file
+│   (or)
+├── document.md               # Path B — Docling-extracted markdown
+├── source/                   # Path A — extracted arXiv tarball (full source tree)
+├── figures/                  # figure files (copied from LaTeX or rendered by Docling)
+├── tables/                   # one .tex file per `\begin{table}` block (Path A)
+├── bibliography-source.bib   # Path A only — copy of any .bib found in source/
+├── bibliography-source.bbl   # Path A only — copy of any .bbl found in source/
+└── .doi-cache.json           # Crossref/ADS lookup cache for re-run idempotency
+```
+
+The skill produces only the paper's own reading materials. Anything not contained in or derived from the paper itself — code repositories, supplementary datasets, related papers — is out of scope; the caller handles those.
+
+### Two surfaces: `index.json` (structural) and `astra.yaml` (semantic)
+
+**`index.json` is structural and machine-friendly.** Everything the script could mechanically extract: figures, tables, section outline with line numbers, citation keys with every location *plus the cited paper's full citation text and resolved DOI*, abstract, paths. Read this when you want to know "what's in this paper, where do I find it." Sample shape:
+
+```json
+{
+  "schema_version": 1,
+  "path": "A",                                  // or "B"
+  "paper_pdf": "paper.pdf",
+  "paper_tex": "paper.tex",                     // null on Path B
+  "source_dir": "source",                       // null on Path B
+  "document_md": null,                          // "document.md" on Path B
+  "bibliography_source_bib": "bibliography-source.bib",
+  "bibliography_source_bbl": null,
+  "astra_yaml": "astra.yaml",
+  "title": "UNIONS-3500 Weak Lensing: B-mode validation",
+  "abstract": "At Stage-III sensitivities, cosmic shear B modes ...",
+  "figures": [
+    {"id": "fig1", "label": "fig:bao", "caption": "...", "source_path": "fig_bao",
+     "file": "figures/fig_bao.pdf", "block_origin": "main.tex", "line": 412}
+  ],
+  "tables": [
+    {"id": "tab1", "label": "tab:cosmo", "caption": "...", "file": "tables/tab-cosmo.tex",
+     "block_origin": "main.tex", "line": 487}
+  ],
+  "outline": [
+    {"level": 1, "title": "Introduction", "label": "sec:intro", "source_file": "main.tex", "line": 157}
+  ],
+  "citations": {
+    "asgari17": {
+      "locations": [{"file": "main.tex", "line": 178}, {"file": "main.tex", "line": 561}],
+      "citation": "Asgari, M., et al. (2017) KiDS-450: Tomographic cross-correlation cosmic shear results. MNRAS 464, 1676-1692",
+      "doi": "10.1093/mnras/stw2606"
+    },
+    "planck18_lensing": {
+      "locations": [{"file": "main.tex", "line": 92}],
+      "citation": "Planck Collaboration, et al. (2020) Planck 2018 results. VIII. Gravitational lensing. A&A 641, A8",
+      "doi": "10.1051/0004-6361/201833886"
+    }
+  },
+  "extraction_warnings": [
+    "figure fig3: \\includegraphics{...} could not resolve to a file in source/",
+    "citation kuijken:2011: could not resolve DOI; tried doi-field, eprint-field, Crossref, ADS"
+  ]
+}
+```
+
+The `citations:` block maps each cited paper's BibTeX key (Path A) or synthetic `<lastname>_<year>` key (Path B) to `{locations, citation, doi}`. Downstream consumers (e.g. lc-from-paper's SPECIFY when authoring `prior_insights:` placeholders, LITERATURE when discovering which DOIs to fetch) read the DOI directly from `citations[key].doi`. Unresolvable entries keep `citation: null` and/or `doi: null` and are flagged in `extraction_warnings`.
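+
+A consumer-side sketch of that read, in Python. It assumes the `index.json` on disk is plain JSON (without the `//` annotations used in the sample above); `load_index` and `split_citations` are hypothetical helper names, not part of the skill:
+
+```python
+import json
+from pathlib import Path
+
+
+def load_index(reference_dir: str = "work/reference") -> dict:
+    """Load index.json from the reference directory (path is the documented default)."""
+    return json.loads(Path(reference_dir, "index.json").read_text(encoding="utf-8"))
+
+
+def split_citations(index: dict) -> tuple[dict[str, str], list[str]]:
+    """Return ({key: doi} for resolved entries, [keys whose doi is null])."""
+    resolved: dict[str, str] = {}
+    unresolved: list[str] = []
+    for key, entry in index.get("citations", {}).items():
+        doi = entry.get("doi")
+        if doi:
+            resolved[key] = doi
+        else:
+            # Unresolvable entries keep doi: null and are also listed
+            # in extraction_warnings.
+            unresolved.append(key)
+    return resolved, unresolved
+```
+
+Keeping the unresolved keys in a separate list mirrors the `doi: null` convention: a caller can hand them to a human or skip them.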
+
+**`astra.yaml` is semantic and ASTRA-validating.** Treats the paper as an ASTRA artifact: `id`, `version`, `name`, `narrative.summary`, and `findings:` carrying the paper's claimed numerical results in ASTRA's Insight + Evidence shape. Read this when you want to know "what does this paper claim, with quote evidence anchored to the source." The script writes a stub (id, version, name, narrative.summary from abstract, empty findings); Step 5 fills in `findings:`.
+
+Why both: the structural index is queryable by any consumer (`grep`, `jq`, agent code) without needing to know about ASTRA. The ASTRA file composes directly into reproductions, MySTRA, and any other ASTRA-aware tool — and the verbosity of the Insight + Evidence shape *is* the back-pressure against hallucinated numerical claims (the agent has to find and quote the actual text).
+
+## Workflow
+
+### Step 1 — Survey
+
+Always start with `ls work/reference/` and read `index.json` if present. Skip the work that's already done:
+
+| File present | Step to skip |
+|---|---|
+| `source/` (Path A) or `document.md` (Path B) + `paper.pdf` | Substrate acquired (Step 2) |
+| `index.json` with non-empty figures/tables/outline | Structural extraction done (Step 3) |
+| `astra.yaml` exists | Stub written; never overwritten on re-run (preserves agent edits) |
+| `astra.yaml` has non-empty `findings:` and `narrative.findings:` populated | Findings step done (Step 5, optional) |
+
+If nothing is present, run the full workflow.
+
+### Step 2 — Acquire substrate
+
+Pick the path on entry from the input form:
+
+- **arXiv ID** (e.g. `2503.19441`) → **Path A** (LaTeX source primary)
+- **DOI** for an arXiv paper (e.g. `10.48550/arXiv.2503.19441`) → Path A (resolve to arXiv ID first)
+- **Journal DOI** without arXiv preprint → **Path B** (PDF + Docling fallback)
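+
+The routing above might be sketched as follows. This is an illustration, not the skill's own code: `classify_input` and the regular expressions are assumptions, and the sketch does not check whether a journal DOI has an arXiv preprint (that needs a lookup), so it sends every non-arXiv DOI to Path B:
+
+```python
+import re
+
+# Assumed patterns for arXiv identifiers, with an optional version suffix.
+NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")               # e.g. 2503.19441
+OLD_STYLE = re.compile(r"^[a-z-]+(\.[A-Z]{2})?/\d{7}(v\d+)?$")   # e.g. astro-ph/0607021
+ARXIV_DOI_PREFIX = "10.48550/arxiv."
+
+
+def classify_input(identifier: str) -> tuple[str, str]:
+    """Return (path, normalized_id), where path is "A" or "B"."""
+    ident = identifier.strip()
+    if ident.lower().startswith(ARXIV_DOI_PREFIX):
+        # arXiv DOI: resolve to the bare arXiv ID first, then Path A.
+        return "A", ident[len(ARXIV_DOI_PREFIX):]
+    if NEW_STYLE.match(ident) or OLD_STYLE.match(ident):
+        return "A", ident
+    if ident.startswith("10."):
+        # Any other DOI is treated as a journal DOI (Path B) in this sketch.
+        return "B", ident
+    raise ValueError(f"not an arXiv ID or DOI: {identifier!r}")
+```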
+
+Read [`references/arxiv-source.md`](references/arxiv-source.md) for Path A; [`references/pdf-fallback.md`](references/pdf-fallback.md) for Path B. Both end with `work/reference/paper.pdf` and a structured-text representation under `work/reference/`.
+
+### Step 3 — Run the extraction script
+
+`scripts/extract-paper-substrate.py` does the deterministic structural pass and writes the `astra.yaml` stub:
+
+```bash
+python3 .claude/skills/paper-extraction/scripts/extract-paper-substrate.py \
+  --arxiv-id <arxiv-id>   # or --doi <doi>
+```
+
+The script detects the path automatically and produces:
+
+- `figures/` populated with copied figure files (Path A) or untouched (Path B — Docling already populated it)
+- `tables/<label-slug>.tex` — one file per `\begin{table}` block (Path A only)
+- `bibliography-source.{bib,bbl}` if present in the source tarball (Path A only)
+- `index.json` — the unified structural index, including the enriched `citations:` block (each cited key carries `{locations, citation, doi}`; DOI resolution covers ~96% of typical-paper bibliographies)
+- `astra.yaml` — stub ASTRA representation: id, version, name (from `\title{}`), narrative.summary (from abstract), empty `findings: {}` for Step 5
+- `.doi-cache.json` — Crossref/ADS lookup cache; re-runs skip the network for already-seen entries
+
+The `--arxiv-id` / `--doi` argument populates the `id` and the evidence `doi:` field in `astra.yaml`. If neither is provided, the script writes placeholder text the agent can fix.
+
+The DOI resolver tries, in order: the entry's `doi:` field → an `eprint:`-derived arXiv DOI → Crossref bibliographic query (free, no API key needed) → ADS title search (only if `ADS_API_TOKEN` env var or `~/.ads/dev_key` is present — graceful skip when absent). Title hits from Crossref are gated by a similarity check against the queried title to drop noisy false matches.
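+
+A minimal sketch of that fallback chain, under stated assumptions. The lookup callables are hypothetical stand-ins for the Crossref and ADS queries (each takes a title and returns `(doi, matched_title)` or `None`), the 0.85 threshold is illustrative, and `difflib` stands in for whatever similarity measure the script actually uses. The text above only describes the gate for Crossref, so ADS hits are returned as-is here:
+
+```python
+from difflib import SequenceMatcher
+from typing import Callable, Optional
+
+Lookup = Callable[[str], Optional[tuple[str, str]]]
+
+
+def title_similarity(a: str, b: str) -> float:
+    """Case- and whitespace-insensitive similarity ratio in [0, 1]."""
+    def norm(s: str) -> str:
+        return " ".join(s.lower().split())
+    return SequenceMatcher(None, norm(a), norm(b)).ratio()
+
+
+def resolve_doi(entry: dict, crossref_lookup: Lookup,
+                ads_lookup: Optional[Lookup] = None,
+                min_similarity: float = 0.85) -> Optional[str]:
+    """Try each source in the documented order; return None if all fail."""
+    if entry.get("doi"):                       # 1. explicit doi field
+        return entry["doi"]
+    if entry.get("eprint"):                    # 2. eprint-derived arXiv DOI
+        return f"10.48550/arXiv.{entry['eprint']}"
+    title = entry.get("title")
+    if not title:
+        return None
+    hit = crossref_lookup(title)               # 3. Crossref, similarity-gated
+    if hit and title_similarity(title, hit[1]) >= min_similarity:
+        return hit[0]
+    if ads_lookup is not None:                 # 4. ADS, only when configured
+        hit = ads_lookup(title)
+        if hit:
+            return hit[0]
+    return None
+```
+
+Passing the lookups in as arguments keeps the chain testable without network access; a `None` result corresponds to the `doi: null` plus warning case.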
+
+### Step 4 — Review the script's output and fix structural gaps
+
+The script is purely deterministic. It walks the structural surface but does not understand the paper. Read `index.json`'s `extraction_warnings` and address each:
+
+- **`figure figN: \includegraphics{X} could not resolve`** — the LaTeX referenced a file the script couldn't find. Search the source tree manually (sometimes figures live in non-standard subdirectories with non-standard extensions); copy the file into `figures/` and update the corresponding `index.json` entry's `file` so it's no longer null.
+- **`figure figN: no \caption found`** — composite figures (subfloats) sometimes lack a top-level caption; verify the figure block in source and either record the per-subfigure captions in `caption` or note that the figure is composite.
+- **`table tabN: no \label`** — verify the table is intentional (some `\begin{table}` blocks are non-tabular layout); rename or annotate as needed.
+- **`citation <key>: could not resolve DOI`** — the entry has no `doi:` / `eprint:` field, and neither Crossref nor ADS (when available) returned a match. The entry stays in `citations:` with `doi: null`; a downstream consumer can flag it for human resolution or skip it. If many entries are unresolved, check that the title field is clean (sometimes `.bib` titles carry uncleaned LaTeX commands that drag down the Crossref similarity gate). Delete `.doi-cache.json` to force re-resolution.
+- **`citation <key>: cited in source but no matching entry in bibliography-source.{bib,bbl}`** — a `\cite{<key>}` invocation has no corresponding bib record. Usually a typo in the LaTeX source; flag it and move on. The entry stays in `citations:` with `citation: null, doi: null`, locations preserved.
+- **Path B caveat** — outline extraction is not yet implemented for the Docling fallback. Bibliography resolution works on Path B by parsing the references section at the tail of `document.md` and synthesizing keys (`<lastname>_<year>`), but citation *invocations* from rendered prose aren't yet extracted — Path B citations carry empty `locations: []`. The warnings list flags this.
+
+Also eyeball `astra.yaml`'s `name:` and `narrative.summary:`. The title or abstract may still contain unresolved macros: the script expands only simple, argument-free `\newcommand` definitions, so macros that take arguments pass through verbatim. Clean them up if you need pretty rendering downstream; none of this blocks validation.
+
+### Step 5 — *(Optional)* Walk the paper for findings, append to `astra.yaml`
+
+**Skip unless a downstream consumer needs `findings:` populated.** Steps 1–4 produce a complete `work/reference/` and a valid (empty-findings) `astra.yaml` on their own. Reproductions and diff workflows need findings; reading and browsing don't.
+
+When you do run Step 5: for each **central numerical claim the paper makes about its results** — headline measurements, structural conclusions ("we detect X at Y σ"), validated null-test outcomes — append a finding to `astra.yaml`'s `findings:` map. *Not* methodology choices or dataset descriptions; those live elsewhere. Shape (per ASTRA's [Insight + Evidence](https://w3id.org/ASTRA/insight) classes):
+
+```yaml
+findings:
+  s8_constraint:
+    id: s8_constraint
+    claim: "S_8 = sigma_8 (Omega_m / 0.3)^0.5 = 0.795 ± 0.014 from the fiducial pure E/B analysis"
+    created_at: "2026-04-04T00:00:00Z"
+    evidence:
+      - id: abstract_quote
+        doi: "10.48550/arXiv.2604.03227"
+        version: 1
+        quote:
+          exact: "we find $S_8 = 0.795 \\pm 0.014$"
+```
+
+When `findings:` is non-empty, `narrative.findings:` must reference at least one finding — e.g. `narrative: { findings: "The fiducial analysis yields the [S_8 constraint](#findings.s8_constraint)." }`.
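+
+A rough local pre-check of that rule, assuming anchors always take the `#findings.<id>` form shown above. `check_findings_narrative` is a hypothetical helper; `astra validate` remains the authoritative gate:
+
+```python
+import re
+
+# Markdown link target of the form ](#findings.<id>)
+_ANCHOR = re.compile(r"\]\(#findings\.([A-Za-z0-9_]+)\)")
+
+
+def check_findings_narrative(findings: dict, narrative_findings: str) -> list[str]:
+    """Return a list of problems; an empty list means the narrative looks consistent."""
+    problems: list[str] = []
+    cited = set(_ANCHOR.findall(narrative_findings or ""))
+    if findings and not cited:
+        problems.append("findings is non-empty but narrative.findings cites none")
+    for finding_id in sorted(cited - set(findings)):
+        problems.append(
+            f"anchor #findings.{finding_id} does not resolve to a declared finding"
+        )
+    return problems
+```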
+
+See `examples/unions-bmodes-astra.yaml` for a fully populated `astra.yaml` (six findings, narrative, evidence anchored to the published version).
+
+**Discipline:**
+
+- **Read the abstract and conclusions first.** Most central findings can be quoted from one of those two surfaces.
+- **`quote.exact` is verbatim.** Copy LaTeX as it appears in `paper.tex` — don't paraphrase, don't expand macros, don't normalize math. `astra validate --verify-evidence` searches for this string in the cached PDF; paraphrasing breaks the gate. If the quote isn't unique, add `prefix:` / `suffix:` (~20–100 chars) per W3C TextQuoteSelector.
+- **Every evidence carries `doi:`** (the paper's own DOI, e.g. `10.48550/arXiv.2604.03227`) and `version:` (the arXiv version: `1` for v1, `2` for v2).
+- **Validate.** `astra validate work/reference/astra.yaml` confirms shape; `--verify-evidence` confirms each `quote.exact` is actually findable in the cached PDF.
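+
+The uniqueness rule for `quote.exact` can be sketched like this. It is an illustration only: `text_quote_selector` is a hypothetical helper, the 40-character context is one choice within the ~20–100 range above, and when the quote repeats the sketch anchors the first occurrence, which may not be the one you mean:
+
+```python
+def text_quote_selector(text: str, exact: str, context: int = 40) -> dict:
+    """Build a TextQuoteSelector-style dict; add prefix/suffix only if `exact` repeats."""
+    first = text.find(exact)
+    if first == -1:
+        raise ValueError("quote.exact not found verbatim in the text")
+    selector = {"exact": exact}
+    if text.find(exact, first + 1) != -1:
+        # The quote is ambiguous: disambiguate with the surrounding characters.
+        end = first + len(exact)
+        selector["prefix"] = text[max(0, first - context):first]
+        selector["suffix"] = text[end:end + context]
+    return selector
+```
+
+A `ValueError` here is the same failure the verification gate would report later: the quote was paraphrased or normalized rather than copied.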
+
+
+## Inputs
+
+The skill accepts:
+
+1. An **arXiv ID** (`YYMM.NNNNN`, the four-digit `YYMM.NNNN` used from 2007 through 2014, or a pre-2007 form like `astro-ph/0607021`)
+2. A **DOI** — either an arXiv DOI (`10.48550/arXiv.<id>`) or a journal DOI
+
+The slash-command form is `/paper-extraction <arxiv-id-or-doi>`.
+
+## What the script does vs what the agent does
+
+**Script (`extract-paper-substrate.py`):** walks LaTeX (Path A) or Docling output (Path B) and emits two things:
+
+1. `index.json` — figures (with copied files + line numbers + multi-graphic panels), tables (one `.tex` per block, including AAS `deluxetable`), section outline (with line numbers, in paper-reading order), citation keys (with every file+line they appear on, including biblatex commands, *plus the cited paper's full citation text and resolved DOI*), abstract, title, paths.
+2. `astra.yaml` — a stub ASTRA artifact: `id` (derived from arxiv-id/DOI), `version`, `name` (from `\title{}`), `narrative.summary` (from abstract), empty `inputs:`/`outputs:`/`findings:`. Validates as-is.
+
+The script handles a few realities of LaTeX papers automatically:
+
+- **Comments are stripped** before regex passes, so commented-out `\includegraphics` / `\cite` / `\section` don't leak into extraction. Newlines are preserved so line numbers stay accurate.
+- **Multi-file source** (`\input{}` / `\include{}` chains) is read in **paper-reading order** by walking `main.tex`'s input tree, not alphabetical filename order.
+- **Simple `\newcommand{\name}{body}` macros** are expanded in extracted titles, abstracts, captions, and section names. Macros with arguments (`\newcommand{\foo}[1]{...}`) pass through unexpanded — handling those would require evaluating arbitrary LaTeX.
+- **Standard table envs** (`table`, `table*`, `deluxetable`, `deluxetable*`) and **standard citation commands** (natbib family + biblatex `\autocite` / `\textcite` / `\parencite` / `\footcite` / `\smartcite`) are all recognized.
+- **Bibliography parsed in-script.** `.bib` files (preferred — `@type{key, field = value}` entries with brace-protected lastnames recognized) and `.bbl` files (rendered `\bibitem{key}` blocks) are parsed for Path A; the references section at the tail of `document.md` is parsed for Path B (synthesizing `<lastname>_<year>` keys with letter-suffix disambiguation). DOIs are resolved against Crossref + (optionally) ADS, cached for idempotency, and joined back against `\cite{}`-extracted locations.
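+
+To make the first bullet concrete, here is a minimal sketch of comment stripping that keeps line numbers stable. It is not the script's actual implementation: the function names and regular expressions are assumptions, and it ignores `verbatim` environments, `\verb`, and URLs containing `%`:
+
+```python
+import re
+
+# A % starts a comment unless it is escaped. The capture group keeps any
+# pairs of backslashes (e.g. a \\ line break) that sit right before it.
+_COMMENT = re.compile(r"(?<!\\)((?:\\\\)*)%.*")
+_GRAPHICS = re.compile(r"\\includegraphics(?:\[[^\]]*\])?\{([^}]*)\}")
+
+
+def strip_comments(tex: str) -> str:
+    """Remove comments line by line; the number of lines never changes."""
+    return "\n".join(_COMMENT.sub(r"\1", line) for line in tex.split("\n"))
+
+
+def find_graphics(tex: str) -> list[tuple[int, str]]:
+    """Return (1-based line number, path) for each live \\includegraphics."""
+    hits = []
+    for lineno, line in enumerate(strip_comments(tex).split("\n"), start=1):
+        for match in _GRAPHICS.finditer(line):
+            hits.append((lineno, match.group(1)))
+    return hits
+```
+
+Stripping within each line, rather than deleting commented lines, is what keeps the `line` values reported in `index.json` aligned with the original source file.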
+
+What the script does *not* do: understand what figures show, identify findings, infer methodology, or handle substrate acquisition (Step 2). It also doesn't expand macros with arguments, resolve `\graphicspath{}` overrides, parse non-LaTeX abstract metadata blocks, or extract citation invocations from rendered prose (Path B `locations:` arrays are empty as a result).
+
+**Agent (Steps 4 + 5):** reads `index.json`'s `extraction_warnings` and fixes structural gaps (Step 4), then walks the paper and writes `findings:` into `astra.yaml` with quote-anchored evidence (Step 5). The verbosity of the Insight + Evidence shape *is* the back-pressure: the agent has to find and quote actual paper text, not invent.
+
+## Discipline
+
+- **One entry-point.** `/paper-extraction <id>` is the whole surface. Don't have callers reach into `scripts/` or `references/` directly. The skill orchestrates; consumers trust `index.json`.
+- **Self-contained.** This skill takes a DOI or arXiv ID and produces a standardized directory. It doesn't know who calls it or what they do with the result. Don't add caller-specific logic.
+- **Idempotent.** Survey-first, skip-if-done. Re-invoking on the same paper does no work and produces no errors. DOI lookups cache to `.doi-cache.json`; re-runs don't re-hit the network for already-seen entries.
+- **arXiv-LaTeX is primary.** When an arXiv source tarball is acquirable, Path A wins. PDF + Docling is the fallback for non-arXiv only.
+- **Reading materials only.** The skill produces what's structurally in the paper itself — substrate, figures, tables, outline, citations (with resolved DOIs), embedded bibliography. Adjacent assets (code repos, supplementary datasets, related papers, project bibliography *management* — i.e. authoring new entries, curating across papers) are explicitly out of scope; *resolving* the bibliography that's already in the paper is in scope.
+- **Script is dumb on purpose.** The deterministic pieces (figure/table blocks, section headings, `\cite{}` keys, bibliography entries, DOI lookups) belong to the script. Anything that requires understanding what the paper is *about* lives outside this skill — paper-extraction sets the table; it doesn't read the meal.
+- **`extraction_warnings` is the agent surface.** When the script can't resolve something (unmatched citation key, unresolvable DOI, network failure), it doesn't fail or guess — it warns. The agent reads the warnings and decides whether to fix or surface.
+
+## Anti-patterns
+
+- **Re-fetching what's already there.** Always survey `work/reference/` and read `index.json` first.
+- **Adding numerical-finding extraction to the script.** Macro-based extraction (`\newcommand{\Omegam}{0.315}`) catches almost no real papers; inline-value extraction needs semantic judgment about what's a *result* vs incidental. Findings live in `astra.yaml`, written by the agent in Step 5.
+- **Paraphrasing the `quote.exact` text.** Copy the paper's LaTeX text verbatim. Paraphrasing breaks `astra validate --verify-evidence` and weakens the back-pressure that justified ASTRA shape in the first place.
+- **Producing a parallel cited-papers artifact.** Bibliography resolution lives inside `index.json`'s `citations:` block, not in a side file. Anyone who needs the citation→DOI mapping reads `index.json#citations[key].doi` directly.
+- **Surfacing partial state silently.** If `paper.pdf` was fetched but the LaTeX-source download failed, write `work/reference/extraction-error.txt` with a clear cause and stop, rather than producing a half-populated `work/reference/` with no signal that more was intended.
+- **Knowing about the caller.** The skill's contract is the directory + index. If you're tempted to write logic that depends on a particular invoker, push that logic into the invoker instead.
diff --git a/claude/lightcone/skills/paper-extraction/examples/unions-bmodes-astra.yaml b/claude/lightcone/skills/paper-extraction/examples/unions-bmodes-astra.yaml
new file mode 100644
index 00000000..8fa35ceb
--- /dev/null
+++ b/claude/lightcone/skills/paper-extraction/examples/unions-bmodes-astra.yaml
@@ -0,0 +1,106 @@
+# Worked Step 5 example for paper-extraction.
+#
+# Generated from arXiv:2604.03227, then filled with 6 quote-anchored
+# findings. Verified with:
+#
+#   astra paper add 10.48550/arXiv.2604.03227 --version 1 --pdf paper.pdf
+#   astra validate astra.yaml --verify-evidence
+
+id: arxiv_2604_03227
+version: "0.0.7"
+name: "UNIONS-3500 Weak Lensing: II. B-mode validation for cosmic shear"
+
+narrative:
+  summary: |
+    At Stage-III sensitivities, cosmic shear $B$ modes unambiguously indicate systematic contamination and are often used to inform data selection and scale cuts for cosmological inference.
+    We validate $B$ modes for the Ultraviolet Near-Infrared Optical Northern Survey (UNIONS)-3500 (\SI{2894}{\square\deg}, $n_\mathrm{eff} \approx \SI{5.0}{arcmin\tothe{-2}}$) using three $E$/$B$-separable statistics: pure-mode correlation functions $\xi_\pm^{\mathrm{B}}(\theta)$, Complete Orthogonal Sets of $E$/$B$-mode Integrals (COSEBI) $B$-mode amplitudes $B_n$, and harmonic-space power spectra $C_\ell^{BB}$.
+    For each statistic, we compute probability-to-exceed (PTE) values over a two-dimensional grid of scale-cut boundaries; our adopted cuts lie in broad stable regions of acceptable PTE.
+    $B$-mode detections and PTE failures on initial catalog versions led us to investigate galaxy size cuts and stellar halo masking.
+    After cuts, all three statistics pass the null test (minimum PTE $= \num{0.18}$).
+    Before scale cuts, we measure an oscillatory COSEBI $B$-mode pattern consistent with repeating additive shear bias, a detector-level effect seen across multiple Stage-III surveys including CFHTLenS, which used the same MegaCam camera; scale cuts that exclude the charge-coupled device (CCD) angular scale suppress it.
+    Although these statistics probe the same two-point shear field, scale cuts in one do not map exactly onto cuts in another, because their respective filter functions weight angular scales differently.
+    The most conservative validation therefore requires scale and sample selections that pass null tests across all frameworks simultaneously, an approach that applies directly to Stage-IV surveys where systematic errors dominate.
+  findings: |
+    The paper validates the UNIONS-3500 weak-lensing B-mode surface across three estimators: the [adopted cuts pass all three null tests](#findings.adopted_cuts_pass_all_statistics), [the conclusion restates consistency with zero at those cuts](#findings.bmodes_consistent_with_zero_at_adopted_cuts), and [the cuts remain acceptable under a two-parameter scale-cut accounting](#findings.scale_cut_degrees_of_freedom_still_pass). The central systematic is an [oscillatory full-range COSEBI pattern consistent with repeating additive shear bias](#findings.full_range_cosebi_repeating_additive_pattern). The paper also shows that [only the fiducial catalog passes in every representation](#findings.fiducial_catalog_only_full_pass) and that [COSEBI versus harmonic-space disagreement is driven by filter-function sensitivity](#findings.filter_functions_drive_representation_disagreement).
+
+inputs: []
+outputs: []
+
+findings:
+  adopted_cuts_pass_all_statistics:
+    id: adopted_cuts_pass_all_statistics
+    label: "Adopted cuts pass all three B-mode null tests"
+    claim: |
+      After galaxy-size cuts and stellar-halo masking choices, the adopted UNIONS-3500 scale cuts pass pure-mode correlation-function, COSEBI, and harmonic-space B-mode null tests, with a minimum PTE of 0.18.
+    created_at: "2026-05-08T03:32:00+02:00"
+    evidence:
+      - id: abstract_minimum_pte
+        doi: "10.48550/arXiv.2604.03227"
+        version: 1
+        quote:
+          exact: "After cuts, all three statistics pass the null test (minimum PTE = 0.18)."
+
+  bmodes_consistent_with_zero_at_adopted_cuts:
+    id: bmodes_consistent_with_zero_at_adopted_cuts
+    label: "B modes are consistent with zero at the adopted scale cuts"
+    claim: |
+      In the paper's conclusion, the adopted scale cuts leave B modes consistent with zero across pure-mode correlation functions, COSEBIs, and harmonic-space power spectra.
+    created_at: "2026-05-08T03:32:00+02:00"
+    evidence:
+      - id: conclusion_consistency
+        doi: "10.48550/arXiv.2604.03227"
+        version: 1
+        quote:
+          exact: "UNIONS-3500 weak-lensing B modes are consistent with zero at the adopted scale cuts across pure-mode correlation functions, COSEBIs, and harmonic-space power spectra."
+
+  scale_cut_degrees_of_freedom_still_pass:
+    id: scale_cut_degrees_of_freedom_still_pass
+    label: "Scale-cut degree-of-freedom accounting still passes"
+    claim: |
+      Treating the selected scale-cut boundaries as two fitted parameters lowers the minimum PTE from 0.18 to 0.09, which remains above the 0.05 failure threshold.
+    created_at: "2026-05-08T03:32:00+02:00"
+    evidence:
+      - id: pte_two_parameter_accounting
+        doi: "10.48550/arXiv.2604.03227"
+        version: 1
+        quote:
+          exact: "Doing so lowers the minimum PTE across all statistics from 0.18 to 0.09, still above the 0.05 threshold."
+
+  full_range_cosebi_repeating_additive_pattern:
+    id: full_range_cosebi_repeating_additive_pattern
+    label: "Full-range COSEBIs show repeating-additive pattern"
+    claim: |
+      Before the adopted cuts, all catalog versions show an oscillatory full-range COSEBI B-mode pattern consistent with repeating additive shear bias at CCD angular scales.
+    created_at: "2026-05-08T03:32:00+02:00"
+    evidence:
+      - id: conclusion_full_range_pattern
+        doi: "10.48550/arXiv.2604.03227"
+        version: 1
+        quote:
+          exact: "On the full angular range, all catalog versions show an oscillatory COSEBI B-mode pattern consistent with repeating additive shear bias at CCD angular scales"
+
+  fiducial_catalog_only_full_pass:
+    id: fiducial_catalog_only_full_pass
+    label: "Only the fiducial catalog passes every representation"
+    claim: |
+      Of the four catalog variants tested, only the fiducial size-cut catalog passes the full set of B-mode validation representations at the adopted cuts.
+    created_at: "2026-05-08T03:32:00+02:00"
+    evidence:
+      - id: conclusion_fiducial_only
+        doi: "10.48550/arXiv.2604.03227"
+        version: 1
+        quote:
+          exact: "Of the four catalog variants tested, only the fiducial passes in every representation."
+
+  filter_functions_drive_representation_disagreement:
+    id: filter_functions_drive_representation_disagreement
+    label: "Filter functions explain representation disagreement"
+    claim: |
+      The paper argues that COSEBI versus harmonic-space disagreement is not a real-space versus harmonic-space basis effect; COSEBI filter functions concentrate sensitivity on contaminated angular scales.
+    created_at: "2026-05-08T03:32:00+02:00"
+    evidence:
+      - id: discussion_harmonic_cosebi_comparison
+        doi: "10.48550/arXiv.2604.03227"
+        version: 1
+        quote:
+          exact: "By computing COSEBIs from the harmonic-space bandpowers, we confirm that the disagreement is not a matter of harmonic versus real space"
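Reviewer note: the fixture above links `narrative.summary` to findings through `#findings.<id>` anchors, and every finding repeats its map key in its `id` field. The sketch below shows one way to check that the two stay in sync. It assumes anchors always take the form `(#findings.<id>)`; `check_anchors` is a hypothetical helper for illustration and is not part of `astra validate`.

```python
import re

# Assumption: summary links look like [text](#findings.<id>).
ANCHOR = re.compile(r"\(#findings\.([a-z0-9_]+)\)")


def check_anchors(summary: str, findings: dict[str, dict]) -> list[str]:
    """Return human-readable problems; an empty list means consistent."""
    problems: list[str] = []
    linked = set(ANCHOR.findall(summary))
    declared = set(findings)
    # Anchors that point at findings which do not exist.
    for fid in sorted(linked - declared):
        problems.append(f"summary links to unknown finding: {fid}")
    # Findings that the summary never references.
    for fid in sorted(declared - linked):
        problems.append(f"finding never linked from summary: {fid}")
    # Each entry's `id` field should repeat its map key.
    for key, entry in findings.items():
        if entry.get("id") != key:
            problems.append(f"finding {key}: id field is {entry.get('id')!r}")
    return problems
```

Running it over the parsed YAML (with whichever YAML loader the project already uses) should return an empty list for the fixture as written.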
diff --git a/claude/lightcone/skills/paper-extraction/references/arxiv-source.md b/claude/lightcone/skills/paper-extraction/references/arxiv-source.md
new file mode 100644
index 00000000..0d4515a0
--- /dev/null
+++ b/claude/lightcone/skills/paper-extraction/references/arxiv-source.md
@@ -0,0 +1,47 @@
+# Path A — arXiv LaTeX source (primary)
+
+When the paper has an arXiv ID, the LaTeX source tarball is the substrate. Math, ligatures, captions, tables, and bibliography all come through clean — none of the rendering artifacts that plague PDF extraction.
+
+## Acquire the source tarball
+
+```bash
+ARXIV_ID="2503.19441"  # adapt
+curl -L -o /tmp/${ARXIV_ID}.tar.gz "https://arxiv.org/src/${ARXIV_ID}"
+mkdir -p work/reference/source
+tar -xzf /tmp/${ARXIV_ID}.tar.gz -C work/reference/source
+```
+
+Identify the main `.tex` file (the one with `\documentclass`):
+
+```bash
+grep -l '\\documentclass' work/reference/source/*.tex | head -1
+```
+
+Symlink that file as `work/reference/paper.tex` so downstream consumers have a stable handle:
+
+```bash
+MAIN_TEX=$(grep -l '\\documentclass' work/reference/source/*.tex | head -1)
+ln -sf "source/$(basename "$MAIN_TEX")" work/reference/paper.tex
+```
+
+## Fetch the PDF
+
+```bash
+curl -L -o work/reference/paper.pdf "https://arxiv.org/pdf/${ARXIV_ID}"
+file work/reference/paper.pdf  # must say "PDF document"
+```
+
+## What downstream gets
+
+- `work/reference/source/` — the full extracted tarball (everything: `.tex`, `.bbl`, `.bib`, figure files, tables, supplementary `.tex` files).
+- `work/reference/paper.tex` — symlink to the main `.tex` file so consumers don't have to re-detect it.
+- `work/reference/paper.pdf` — cached PDF for evidence verification.
+
+No conversion to markdown is needed. Claude reads LaTeX directly; converting to markdown only loses information (math collapse, label resolution, caption flattening). Consumers of `work/reference/` read `.tex` and resolve `\ref{}` against `\label{}` in the source tree.
+
+## Notes
+
+- **arXiv DOI form is `10.48550/arXiv.<id>`.** Useful when downstream tools want a DOI rather than an arXiv ID.
+- **Equation numbers and section numbers must match the rendered paper.** When a downstream consumer cites "eq. N" or "§N", they should find the equation or section by content, not by counting TeX blocks. Reach for the cached PDF if you need to confirm a printed number.
+- **`\input{}` and `\include{}` chains** are common — the main `.tex` may pull section content from sibling files. Downstream consumers should grep across the whole `source/` tree, not just `paper.tex`, when searching for content.
+- **If the tarball download fails** (rare: typically a transient HTTP error or a paper still in moderation), retry once. If it still fails, the paper may need to come in as Path B (DOI-only). Write `work/reference/extraction-error.txt` with the cause and surface to the user.
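Reviewer note: the notes above ask consumers to search the whole `source/` tree and to resolve `\ref{}` against `\label{}`. A minimal sketch of that lookup follows. `build_label_index` and `unresolved_refs` are hypothetical names, comments are not stripped, and only `\ref`, `\eqref`, and `\autoref` with a single key are recognised; all of these are simplifying assumptions.

```python
import re
from pathlib import Path

LABEL = re.compile(r"\\label\{([^}]+)\}")
REF = re.compile(r"\\(?:ref|eqref|autoref)\{([^}]+)\}")


def build_label_index(source_dir: Path) -> dict[str, tuple[str, int]]:
    """Map each label to (relative file, 1-indexed line) of its first definition."""
    index: dict[str, tuple[str, int]] = {}
    for tex in sorted(source_dir.rglob("*.tex")):
        text = tex.read_text(errors="replace")
        rel = tex.relative_to(source_dir).as_posix()
        for m in LABEL.finditer(text):
            line = text.count("\n", 0, m.start()) + 1
            index.setdefault(m.group(1), (rel, line))
    return index


def unresolved_refs(source_dir: Path, index: dict[str, tuple[str, int]]) -> list[str]:
    """Return sorted reference keys that have no matching label in the tree."""
    missing: set[str] = set()
    for tex in sorted(source_dir.rglob("*.tex")):
        for key in REF.findall(tex.read_text(errors="replace")):
            if key not in index:
                missing.add(key)
    return sorted(missing)
```

A consumer would call `build_label_index(Path("work/reference/source"))` once and look labels up from the result, instead of grepping `paper.tex` alone.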
diff --git a/claude/lightcone/skills/paper-extraction/references/pdf-fallback.md b/claude/lightcone/skills/paper-extraction/references/pdf-fallback.md
new file mode 100644
index 00000000..b2b8207b
--- /dev/null
+++ b/claude/lightcone/skills/paper-extraction/references/pdf-fallback.md
@@ -0,0 +1,66 @@
+# Path B — PDF + Docling (fallback for non-arXiv)
+
+When the paper does not have an arXiv preprint, the PDF is the only substrate. Docling produces a structured representation (markdown + figures + tables + metadata) that downstream consumers read instead of the raw PDF.
+
+This path is a **fallback**. Whenever Path A is available, prefer it.
+
+## Acquire the PDF
+
+Resolve the DOI to a PDF. The straightforward path:
+
+```bash
+curl -L -o work/reference/paper.pdf "https://doi.org/<DOI>"
+file work/reference/paper.pdf
+```
+
+The `file` output must say "PDF document". If it says "HTML document" or anything else, the download was blocked (CAPTCHA, paywall, journal redirect):
+
+1. Search for an open-access copy: NASA ADS, arXiv, Unpaywall, Semantic Scholar, or the journal's open-access link.
+2. Download with `curl -L -o work/reference/paper.pdf <url>`.
+3. Re-check with `file work/reference/paper.pdf`.
+
+If a valid PDF cannot be obtained, write a clear error to `work/reference/extraction-error.txt` and stop. Do not try to extract structure from a non-PDF.
+
+## Run Docling
+
+```bash
+docling --output work/reference work/reference/paper.pdf
+```
+
+Docling produces, directly into `work/reference/`:
+
+- `document.md` — paper as markdown
+- `figures/` — extracted figures (one file per figure)
+- `tables/` — extracted tables (one file per table)
+- `metadata.json` — figure / table index with captions, page numbers, and labels (where Docling can extract them)
+
+The `metadata.json` shape Docling emits:
+
+```json
+{
+  "figures": [
+    {"id": "fig1", "caption": "...", "file": "figures/fig1.pdf", "label": "fig:bao"}
+  ],
+  "tables": [
+    {"id": "tab1", "caption": "...", "file": "tables/tab1.csv", "label": "tab:results"}
+  ]
+}
+```
+
+The `label` field is the source label where Docling can extract it; consumers reading `index.json` use it to anchor references back to the paper.
+
+If Docling fails, the PDF may be corrupt — re-download once, then surface to the user if it still fails.
+
+## What downstream gets
+
+- `work/reference/document.md` — paper as markdown.
+- `work/reference/figures/`, `work/reference/tables/` — already populated by Docling.
+- `work/reference/metadata.json` — Docling's own index; the extraction script reads this and folds figures + tables into the unified `work/reference/index.json`.
+- `work/reference/paper.pdf` — the PDF.
+
+No `paper.tex` and no `source/` on Path B. Consumers detect the path by reading `index.json`'s `path` field (`"A"` or `"B"`).
+
+## Notes
+
+- **Outline extraction and citation-invocation extraction don't run on Path B.** No LaTeX source means no `\section{}` or `\cite{}` markers to walk in the paper body. Bibliography resolution *does* run — the script parses the references section at the tail of `document.md`, synthesizes `<lastname>_<year>` keys (with letter-suffix disambiguation for collisions), and resolves DOIs the same way as Path A. So the `citations:` block is populated with citation text + DOI, but each entry's `locations:` array is empty (the paper-side `\cite`-style invocations weren't extracted from prose). `extraction_warnings` flags both gaps.
+- **Journal DOIs that 403 on Unpaywall** sometimes have an arXiv preprint twin. When that's available, treat the paper as Path A using the arXiv ID — the LaTeX-source surface is far cleaner than any PDF extraction.
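Reviewer note: the first bullet above describes synthesizing `<lastname>_<year>` keys with letter-suffix disambiguation. The sketch below illustrates one possible scheme. It is not the script's actual implementation: `synthesize_keys` is a hypothetical name, and the choices that the first occurrence keeps the bare key, that later collisions receive `a`, `b`, and so on, and that names are reduced to ASCII letters and digits are all assumptions.

```python
import re
import string


def synthesize_keys(refs: list[tuple[str, int]]) -> list[str]:
    """Build citation keys from (first-author last name, year) pairs.

    Keys are returned in bibliography order. Assumed scheme: the first
    occurrence of a key stays bare; repeats get suffixes a, b, c, ...
    Supports at most 26 repeats per base key.
    """
    seen: dict[str, int] = {}
    keys: list[str] = []
    for lastname, year in refs:
        # Lowercase and drop anything outside [a-z0-9] (spaces, hyphens, accents).
        base = re.sub(r"[^a-z0-9]+", "", lastname.lower()) + f"_{year}"
        count = seen.get(base, 0)
        seen[base] = count + 1
        suffix = "" if count == 0 else string.ascii_lowercase[count - 1]
        keys.append(base + suffix)
    return keys
```

Because every base key ends in the year's digits, a suffixed key cannot collide with another base key under this scheme.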
diff --git a/claude/lightcone/skills/paper-extraction/scripts/extract-paper-substrate.py b/claude/lightcone/skills/paper-extraction/scripts/extract-paper-substrate.py
new file mode 100755
index 00000000..ce2309ee
--- /dev/null
+++ b/claude/lightcone/skills/paper-extraction/scripts/extract-paper-substrate.py
@@ -0,0 +1,1475 @@
+#!/usr/bin/env python3
+"""
+extract-paper-substrate.py — deterministic structural extraction for the
+paper-extraction skill.
+
+Reads `work/reference/` and produces:
+
+  - figures/                        # figure files copied from source/
+  - tables/<label-slug>.tex         # one file per LaTeX table block
+  - bibliography-source.bib         # copy of any .bib found in source/ (Path A only)
+  - bibliography-source.bbl         # copy of any .bbl found in source/ (Path A only)
+  - .doi-cache.json                 # Crossref/ADS lookup cache for re-run idempotency
+  - index.json                      # single top-level index of everything extracted
+
+Path A (arXiv LaTeX source): reads from work/reference/source/.
+Path B (Docling fallback):   reads from work/reference/document.md and Docling's
+                             pre-existing figures/ + tables/ + metadata.json.
+
+The script handles only the deterministic pieces. Semantic interpretation —
+"what does this figure show", "which findings are central", numerical-claim
+extraction — is the agent's job after this script runs. The agent reads
+index.json (specifically extraction_warnings) and fixes or surfaces gaps.
+
+`index.json`'s `citations:` block enriches the cite-key → location mapping
+with each cited paper's full text + resolved DOI, so downstream consumers
+can do citation-key lookups for a paper's bibliography directly (no separate
+cited-papers index file).
+
+Usage:
+    python extract-paper-substrate.py [--reference-dir work/reference]
+
+Idempotent — skips files that already exist; cached DOI lookups don't
+re-hit the network on re-runs.
+"""
+
+import argparse
+import hashlib
+import json
+import os
+import re
+import shutil
+import sys
+import urllib.error
+import urllib.parse
+import urllib.request
+from difflib import SequenceMatcher
+from importlib.metadata import version as _pkg_version
+from pathlib import Path
+
+
+# ---------------------------------------------------------------------------
+# Patterns
+# ---------------------------------------------------------------------------
+
+FIGURE_BLOCK = re.compile(r"\\begin\{figure\*?\}(.*?)\\end\{figure\*?\}", re.DOTALL)
+# Tables: include AAS-specific `deluxetable` (ApJ, ApJL, ApJS) alongside the standard `table`.
+TABLE_BLOCK = re.compile(
+    r"\\begin\{(?:table|deluxetable)\*?\}(.*?)\\end\{(?:table|deluxetable)\*?\}",
+    re.DOTALL,
+)
+ABSTRACT_BLOCK = re.compile(r"\\begin\{abstract\}(.*?)\\end\{abstract\}", re.DOTALL)
+TITLE_CMD = re.compile(r"\\title\*?\s*(?:\[[^\]]*\])?\s*\{")
+# Citations: natbib family + biblatex (autocite, textcite, parencite, footcite, smartcite).
+CITE = re.compile(
+    r"\\(?:cite|citep|citet|citealp|citealt|citeauthor|citeyear|citeyearpar|"
+    r"autocite|textcite|parencite|footcite|smartcite)\*?"
+    r"(?:\[[^\]]*\]){0,2}\{([^}]+)\}"
+)
+# Derived from the installed astra-spec package so the stub `astra.yaml` always
+# stamps the version actually present in the environment — `astra validate` will
+# warn if the analysis declares a version the installed astra-spec can't honour.
+# Let PackageNotFoundError propagate: this script ships with lightcone-cli, which
+# depends on astra-spec, so a missing install is a real bug we want loud.
+ASTRA_SCHEMA_VERSION = _pkg_version("astra-spec")
+
+# Bump when the structural shape of `index.json` changes in a backwards-incompatible
+# way (a new key added is fine; renaming/reshaping an existing value breaks consumers).
+# v1: introduced explicit versioning; `citations:` value shape transitioned from
+#     `key -> [locations]` to `key -> {locations, citation, doi}`.
+INDEX_SCHEMA_VERSION = 1
+
+CROSSREF_API = "https://api.crossref.org/works"
+CROSSREF_USER_AGENT = (
+    "paper-extraction (https://github.com/LightconeResearch/lightcone-cli; "
+    "mailto:cailmdaley@gmail.com)"
+)
+ADS_API = "https://api.adsabs.harvard.edu/v1/search/query"
+NETWORK_TIMEOUT_S = 10
+# Match caption commands; the body itself is walked with balanced-brace logic so
+# nested braces and escaped braces survive intact.
+CAPTION = re.compile(r"\\caption\*?\s*(?:\[[^\]]*\])?\s*\{")
+LABEL = re.compile(r"\\label\{([^}]+)\}")
+INCLUDEGRAPHICS = re.compile(r"\\includegraphics(?:\[[^\]]*\])?\{([^}]+)\}")
+PLOTONE = re.compile(r"\\plotone\{([^}]+)\}")
+PLOTTWO = re.compile(r"\\plottwo\{([^}]+)\}\{([^}]+)\}")
+FIGURE_INPUT = re.compile(r"\\input\{([^}]+\.(?:pgf|tex|tikz))\}")
+SECTION = re.compile(r"\\(section|subsection|subsubsection)\*?\{((?:[^{}]|\{[^}]*\})*)\}")
+
+
+def line_at(content: str, offset: int) -> int:
+    """1-indexed line number of `offset` within `content`."""
+    return content.count("\n", 0, offset) + 1
+
+
+def first_match(pattern: re.Pattern, text: str) -> str | None:
+    m = pattern.search(text)
+    return m.group(1).strip() if m else None
+
+
+def extract_caption(text: str, macros: dict[str, str]) -> str:
+    """Return the last non-empty caption in a block.
+
+    Composite figures often have empty subfigure captions before the real
+    top-level caption; taking the first caption produces a false warning.
+    Balanced-brace walking preserves nested LaTeX and escaped braces inside
+    caption bodies.
+    """
+    captions = []
+    for match in CAPTION.finditer(text):
+        body = walk_balanced_braces(text, match.end() - 1)
+        if body is not None:
+            captions.append(body.strip())
+    nonempty = [caption for caption in captions if caption]
+    return expand_macros(nonempty[-1], macros) if nonempty else ""
+
+
+# ---------------------------------------------------------------------------
+# Path detection
+# ---------------------------------------------------------------------------
+
+
+def detect_path(reference_dir: Path) -> str:
+    if (reference_dir / "source").is_dir():
+        return "A"
+    if (reference_dir / "document.md").is_file():
+        return "B"
+    sys.exit(
+        f"error: neither {reference_dir}/source/ nor {reference_dir}/document.md exists "
+        f"— run paper-extraction Step 1 (substrate acquisition) first"
+    )
+
+
+# ---------------------------------------------------------------------------
+# Path A — LaTeX source
+# ---------------------------------------------------------------------------
+
+
+def list_tex_files(source_dir: Path) -> list[Path]:
+    return sorted(source_dir.rglob("*.tex"))
+
+
+# A `%` not preceded by a backslash starts a LaTeX comment running to end-of-line.
+# We strip comment *content* but keep the `\n` so line numbers are preserved.
+COMMENT = re.compile(r"(?<!\\)%[^\n]*")
+
+
+def strip_comments(content: str) -> str:
+    """Strip LaTeX comments (line content after unescaped `%`), preserving newlines."""
+    return COMMENT.sub("", content)
+
+
+# Match `\newcommand[*]{\name}{body}` — no-args form only. Args (`[2]`) are skipped.
+NEWCOMMAND = re.compile(
+    r"\\(?:newcommand|renewcommand|providecommand)\*?\s*\{?\s*\\([A-Za-z]+)\s*\}?\s*\{",
+)
+
+
+def collect_simple_macros(tex_files: list[tuple[Path, str]]) -> dict[str, str]:
+    """Build a `\\name -> body` dict for no-arg `\\newcommand` macros across the source.
+
+    Skips macros with arguments (e.g. `\\newcommand{\\foo}[2]{...}`); handling those
+    requires argument substitution, which is out of scope. Skips macros whose body is the same as
+    their name (e.g. `\\newcommand{\\foo}{\\foo}`) which would loop.
+    """
+    macros: dict[str, str] = {}
+    for _, content in tex_files:
+        for match in NEWCOMMAND.finditer(content):
+            name = match.group(1)
+            # Walk balanced braces to find the body.
+            body = walk_balanced_braces(content, match.end() - 1)
+            if body is None:
+                continue
+            # Defensive check for the args form (`\newcommand{\foo}[2]{...}`).
+            # NEWCOMMAND requires the body's `{` to follow the name directly, so
+            # an `[N]` specifier normally prevents a match altogether. If the
+            # pattern is ever loosened, skip any definition where `[N]` lies
+            # between the end of the name and the start of the body.
+            between_start = match.end(1)
+            between_end = match.end() - 1
+            between = content[between_start:between_end]
+            if re.search(r"\[\s*\d+\s*\]", between):
+                continue  # args-form, skip
+            if body.strip() == f"\\{name}":
+                continue  # self-referential
+            macros[name] = body
+    return macros
+
+
+def expand_macros(text: str, macros: dict[str, str], max_iterations: int = 5) -> str:
+    """Substitute `\\name` (where name is in `macros`) iteratively. Stops at fixed point or
+    `max_iterations` (handles nested macros, prevents infinite loops on pathological input).
+    """
+    if not text or not macros:
+        return text
+    # Match `\name` where name is in our table. Order longest-first so `\desidrone`
+    # wins over `\desi` if both exist.
+    names = sorted(macros.keys(), key=len, reverse=True)
+    pattern = re.compile(r"\\(" + "|".join(re.escape(n) for n in names) + r")(?![A-Za-z])")
+    out = text
+    for _ in range(max_iterations):
+        new = pattern.sub(lambda m: macros[m.group(1)], out)
+        if new == out:
+            return out
+        out = new
+    return out
+
+
+def read_tex_with_origin(source_dir: Path) -> list[tuple[Path, str]]:
+    """Read each .tex file (stripped of comments) in *paper-reading order*.
+
+    Order is determined by walking the main file's `\\input{}` / `\\include{}` chain.
+    The main file is the one containing `\\documentclass`. Files not reachable from
+    the input chain are appended at the end (alphabetical) as orphans.
+
+    Comments are stripped at read time to prevent commented-out LaTeX from leaking
+    into figure / table / section / citation extraction. Newlines are preserved so
+    line numbers are still meaningful.
+    """
+    paths = list_tex_files(source_dir)
+    if not paths:
+        return []
+
+    contents: dict[Path, str] = {}
+    for p in paths:
+        try:
+            contents[p] = strip_comments(p.read_text(errors="replace"))
+        except OSError as e:
+            print(f"warn: could not read {p}: {e}", file=sys.stderr)
+
+    # Find the main file (contains \documentclass, after comment stripping).
+    main = next((p for p in paths if r"\documentclass" in contents.get(p, "")), None)
+    if main is None:
+        # No main file detected — fall back to alphabetical order.
+        return [(p, contents[p]) for p in paths if p in contents]
+
+    # Map basename (without extension) → path, for resolving \input{name} or \input{path/name}.
+    by_stem: dict[str, Path] = {}
+    for p in paths:
+        by_stem.setdefault(p.stem, p)
+
+    INPUT_CMD = re.compile(r"\\(?:input|include)\{([^}]+)\}")
+    ordered: list[Path] = []
+    seen: set[Path] = set()
+
+    def walk(p: Path) -> None:
+        if p in seen or p not in contents:
+            return
+        seen.add(p)
+        ordered.append(p)
+        for match in INPUT_CMD.finditer(contents[p]):
+            target = match.group(1).strip()
+            target = target.removesuffix(".tex")
+            stem = Path(target).stem  # last path component, no extension
+            sub = by_stem.get(stem)
+            if sub is not None:
+                walk(sub)
+
+    walk(main)
+    # Append unreached files (orphans — supplementary, unused, etc.) at the end.
+    for p in paths:
+        if p not in seen and p in contents:
+            ordered.append(p)
+
+    return [(p, contents[p]) for p in ordered]
+
+
+def join_tex(tex_files: list[tuple[Path, str]]) -> str:
+    return "\n".join(content for _, content in tex_files)
+
+
+def extract_figures(
+    reference_dir: Path,
+    source_dir: Path,
+    tex_files: list[tuple[Path, str]],
+    macros: dict[str, str],
+) -> tuple[list[dict], list[str]]:
+    """Walk every figure block; copy resolved figure files; return (entries, warnings)."""
+    fig_dir = reference_dir / "figures"
+    fig_dir.mkdir(exist_ok=True)
+    entries: list[dict] = []
+    warnings: list[str] = []
+    counter = 0
+
+    for tex_path, content in tex_files:
+        for match in FIGURE_BLOCK.finditer(content):
+            counter += 1
+            block = match.group(1)
+            caption = extract_caption(block, macros)
+            label = first_match(LABEL, block)
+
+            # Capture every external figure reference in the block. Besides
+            # \includegraphics, AASTeX/emulateapj papers often use \plotone /
+            # \plottwo, while ML papers often \input Matplotlib/PGF exports.
+            # Multi-panel / subfloat figures routinely have several.
+            graphic_matches = external_figure_refs(block)
+            files_rel: list[str] = []
+            for graphic in graphic_matches:
+                resolved = resolve_graphic(source_dir, graphic)
+                if resolved:
+                    dest = fig_dir / resolved.name
+                    if not dest.exists():
+                        shutil.copy2(resolved, dest)
+                    files_rel.append(f"figures/{resolved.name}")
+                else:
+                    warnings.append(
+                        f"figure fig{counter}: \\includegraphics{{{graphic}}} could not resolve to a file in source/"
+                    )
+
+            inline_figure = bool(re.search(r"\\begin\{(?:tikzpicture|picture|pspicture)\}", block))
+            if not graphic_matches and not inline_figure:
+                warnings.append(f"figure fig{counter}: no external figure file found in block")
+            if not caption:
+                warnings.append(f"figure fig{counter}: no \\caption found")
+
+            entries.append(
+                {
+                    "id": f"fig{counter}",
+                    "label": label,
+                    "caption": caption,
+                    # Single-graphic figures keep the simple shape (the common case);
+                    # multi-graphic figures expose all panels under "files".
+                    "source_path": graphic_matches[0] if graphic_matches else None,
+                    "file": files_rel[0] if files_rel else None,
+                    "files": files_rel if len(files_rel) > 1 else None,
+                    "block_origin": str(tex_path.relative_to(source_dir)),
+                    "line": line_at(content, match.start()),
+                }
+            )
+
+    return entries, warnings
+
+
+def external_figure_refs(block: str) -> list[str]:
+    """Return external figure-like files referenced inside a figure block."""
+    refs: list[str] = []
+    refs.extend(INCLUDEGRAPHICS.findall(block))
+    refs.extend(PLOTONE.findall(block))
+    for first, second in PLOTTWO.findall(block):
+        refs.extend([first, second])
+    refs.extend(FIGURE_INPUT.findall(block))
+    # Preserve order while de-duplicating repeated panels.
+    seen: set[str] = set()
+    out = []
+    for ref in refs:
+        if ref not in seen:
+            seen.add(ref)
+            out.append(ref)
+    return out
+
+
+def resolve_graphic(source_dir: Path, graphic: str) -> Path | None:
+    """LaTeX \\includegraphics filenames can omit the extension; try common ones."""
+    base = source_dir / graphic
+    if base.exists():
+        return base
+    for ext in (".pdf", ".png", ".jpg", ".jpeg", ".eps"):
+        candidate = base.with_name(base.name + ext)  # append; with_suffix would replace text after a dot in the name
+        if candidate.exists():
+            return candidate
+    matches = sorted(source_dir.rglob(f"{Path(graphic).stem}.*"))  # sorted so the fallback pick is deterministic
+    return matches[0] if matches else None
+
+
+def extract_tables(
+    reference_dir: Path,
+    tex_files: list[tuple[Path, str]],
+    source_dir: Path,
+    macros: dict[str, str],
+) -> tuple[list[dict], list[str]]:
+    tab_dir = reference_dir / "tables"
+    tab_dir.mkdir(exist_ok=True)
+    entries: list[dict] = []
+    warnings: list[str] = []
+    counter = 0
+
+    for tex_path, content in tex_files:
+        for match in TABLE_BLOCK.finditer(content):
+            counter += 1
+            block = match.group(0)  # full \begin{table}...\end{table}
+            body = match.group(1)
+            label = first_match(LABEL, body)
+            caption = extract_caption(body, macros)
+            slug = label.replace(":", "-").replace(" ", "_") if label else f"tab{counter}"
+            out = tab_dir / f"{slug}.tex"
+            if not out.exists():
+                out.write_text(block)
+            if not caption:
+                warnings.append(f"table tab{counter}: no \\caption found")
+            if not label:
+                warnings.append(f"table tab{counter}: no \\label — wrote as {slug}.tex")
+            entries.append(
+                {
+                    "id": f"tab{counter}",
+                    "label": label,
+                    "caption": caption,
+                    "file": f"tables/{slug}.tex",
+                    "block_origin": str(tex_path.relative_to(source_dir)),
+                    "line": line_at(content, match.start()),
+                }
+            )
+
+    return entries, warnings
+
+
+def extract_outline(
+    tex_files: list[tuple[Path, str]], source_dir: Path, macros: dict[str, str]
+) -> list[dict]:
+    """Walk \\section{}, \\subsection{}, \\subsubsection{} in source order.
+
+    Attach a \\label{} only when it directly follows the section command (whitespace
+    between is fine, but no other content). The convention is `\\section{Foo}\\label{sec:foo}`
+    or with one newline between — anything more, and the label belongs elsewhere.
+    """
+    level_map = {"section": 1, "subsection": 2, "subsubsection": 3}
+    immediate_label = re.compile(r"\A\s*\\label\{([^}]+)\}")
+    out = []
+    for tex_path, content in tex_files:
+        for match in SECTION.finditer(content):
+            kind, title = match.group(1), expand_macros(match.group(2).strip(), macros)
+            tail = content[match.end() : match.end() + 200]
+            label_match = immediate_label.match(tail)
+            label = label_match.group(1) if label_match else None
+            out.append(
+                {
+                    "level": level_map[kind],
+                    "title": title,
+                    "label": label,
+                    "source_file": str(tex_path.relative_to(source_dir)),
+                    "line": line_at(content, match.start()),
+                }
+            )
+    return out
+
+
+def extract_citations(
+    tex_files: list[tuple[Path, str]], source_dir: Path
+) -> dict[str, list[dict]]:
+    """Map each citation key to every (file, line) location it's cited.
+
+    Shape: {"smith24": [{"file": "main.tex", "line": 42}, {"file": "main.tex", "line": 89}], ...}
+    """
+    out: dict[str, list[dict]] = {}
+    for tex_path, content in tex_files:
+        rel_file = str(tex_path.relative_to(source_dir))
+        for match in CITE.finditer(content):
+            line = line_at(content, match.start())
+            for key in match.group(1).split(","):
+                k = key.strip()
+                if not k:
+                    continue
+                out.setdefault(k, []).append({"file": rel_file, "line": line})
+    # Sort keys for stable output
+    return {k: out[k] for k in sorted(out)}
+
+
+def walk_balanced_braces(content: str, start: int) -> str | None:
+    """Given the index of the opening `{`, return the content between matched
+    braces (exclusive of the braces themselves), or None if unbalanced.
+    Honors escaped braces (`\\{`, `\\}`).
+    """
+    depth = 1
+    i = start + 1
+    while i < len(content) and depth > 0:
+        c = content[i]
+        if c == "\\" and i + 1 < len(content):
+            i += 2  # skip escaped char
+            continue
+        if c == "{":
+            depth += 1
+        elif c == "}":
+            depth -= 1
+        i += 1
+    if depth == 0:
+        return content[start + 1 : i - 1]
+    return None
+
+
+def extract_abstract(tex_files: list[tuple[Path, str]], macros: dict[str, str]) -> str | None:
+    """Extract abstract content. Supports two LaTeX forms:
+
+    - environment: `\\begin{abstract}...\\end{abstract}` (most journals)
+    - command:    `\\abstract{...}` (A&A's aa.cls and similar)
+    """
+    for _, content in tex_files:
+        # Form 1: environment
+        match = ABSTRACT_BLOCK.search(content)
+        if match:
+            return expand_macros(match.group(1).strip(), macros)
+
+        # Form 2: command — balanced-brace walk
+        cmd = re.search(r"\\abstract\s*\{", content)
+        if cmd:
+            body = walk_balanced_braces(content, cmd.end() - 1)
+            if body is not None:
+                return expand_macros(body.strip(), macros)
+    return None
+
+
+def extract_title(tex_files: list[tuple[Path, str]], macros: dict[str, str]) -> str | None:
+    """Extract \\title{...} (or \\title[short]{full}) content with balanced braces."""
+    for _, content in tex_files:
+        match = TITLE_CMD.search(content)
+        if match:
+            body = walk_balanced_braces(content, match.end() - 1)
+            if body is not None:
+                expanded = expand_macros(" ".join(body.split()), macros)
+                # Strip common font-style wrappers that a `\\boldmath`-prefixed title
+                # leaves behind after macro expansion (no-op if not present).
+                expanded = re.sub(r"^\\boldmath\s*", "", expanded)
+                return expanded
+    return None
+
+
+def derive_astra_id(arxiv_id: str | None, doi: str | None) -> str:
+    """Stable ASTRA id from arXiv ID or DOI. Lowercase, [a-z0-9_]+, leading letter."""
+    if arxiv_id:
+        slug = "arxiv_" + arxiv_id.replace(".", "_").replace("/", "_").lower()
+    elif doi:
+        slug = "doi_" + re.sub(r"[^a-z0-9]+", "_", doi.lower()).strip("_")
+    else:
+        slug = "paper_unknown"
+    # Ensure leading letter, only [a-z0-9_]
+    slug = re.sub(r"[^a-z0-9_]+", "_", slug)
+    if not slug or not slug[0].isalpha():
+        slug = "paper_" + slug
+    return slug
+
+
+def write_astra_yaml_stub(
+    reference_dir: Path,
+    arxiv_id: str | None,
+    doi: str | None,
+    title: str | None,
+    abstract: str | None,
+) -> str:
+    """Emit a stub `work/reference/astra.yaml` that the agent fills in.
+
+    The script populates: id, version, name, narrative.summary (from abstract),
+    inputs/outputs as empty lists, and an empty findings map. The agent's job
+    (Step 5 in SKILL.md) is to walk the paper and append findings entries with
+    quote evidence, plus a `narrative.findings:` cross-link. Once that's in,
+    `astra validate work/reference/astra.yaml` should pass.
+
+    If the file already exists, leave it alone — it may have agent edits.
+    """
+    out = reference_dir / "astra.yaml"
+    if out.exists():
+        return "astra.yaml"
+
+    astra_id = derive_astra_id(arxiv_id, doi)
+    title_str = title or "TODO: paper title (script could not extract \\title{})"
+    summary_str = abstract or "TODO: one-paragraph summary of the paper (no abstract extracted)"
+
+    # Indent the summary as a block scalar so multi-line abstracts round-trip
+    summary_indented = "\n".join("    " + line for line in summary_str.splitlines())
+
+    content = f"""# Stub ASTRA representation of the source paper.
+#
+# Populated by paper-extraction's script: id, version, name, narrative.summary.
+# The agent (paper-extraction Step 5) fills in `findings:` with the paper's
+# claimed numerical results plus a `narrative.findings:` cross-link, then runs
+# `astra validate astra.yaml` to confirm.
+
+id: {astra_id}
+version: "{ASTRA_SCHEMA_VERSION}"
+name: {json.dumps(title_str)}
+
+narrative:
+  summary: |
+{summary_indented}
+
+inputs: []
+outputs: []
+
+# Agent: append entries here, one per central numerical claim the paper makes.
+# Shape: see https://w3id.org/ASTRA/insight (Insight + Evidence). Minimal entry:
+#
+#   <id>:
+#     id: <id>
+#     claim: "<1-2 sentences capturing the result>"
+#     created_at: "<ISO 8601 datetime>"
+#     evidence:
+#       - id: <evidence_id>
+#         doi: "<paper DOI>"
+#         version: <paper version, integer>
+#         quote:
+#           exact: "<exact text from the paper that supports the claim>"
+findings: {{}}
+"""
+    # Write UTF-8 explicitly: abstracts often contain non-ASCII characters.
+    out.write_text(content, encoding="utf-8")
+    return "astra.yaml"
+
+
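+def copy_embedded_bibliography(reference_dir: Path, source_dir: Path) -> tuple[str | None, str | None]:
+    """Copy the first .bib and the first .bbl found under source/ into work/reference/.
+
+    Candidates are sorted so the pick is stable across runs. Existing destination
+    files are left untouched. Returns the destination file names relative to
+    `reference_dir`, each None when no such file was found.
+    """
+    bib_src = next(iter(sorted(source_dir.rglob("*.bib"))), None)
+    bbl_src = next(iter(sorted(source_dir.rglob("*.bbl"))), None)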
+
+    bib_rel = None
+    bbl_rel = None
+    if bib_src:
+        dest = reference_dir / "bibliography-source.bib"
+        if not dest.exists():
+            shutil.copy2(bib_src, dest)
+        bib_rel = "bibliography-source.bib"
+    if bbl_src:
+        dest = reference_dir / "bibliography-source.bbl"
+        if not dest.exists():
+            shutil.copy2(bbl_src, dest)
+        bbl_rel = "bibliography-source.bbl"
+    return bib_rel, bbl_rel
+
+
+# ---------------------------------------------------------------------------
+# Bibliography resolution — shared by Path A (.bib/.bbl) and Path B (Docling)
+# ---------------------------------------------------------------------------
+#
+# Produces a list of bibliography entries, each `{key, citation, doi}`, that
+# downstream joins against `extract_citations()`'s `{key: [locations]}` to enrich
+# the `citations:` block in `index.json`.
+#
+# Path A: parse `bibliography-source.bib` first, fall back to `.bbl`. Keys come
+# from BibTeX directly (case-sensitive, unique per-paper, identical to what the
+# tex source's `\cite{}` invocations reference).
+#
+# Path B: parse the references section at the tail of `document.md`. Docling's
+# rendered prose carries no `\cite{}` markers, so we synthesize keys from
+# first-author + year (`asgari_2017`, disambiguated with letter suffixes when
+# needed). The synthetic keys carry no `locations:` entries; citation invocations
+# from rendered prose are a separate extraction problem flagged in
+# `extraction_warnings`.
+
+
+# Parse @type{key, field = value, ...} entries. Skip @comment, @preamble, @string.
+BIB_ENTRY_HEAD = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,", re.IGNORECASE)
+DOI_IN_TEXT = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)
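+# Matches `arXiv:<id>` or a bare old-style `<archive>/<digits>` ID. The lookahead
+# keeps the archive prefix inside group(1), which `_normalize_arxiv_id` needs in
+# order to recognize pre-2007 IDs such as `astro-ph/0601001`. The group must end
+# on an alphanumeric so trailing sentence punctuation is not captured.
+ARXIV_ID_IN_TEXT = re.compile(
+    r"(?:arXiv:\s*|(?=(?:astro-ph|hep-(?:th|ph)|gr-qc|cond-mat|math|cs\.[A-Z]{2})/))"
+    r"([a-zA-Z0-9\-./]*[a-zA-Z0-9])",
+    re.IGNORECASE,
+)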
+# Newer-format arXiv IDs without prefix: 4 digits, dot, 4-5 digits, optional vN
+ARXIV_BARE = re.compile(r"\b(\d{4}\.\d{4,5})(?:v\d+)?\b")
+
+
+def parse_bib(content: str) -> list[dict]:
+    """Parse BibTeX content into a list of entries.
+
+    Each entry: `{"type": str, "key": str, "fields": {<lowercased-field>: <stripped-value>}}`.
+    Skips `@comment`, `@preamble`, `@string` (handling string macros properly would require
+    substitution; for our enrichment purposes we can live without it).
+    Field values are unwrapped from `{...}` or `"..."` and have surrounding whitespace stripped.
+    Doesn't try to interpret LaTeX accents or commands — keeps them verbatim so re-running
+    on the same input is stable.
+    """
+    entries: list[dict] = []
+    i = 0
+    while i < len(content):
+        match = BIB_ENTRY_HEAD.search(content, i)
+        if not match:
+            break
+        entry_type = match.group(1).lower()
+        key = match.group(2)
+        cursor = match.end()
+        if entry_type in ("comment", "preamble", "string"):
+            # Skip to matching closing brace
+            depth = 1
+            while cursor < len(content) and depth > 0:
+                if content[cursor] == "{":
+                    depth += 1
+                elif content[cursor] == "}":
+                    depth -= 1
+                cursor += 1
+            i = cursor
+            continue
+        fields, cursor = _parse_bib_fields(content, cursor)
+        entries.append({"type": entry_type, "key": key, "fields": fields})
+        i = cursor
+    return entries
+
+
+def _parse_bib_fields(content: str, start: int) -> tuple[dict[str, str], int]:
+    """Parse `field = value, field = value, ...}` starting at `start`.
+
+    Returns the field dict plus the offset just after the closing entry brace.
+    """
+    fields: dict[str, str] = {}
+    i = start
+    while i < len(content):
+        # Skip whitespace + commas between fields
+        while i < len(content) and content[i] in " \t\n\r,":
+            i += 1
+        if i >= len(content) or content[i] == "}":
+            return fields, i + 1
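+        # Field name. Also stop at `,` / `}` so a stray token without `=` cannot
+        # run past the end of the current entry.
+        name_start = i
+        while i < len(content) and content[i] not in " \t\n\r=,}":
+            i += 1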
+        name = content[name_start:i].strip().lower()
+        # Skip whitespace + `=`
+        while i < len(content) and content[i] in " \t\n\r":
+            i += 1
+        if i >= len(content) or content[i] != "=":
+            # Malformed entry — bail
+            return fields, _skip_to_entry_end(content, i)
+        i += 1
+        while i < len(content) and content[i] in " \t\n\r":
+            i += 1
+        # Field value: `{...}` (balanced), `"..."`, or bare token
+        value, i = _read_bib_value(content, i)
+        if name:
+            fields[name] = value
+    return fields, i
+
+
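+def _read_bib_value(content: str, i: int) -> tuple[str, int]:
+    """Read one field value starting at offset `i`.
+
+    Handles `{...}` (balanced, backslash escapes skipped), `"..."`, and bare
+    tokens (numbers, string-macro references). `#` concatenation is not supported.
+    Returns the stripped value plus the offset just past it.
+    """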
+    if i >= len(content):
+        return "", i
+    if content[i] == "{":
+        depth = 1
+        i += 1
+        start = i
+        while i < len(content) and depth > 0:
+            if content[i] == "\\" and i + 1 < len(content):
+                i += 2
+                continue
+            if content[i] == "{":
+                depth += 1
+            elif content[i] == "}":
+                depth -= 1
+                if depth == 0:
+                    break
+            i += 1
+        return content[start:i].strip(), i + 1
+    if content[i] == '"':
+        i += 1
+        start = i
+        while i < len(content) and content[i] != '"':
+            if content[i] == "\\" and i + 1 < len(content):
+                i += 2
+                continue
+            i += 1
+        return content[start:i].strip(), i + 1
+    # Bare token (number, string macro reference)
+    start = i
+    while i < len(content) and content[i] not in " \t\n\r,}":
+        i += 1
+    return content[start:i].strip(), i
+
+
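+def _skip_to_entry_end(content: str, i: int) -> int:
+    """Return the offset just past the closing brace of the entry containing `i`.
+
+    Assumes `i` sits at brace depth 1, i.e. directly inside the entry body.
+    """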
+    depth = 1
+    while i < len(content) and depth > 0:
+        if content[i] == "{":
+            depth += 1
+        elif content[i] == "}":
+            depth -= 1
+        i += 1
+    return i
+
+
+def parse_bbl(content: str) -> list[dict]:
+    """Parse a rendered `.bbl` into bibitem records.
+
+    Each `\\bibitem[label]{key}` introduces an entry whose rendered text runs
+    until the next `\\bibitem`, the closing `\\end{thebibliography}`, or
+    end-of-file. `.bbl` has no field structure, so we return `{key, raw}` and
+    the resolver mines DOI/arXiv-ID hints from `raw`.
+    """
+    bibitem = re.compile(r"\\bibitem(?:\[[^\]]*\])?\s*\{([^}]+)\}", re.DOTALL)
+    matches = list(bibitem.finditer(content))
+    # Keep the environment's closing command out of the last entry's text.
+    env_end = content.rfind("\\end{thebibliography}")
+    tail_limit = env_end if env_end != -1 else len(content)
+    out: list[dict] = []
+    for idx, match in enumerate(matches):
+        key = match.group(1).strip()
+        start = match.end()
+        end = matches[idx + 1].start() if idx + 1 < len(matches) else max(start, tail_limit)
+        raw = content[start:end].strip()
+        out.append({"key": key, "raw": raw})
+    return out
+
+
+def parse_doc_references(document_md: str) -> list[str]:
+    """Parse the references section out of Docling-rendered markdown.
+
+    Heuristic: find a heading whose text matches `References` / `Bibliography`
+    / `Citations` (case-insensitive, optional numeric prefix), take everything
+    after it up to the next heading, split on blank lines, drop empty paragraphs.
+    """
+    heading_re = re.compile(
+        r"^\s*#{1,6}\s+(?:\d+\.?\s+)?(?:references|bibliography|citations)\s*$",
+        re.IGNORECASE | re.MULTILINE,
+    )
+    match = heading_re.search(document_md)
+    if not match:
+        return []
+    body = document_md[match.end():]
+    # The references section ends at the next heading of any level
+    # (acknowledgments, appendices, supplementary material).
+    next_section = re.search(r"^\s*#{1,6}\s+\S", body, re.MULTILINE)
+    if next_section:
+        body = body[: next_section.start()]
+    # Split on blank lines into paragraphs; trim and drop empties.
+    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", body)]
+    return [p for p in paragraphs if p]
+
+
+def format_bib_citation(fields: dict[str, str]) -> str:
+    """Build a one-line human-readable citation from a parsed `.bib` entry.
+
+    Best-effort — uses what's present (author, year, title, journal, volume, page).
+    Not a publication-quality formatter; just enough to be useful as a reading
+    aid when a downstream agent sees `index.json`'s `citations:` block.
+    """
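+    # Collapse whitespace first: BibTeX allows line breaks around the `and` separator.
+    author_field = " ".join(fields.get("author", "").split())
+    author = _first_author_from_bib_field(author_field)
+    # Brace-aware check, so `{Foo and Bar Collaboration}` counts as a single author.
+    others = ", et al." if _split_first_author(author_field) != author_field else ""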
+    year = fields.get("year", "").strip()
+    title = _clean_bib_text(fields.get("title", "")).strip().rstrip(".")
+    journal = _clean_bib_text(
+        fields.get("journal", "") or fields.get("booktitle", "") or fields.get("howpublished", "")
+    ).strip()
+    volume = fields.get("volume", "").strip()
+    pages = fields.get("pages", "").strip().replace("--", "-")
+    parts = []
+    if author:
+        parts.append(f"{author}{others}")
+    if year:
+        parts.append(f"({year})")
+    if title:
+        parts.append(f"{title}.")
+    if journal:
+        tail = journal
+        if volume:
+            tail += f" {volume}"
+        if pages:
+            tail += f", {pages}"
+        parts.append(tail)
+    return " ".join(parts).strip() or _clean_bib_text(fields.get("note", "")).strip()
+
+
+def _first_author_from_bib_field(author_field: str) -> str:
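+    # Collapse whitespace so a line break next to `and` still separates authors.
+    author_field = " ".join(author_field.split())
+    if not author_field:
+        return ""
+    # Split on ' and ' but respect outer braces: `{Planck Collaboration}` stays one author.
+    first = _split_first_author(author_field).strip()
+    # Fully brace-wrapped single name w/o internal comma:
+    # `{Planck Collaboration}` -> "Planck Collaboration". Requiring the closing brace
+    # keeps names that merely start with a braced accent on the "First Last" path.
+    if first.startswith("{") and first.endswith("}") and "," not in _strip_outer_braces(first):
+        return _clean_bib_text(first.strip("{}"))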
+    # BibTeX comma form: "Last, First" (incl. `{Abdalla}, Elcio` which IS comma-form
+    # with brace-protected lastname) -> "Last, F."
+    if _has_top_level_comma(first):
+        last, _, rest = _split_at_top_level_comma(first)
+        initials = " ".join(part[0] + "." for part in _clean_bib_text(rest).split() if part)
+        return f"{_clean_bib_text(last).strip()}, {initials}".strip(", ")
+    # "First Last" -> "Last, F."
+    parts = _clean_bib_text(first).split()
+    if len(parts) == 1:
+        return parts[0]
+    last = parts[-1]
+    initials = " ".join(p[0] + "." for p in parts[:-1] if p)
+    return f"{last}, {initials}"
+
+
+def _strip_outer_braces(s: str) -> str:
+    s = s.strip()
+    while s.startswith("{") and s.endswith("}"):
+        s = s[1:-1].strip()
+    return s
+
+
+def _has_top_level_comma(s: str) -> bool:
+    depth = 0
+    for c in s:
+        if c == "{":
+            depth += 1
+        elif c == "}":
+            depth -= 1
+        elif c == "," and depth == 0:
+            return True
+    return False
+
+
+def _split_at_top_level_comma(s: str) -> tuple[str, str, str]:
+    depth = 0
+    for i, c in enumerate(s):
+        if c == "{":
+            depth += 1
+        elif c == "}":
+            depth -= 1
+        elif c == "," and depth == 0:
+            return s[:i], ",", s[i + 1:]
+    return s, "", ""
+
+
+def _split_first_author(author_field: str) -> str:
+    """Return the substring up to the first top-level ` and ` separator."""
+    depth = 0
+    i = 0
+    while i < len(author_field):
+        if author_field[i] == "{":
+            depth += 1
+        elif author_field[i] == "}":
+            depth -= 1
+        elif depth == 0 and author_field[i:i + 5].lower() == " and ":
+            return author_field[:i]
+        i += 1
+    return author_field
+
+
+def _clean_bib_text(text: str) -> str:
+    """Strip the most common BibTeX/LaTeX wrappers so citations read cleanly.
+
+    Not exhaustive — anything we don't recognize passes through verbatim.
+    """
+    if not text:
+        return ""
+    text = re.sub(r"\\(?:textit|textbf|emph|texttt|mbox|protect)\s*\{([^{}]*)\}", r"\1", text)
+    text = re.sub(r"\\(?:url|href)\s*\{[^}]*\}\s*\{?([^{}]*)\}?", r"\1", text)
+    text = text.replace("{\\&}", "&").replace("\\&", "&")
+    text = re.sub(r"\{\\['\"`^~]([a-zA-Z])\}", r"\1", text)  # {\'e} -> e (lossy but readable)
+    text = re.sub(r"\\['\"`^~]\{([a-zA-Z])\}", r"\1", text)
+    text = re.sub(r"[{}]", "", text)
+    return re.sub(r"\s+", " ", text).strip()
+
+
+def _normalize_arxiv_id(raw: str) -> str | None:
+    """Turn an `eprint` field or in-text arXiv id into a clean `YYMM.NNNNN`
+    (new-style) or `field/YYMMNNN` (pre-2007) form. Returns None if no
+    recognizable ID."""
+    raw = raw.strip().lower()
+    raw = re.sub(r"^arxiv:", "", raw)
+    raw = re.sub(r"v\d+$", "", raw)  # drop version
+    if re.match(r"^\d{4}\.\d{4,5}$", raw):
+        return raw
+    if re.match(r"^[a-z\-]+(?:\.[a-z]{2})?/\d{7}$", raw):
+        return raw
+    return None
+
+
+def _doi_from_arxiv(arxiv_id: str) -> str:
+    return f"10.48550/arXiv.{arxiv_id}"
+
+
+def _extract_doi_hints_from_text(text: str) -> tuple[str | None, str | None]:
+    """Mine free-text bibliography rendering for DOI and arXiv ID.
+
+    Returns (doi, arxiv_id), each None when not found. DOI parsing strips trailing
+    punctuation that often follows DOIs in rendered text.
+    """
+    doi = None
+    match = DOI_IN_TEXT.search(text)
+    if match:
+        doi = match.group(0).rstrip(".,;)\"'>")
+    arxiv = None
+    match = ARXIV_ID_IN_TEXT.search(text)
+    if match:
+        arxiv = _normalize_arxiv_id(match.group(1))
+    if not arxiv:
+        match = ARXIV_BARE.search(text)
+        if match:
+            arxiv = match.group(1)
+    return doi, arxiv
+
+
+# Bibitem rendering tends to start with the authors (e.g. "Asgari, M., Lin, C.-A.,...").
+# Year usually follows in `\d{4}` form. This regex is sloppy on purpose — we just want
+# `first-author` for a synthetic key or a Crossref query, not a parser.
+LASTNAME_YEAR = re.compile(r"^([A-Z][A-Za-zÀ-ÿ\-']+).*?\b(19\d{2}|20\d{2})\b", re.DOTALL)
+
+
+def parse_rendered_entry(raw: str) -> dict:
+    """Extract first-author lastname + year from a rendered citation paragraph.
+
+    Returns `{"first_author": str, "year": str, "title_guess": str, "raw_clean": str}`,
+    best effort. Used to build synthetic keys (Path B) and Crossref title queries.
+    """
+    cleaned = _clean_bib_text(raw)
+    match = LASTNAME_YEAR.match(cleaned)
+    first = match.group(1) if match else ""
+    year = match.group(2) if match else ""
+    # Title guess: take the chunk after the year up to next period that isn't an initial.
+    title_guess = ""
+    if match:
+        # Slice at the matched year rather than at the first substring occurrence
+        # of its digits, which may sit earlier inside a longer number.
+        tail = cleaned[match.end():]
+        # Drop a leading delimiter (.,) plus whitespace
+        tail = tail.lstrip(",.: ")
+        # Title ends at the first period followed by a space + capital letter that introduces
+        # journal/volume metadata. Heuristic — good enough for Crossref queries.
+        sentence_end = re.search(r"\.\s+[A-Z]", tail)
+        title_guess = tail[: sentence_end.start()] if sentence_end else tail
+    return {
+        "first_author": first.strip(),
+        "year": year,
+        "title_guess": title_guess.strip().rstrip(".").strip(),
+        "raw_clean": cleaned,
+    }
+
+
+def synth_key(first_author: str, year: str, taken: set[str]) -> str:
+    """Build a unique synthetic key for a Path B entry.
+
+    `<lastname>_<year>`, lowercased. If the name+year pair already exists in
+    `taken`, append a letter suffix (`a`, `b`, ...).
+    """
+    base = re.sub(r"[^a-z0-9]+", "_", (first_author or "anon").lower()).strip("_")
+    if not base:
+        base = "anon"
+    year = year or "ny"
+    candidate = f"{base}_{year}"
+    if candidate not in taken:
+        return candidate
+    for suffix in "abcdefghijklmnopqrstuvwxyz":
+        if f"{candidate}{suffix}" not in taken:
+            return f"{candidate}{suffix}"
+    # 26 collisions is absurd but be safe.
+    counter = 1
+    while f"{candidate}_{counter}" in taken:
+        counter += 1
+    return f"{candidate}_{counter}"
+
+
+# ---------------------------------------------------------------------------
+# DOI resolution
+
+
+class DOIResolver:
+    """Resolve a bibliography entry to a DOI string, with on-disk caching.
+
+    Resolution order, returning the first hit:
+      1. `doi:` field if present in the parsed entry.
+      2. `eprint:` field (or in-text arXiv ID) -> `10.48550/arXiv.<id>`.
+      3. Crossref bibliographic query against the cleaned title + first-author.
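+      4. ADS title search (only if an ADS key is available: the `ADS_API_TOKEN` or
+         `ADS_DEV_KEY` env var, else `~/.ads/dev_key` or `~/.config/ads/dev_key`).
+
+    Caches `(title, first_author) -> doi` for the network steps (3 and 4) in
+    `cache_path` so re-runs don't re-hit the network. Unresolved entries, including
+    those that failed on a network error, are cached as `None` too, so re-running
+    won't retry a known miss (delete the cache to force re-resolution).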
+    """
+
+    def __init__(self, cache_path: Path):
+        self.cache_path = cache_path
+        self.cache: dict[str, dict] = {}
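+        if cache_path.exists():
+            try:
+                loaded = json.loads(cache_path.read_text())
+            except (ValueError, OSError):
+                # ValueError covers malformed JSON and undecodable bytes.
+                loaded = None
+            # Ignore a cache whose top level is not a JSON object.
+            self.cache = loaded if isinstance(loaded, dict) else {}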
+        self.ads_key = self._load_ads_key()
+        self.network_failures = 0
+
+    @staticmethod
+    def _load_ads_key() -> str | None:
+        env = os.environ.get("ADS_API_TOKEN") or os.environ.get("ADS_DEV_KEY")
+        if env:
+            return env.strip()
+        for path in (Path.home() / ".ads" / "dev_key", Path.home() / ".config" / "ads" / "dev_key"):
+            if path.is_file():
+                try:
+                    return path.read_text().strip() or None
+                except OSError:
+                    pass
+        return None
+
+    def resolve(
+        self,
+        title: str,
+        first_author: str,
+        explicit_doi: str | None = None,
+        arxiv_id: str | None = None,
+    ) -> tuple[str | None, str]:
+        """Resolve to a DOI. Returns `(doi-or-None, source-tag)`.
+
+        `source-tag` is one of `doi-field`, `arxiv-eprint`, `crossref`, `ads`, `unresolved`.
+        """
+        if explicit_doi:
+            return self._normalize_doi(explicit_doi), "doi-field"
+        if arxiv_id:
+            return _doi_from_arxiv(arxiv_id), "arxiv-eprint"
+        cache_key = self._cache_key(title, first_author)
+        if cache_key in self.cache:
+            entry = self.cache[cache_key]
+            return entry.get("doi"), entry.get("source", "unresolved")
+        # Network resolution
+        doi, source = self._resolve_via_crossref(title, first_author)
+        if not doi and self.ads_key:
+            doi, source = self._resolve_via_ads(title, first_author)
+        self.cache[cache_key] = {"doi": doi, "source": source, "title": title, "first_author": first_author}
+        return doi, source
+
+    @staticmethod
+    def _normalize_doi(doi: str) -> str:
+        doi = doi.strip()
+        # Strip URL and `doi:` prefix variants
+        doi = re.sub(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", "", doi, flags=re.IGNORECASE)
+        return doi.rstrip(".,;)\"'>")
+
+    @staticmethod
+    def _cache_key(title: str, first_author: str) -> str:
+        digest = hashlib.sha256(
+            f"{title.lower().strip()}||{first_author.lower().strip()}".encode("utf-8")
+        ).hexdigest()
+        return digest[:24]
+
+    def _resolve_via_crossref(self, title: str, first_author: str) -> tuple[str | None, str]:
+        if not title:
+            return None, "unresolved"
+        query = f"{title} {first_author}".strip()
+        url = f"{CROSSREF_API}?query.bibliographic={urllib.parse.quote(query)}&rows=1"
+        try:
+            data = self._http_get_json(url)
+        except (OSError, json.JSONDecodeError):
+            # OSError covers URLError, timeouts, and connection resets mid-read.
+            self.network_failures += 1
+            return None, "unresolved"
+        items = ((data or {}).get("message", {}) or {}).get("items", []) or []
+        if not items:
+            return None, "unresolved"
+        top = items[0]
+        candidate_doi = top.get("DOI")
+        candidate_titles = top.get("title") or []
+        if not candidate_doi:
+            return None, "unresolved"
+        # Title-similarity gate: drop noisy hits where the top result clearly isn't
+        # the paper we asked about.
+        if candidate_titles and _title_similarity(title, candidate_titles[0]) < 0.55:
+            return None, "unresolved"
+        return self._normalize_doi(candidate_doi), "crossref"
+
+    def _resolve_via_ads(self, title: str, first_author: str) -> tuple[str | None, str]:
+        if not title:
+            return None, "unresolved"
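+        # Drop embedded double quotes so they cannot end the quoted phrase early.
+        safe_title = title.replace('"', " ")
+        q = f'title:"{safe_title}"'
+        if first_author:
+            safe_author = first_author.replace('"', " ")
+            q += f' author:"{safe_author}"'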
+        params = {"q": q, "fl": "doi,title", "rows": "1"}
+        url = f"{ADS_API}?{urllib.parse.urlencode(params)}"
+        try:
+            data = self._http_get_json(
+                url, headers={"Authorization": f"Bearer {self.ads_key}"}
+            )
+        except (OSError, json.JSONDecodeError):
+            # OSError covers URLError, timeouts, and connection resets mid-read.
+            self.network_failures += 1
+            return None, "unresolved"
+        docs = ((data or {}).get("response", {}) or {}).get("docs", []) or []
+        if not docs:
+            return None, "unresolved"
+        doi_list = docs[0].get("doi") or []
+        if not doi_list:
+            return None, "unresolved"
+        return self._normalize_doi(doi_list[0]), "ads"
+
+    @staticmethod
+    def _http_get_json(url: str, headers: dict[str, str] | None = None) -> dict:
+        req = urllib.request.Request(url)
+        req.add_header("User-Agent", CROSSREF_USER_AGENT)
+        req.add_header("Accept", "application/json")
+        for key, value in (headers or {}).items():
+            req.add_header(key, value)
+        with urllib.request.urlopen(req, timeout=NETWORK_TIMEOUT_S) as resp:
+            payload = resp.read()
+        return json.loads(payload.decode("utf-8", errors="replace"))
+
+    def save(self) -> None:
+        try:
+            self.cache_path.write_text(json.dumps(self.cache, indent=2, sort_keys=True))
+        except OSError as e:
+            print(f"warn: could not write DOI cache: {e}", file=sys.stderr)
+
+
+def _title_similarity(a: str, b: str) -> float:
+    """Stdlib fuzzy ratio in [0, 1]. Used to filter Crossref hits whose top result
+    isn't actually the queried paper."""
+    a_norm = re.sub(r"\s+", " ", a.lower()).strip()
+    b_norm = re.sub(r"\s+", " ", b.lower()).strip()
+    if not a_norm or not b_norm:
+        return 0.0
+    return SequenceMatcher(None, a_norm, b_norm).ratio()
+
+
+# ---------------------------------------------------------------------------
+# Top-level bibliography pipeline
+
+
+def resolve_bibliography(
+    reference_dir: Path,
+    bib_path: Path | None,
+    bbl_path: Path | None,
+    document_md: Path | None,
+    extracted_citations: dict[str, list[dict]],
+) -> tuple[dict[str, dict], list[str]]:
+    """Build the enriched `citations:` block for `index.json`.
+
+    Joins parsed bibliography entries (from `.bib`, falling back to `.bbl`) against
+    `extracted_citations` (from `extract_citations()`). Each key maps to
+    `{locations, citation, doi}`; entries the bibliography has but the source
+    never cited are dropped (would otherwise be noise); entries cited but missing
+    from the bibliography keep `citation: null` and `doi: null` and add a warning.
+    Entries parsed from `document.md` (Path B) are all kept, under synthetic keys
+    and with empty `locations`.
+    """
+    warnings: list[str] = []
+    # key -> {citation, doi_hint, arxiv_hint, title, first_author, source}
+    bib_entries: dict[str, dict] = {}
+
+    if bib_path and bib_path.is_file():
+        try:
+            parsed = parse_bib(bib_path.read_text(errors="replace"))
+        except OSError as e:
+            warnings.append(f"bibliography: could not read {bib_path.name}: {e}")
+            parsed = []
+        for entry in parsed:
+            fields = entry["fields"]
+            citation = format_bib_citation(fields)
+            # Field names are already lowercased by `_parse_bib_fields`.
+            doi_hint = fields.get("doi")
+            eprint = fields.get("eprint")
+            arxiv_hint = _normalize_arxiv_id(eprint) if eprint else None
+            bib_entries[entry["key"]] = {
+                "citation": citation or None,
+                "doi_hint": doi_hint,
+                "arxiv_hint": arxiv_hint,
+                "title": _clean_bib_text(fields.get("title", "")).strip(),
+                "first_author": _first_author_from_bib_field(fields.get("author", "")).split(",")[0],
+                "source": "bib",
+            }
+
+    if bbl_path and bbl_path.is_file() and not bib_entries:
+        # Only fall back to .bbl if no .bib gave us anything.
+        try:
+            parsed_bbl = parse_bbl(bbl_path.read_text(errors="replace"))
+        except OSError as e:
+            warnings.append(f"bibliography: could not read {bbl_path.name}: {e}")
+            parsed_bbl = []
+        for entry in parsed_bbl:
+            cleaned = _clean_bib_text(entry["raw"])
+            doi_hint, arxiv_hint = _extract_doi_hints_from_text(entry["raw"])
+            parsed_rendering = parse_rendered_entry(entry["raw"])
+            bib_entries[entry["key"]] = {
+                "citation": cleaned or None,
+                "doi_hint": doi_hint,
+                "arxiv_hint": arxiv_hint,
+                "title": parsed_rendering["title_guess"],
+                "first_author": parsed_rendering["first_author"],
+                "source": "bbl",
+            }
+
+    # Path B (document.md): synthetic keys.
+    path_b_entries: list[tuple[str, dict]] = []
+    if document_md and document_md.is_file():
+        paragraphs = parse_doc_references(document_md.read_text(errors="replace"))
+        taken: set[str] = set(bib_entries)
+        for raw in paragraphs:
+            parsed_rendering = parse_rendered_entry(raw)
+            doi_hint, arxiv_hint = _extract_doi_hints_from_text(raw)
+            key = synth_key(parsed_rendering["first_author"], parsed_rendering["year"], taken)
+            taken.add(key)
+            path_b_entries.append(
+                (
+                    key,
+                    {
+                        "citation": parsed_rendering["raw_clean"] or None,
+                        "doi_hint": doi_hint,
+                        "arxiv_hint": arxiv_hint,
+                        "title": parsed_rendering["title_guess"],
+                        "first_author": parsed_rendering["first_author"],
+                        "source": "document_md",
+                    },
+                )
+            )
+
+    resolver = DOIResolver(reference_dir / ".doi-cache.json")
+    enriched: dict[str, dict] = {}
+
+    # Path A: enrich entries cited at least once in the source.
+    for key, locations in extracted_citations.items():
+        entry = bib_entries.get(key)
+        if entry is None:
+            warnings.append(
+                f"citation {key}: cited in source but no matching entry in bibliography-source.{{bib,bbl}}"
+            )
+            enriched[key] = {"locations": locations, "citation": None, "doi": None}
+            continue
+        doi, _source = resolver.resolve(
+            entry["title"], entry["first_author"], entry["doi_hint"], entry["arxiv_hint"]
+        )
+        if doi is None:
+            warnings.append(
+                f"citation {key}: could not resolve DOI; tried doi-field, eprint-field, "
+                f"Crossref{', ADS' if resolver.ads_key else ''}"
+            )
+        enriched[key] = {
+            "locations": locations,
+            "citation": entry["citation"],
+            "doi": doi,
+        }
+
+    # Path B: every parsed entry lands in the citations block with empty locations.
+    # (Citation invocations from rendered prose are a separate substrate-extraction
+    # problem we surface in extraction_warnings rather than solve here.)
+    for key, entry in path_b_entries:
+        doi, _source = resolver.resolve(
+            entry["title"], entry["first_author"], entry["doi_hint"], entry["arxiv_hint"]
+        )
+        if doi is None:
+            warnings.append(
+                f"citation {key}: could not resolve DOI; tried doi-field, eprint-field, "
+                f"Crossref{', ADS' if resolver.ads_key else ''}"
+            )
+        enriched[key] = {
+            "locations": [],
+            "citation": entry["citation"],
+            "doi": doi,
+        }
+
+    if path_b_entries:
+        warnings.append(
+            "Path B (Docling fallback): citation invocations in rendered prose are not yet "
+            "extracted; `locations:` is empty for every Path B citation. Bibliography "
+            "entries are still resolved by DOI."
+        )
+
+    resolver.save()
+    if resolver.network_failures:
+        warnings.append(
+            f"bibliography: {resolver.network_failures} network failure(s) during DOI "
+            "resolution; affected entries cached as unresolved — delete .doi-cache.json "
+            "to retry."
+        )
+    return enriched, warnings
+
+
+# ---------------------------------------------------------------------------
+# Path B — Docling fallback
+# ---------------------------------------------------------------------------
+
+
+def extract_path_b(reference_dir: Path) -> dict:
+    """Path B: Docling already produced figures/ + tables/ + metadata.json. Build index from those."""
+    metadata_path = reference_dir / "metadata.json"
+    if not metadata_path.exists():
+        sys.exit(
+            f"error: {metadata_path} not found — Path B requires Docling output. Re-run substrate acquisition."
+        )
+    docling = json.loads(metadata_path.read_text())
+
+    astra_rel = write_astra_yaml_stub(
+        reference_dir, arxiv_id=None, doi=None, title=None, abstract=None
+    )
+
+    document_md = reference_dir / "document.md"
+    citations, bib_warnings = resolve_bibliography(
+        reference_dir,
+        bib_path=None,
+        bbl_path=None,
+        document_md=document_md if document_md.is_file() else None,
+        extracted_citations={},
+    )
+
+    extraction_warnings = [
+        "Path B (Docling fallback): title + abstract + outline not yet extracted from "
+        "document.md; that's a future refinement."
+    ]
+    extraction_warnings.extend(bib_warnings)
+
+    index = {
+        "schema_version": INDEX_SCHEMA_VERSION,
+        "path": "B",
+        "paper_pdf": "paper.pdf" if (reference_dir / "paper.pdf").exists() else None,
+        "paper_tex": None,
+        "source_dir": None,
+        "document_md": "document.md" if document_md.is_file() else None,
+        "bibliography_source_bib": None,
+        "bibliography_source_bbl": None,
+        "astra_yaml": astra_rel,
+        "title": None,  # Future refinement: parse from Docling's markdown
+        "abstract": None,  # Future refinement: parse from Docling's markdown
+        "figures": docling.get("figures", []),
+        "tables": docling.get("tables", []),
+        "outline": [],  # Future refinement: parse Docling's markdown headings
+        "citations": citations,
+        "extraction_warnings": extraction_warnings,
+    }
+    return index
+
+
+# ---------------------------------------------------------------------------
+# Main
+# ---------------------------------------------------------------------------
+
+
+def main() -> None:
+    p = argparse.ArgumentParser(
+        description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter
+    )
+    p.add_argument("--reference-dir", type=Path, default=Path("work/reference"))
+    p.add_argument("--arxiv-id", help="arXiv ID, used to populate astra.yaml id and evidence.doi")
+    p.add_argument("--doi", help="paper DOI (used when arXiv ID is unavailable)")
+    args = p.parse_args()
+
+    reference_dir = args.reference_dir
+    if not reference_dir.is_dir():
+        sys.exit(f"error: {reference_dir} not found — run paper-extraction Step 1 first")
+
+    path = detect_path(reference_dir)
+    print(f"detected path: {path}")
+
+    if path == "A":
+        source_dir = reference_dir / "source"
+        tex_files = read_tex_with_origin(source_dir)
+        if not tex_files:
+            sys.exit(f"error: no .tex content found in {source_dir}")
+
+        macros = collect_simple_macros(tex_files)
+        figures, fig_warnings = extract_figures(reference_dir, source_dir, tex_files, macros)
+        tables, tab_warnings = extract_tables(reference_dir, tex_files, source_dir, macros)
+        outline = extract_outline(tex_files, source_dir, macros)
+        raw_citations = extract_citations(tex_files, source_dir)
+        abstract = extract_abstract(tex_files, macros)
+        title = extract_title(tex_files, macros)
+        bib_rel, bbl_rel = copy_embedded_bibliography(reference_dir, source_dir)
+        citations, bib_warnings = resolve_bibliography(
+            reference_dir,
+            bib_path=reference_dir / bib_rel if bib_rel else None,
+            bbl_path=reference_dir / bbl_rel if bbl_rel else None,
+            document_md=None,
+            extracted_citations=raw_citations,
+        )
+        astra_rel = write_astra_yaml_stub(
+            reference_dir, args.arxiv_id, args.doi, title, abstract
+        )
+
+        paper_tex = reference_dir / "paper.tex"
+        index = {
+            "schema_version": INDEX_SCHEMA_VERSION,
+            "path": "A",
+            "paper_pdf": "paper.pdf" if (reference_dir / "paper.pdf").exists() else None,
+            "paper_tex": "paper.tex" if paper_tex.exists() or paper_tex.is_symlink() else None,
+            "source_dir": "source",
+            "document_md": None,
+            "bibliography_source_bib": bib_rel,
+            "bibliography_source_bbl": bbl_rel,
+            "astra_yaml": astra_rel,
+            "title": title,
+            "abstract": abstract,
+            "figures": figures,
+            "tables": tables,
+            "outline": outline,
+            "citations": citations,
+            "extraction_warnings": fig_warnings + tab_warnings + bib_warnings,
+        }
+
+        resolved_dois = sum(1 for entry in citations.values() if entry.get("doi"))
+        print(
+            f"  figures: {len(figures)}, tables: {len(tables)}, "
+            f"sections: {len(outline)}, citation-keys: {len(citations)} "
+            f"({resolved_dois} with DOI), "
+            f"title: {'yes' if title else 'no'}, abstract: {'yes' if abstract else 'no'}, "
+            f"warnings: {len(index['extraction_warnings'])}"
+        )
+    else:
+        index = extract_path_b(reference_dir)
+        print(
+            f"  figures: {len(index['figures'])}, tables: {len(index['tables'])} (from Docling), "
+            f"warnings: {len(index['extraction_warnings'])}"
+        )
+
+    index_path = reference_dir / "index.json"
+    index_path.write_text(json.dumps(index, indent=2))
+    print(f"wrote {index_path}")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/claude/lightcone/skills/ralph/SKILL.md b/claude/lightcone/skills/ralph/SKILL.md
new file mode 100644
index 00000000..f65013f2
--- /dev/null
+++ b/claude/lightcone/skills/ralph/SKILL.md
@@ -0,0 +1,195 @@
+---
+name: ralph
+description: >
+  Author a constitution — a markdown document describing a desired state for
+  autonomous iteration — and run a ralph loop against it. The skill covers
+  three modes: drafting a constitution (Study → Draft → Refine → Launch),
+  launching a loop via the bundled tmux runner, and executing a single
+  iteration from inside an active loop (survey → work → update → exit).
+  Use for any work where adaptation matters more than a fixed plan: science,
+  refactoring, exploration, long-running reproductions.
+  Triggers: "ralph", "ralph loop", "constitution", "constitute", "draft a
+  constitution", "launch ralph", "run ralph on <constitution>", "set up a
+  ralph loop".
+---
+
+# Ralph
+
+Long-running iteration toward a desired state. The substrate is a **constitution** — a markdown file describing what "done" looks like. The runner is a **ralph loop** — a tmux session that spawns a fresh worker per iteration with the constitution as system prompt.
+
+Three modes; one applies at a time:
+
+- **Authoring** — drafting a constitution from scratch. See **Authoring** below.
+- **Launching** — outside any active loop, invoking the bundled script to start one on an existing constitution. See **Launching**.
+- **Inside a loop** — the constitution is in the system prompt above; follow the **Loop** protocol. Ignore the other sections; a loop is already running.
+
+**Separation of context: if you author, you do not iterate. If you iterate, you do not author.** Authoring designs the desired state from outside; iterations close the gap from inside. The constitution stays editable across iterations, but the role is set per session.
+
+---
+
+## What a constitution is
+
+A design document with trust built in. Like a governmental constitution, it lays out principles and aspirations — not specific laws, not the current state of affairs. It is designed to outlast any single iteration and remain valid as the world changes around it. **A good constitution never says "50 files remain"** — that's a snapshot that goes stale. It says `check "grep -r 'old_pattern'"` — that's a principle that stays true until the work is done.
+
+Constitutions don't prescribe steps. They describe what the system looks like when it's right — the desired state, in both senses. Whoever works from it surveys reality, reasons about the gap, and decides what's highest value. Each iteration does this with fresh context.
+
+For deeper voice / section guidance and the discipline that keeps a constitution from sliding into a plan, see [`references/constitution.md`](references/constitution.md). For the careful-thinking rhythm that authoring usually wants (two diamonds, six stances, the funnel, the qualitative ambiguity self-check), see [`references/crafting.md`](references/crafting.md).
+
+---
+
+## Authoring
+
+1. **Study** — Read relevant files, understand existing patterns. This informs the *constitution*, not the implementation. The goal is pointers iterations will follow.
+
+2. **Draft** — Create the constitution as a markdown file. Some workflows expect it at a specific path so a runner picks it up (e.g. `/lc-from-paper` writes `constitution.md` at the reproduction workdir root); otherwise put it wherever the work lives. Start the file with this YAML frontmatter:
+
+   ```yaml
+   ---
+   status: active
+   ---
+   ```
+
+   The launcher checks this `status:` field and refuses to start unless it is `open` or `active`.
+
+3. **Refine** — Show the draft, get feedback, revise. Use `AskUserQuestion` for structured choices. Apply the qualitative ambiguity self-check from [`references/crafting.md`](references/crafting.md) — goal, constraints, success — before launching. Reach for the crafting rhythm and stances when the conversation has careful-thinking character; skip when it doesn't.
+
+4. **Launch** — Hand the constitution to the runner (see **Launching** below). The constitution stays editable while iterations run; each cycle re-reads it, so refinements between iterations are normal.
+
+### What goes in a constitution
+
+A constitution needs enough structure that an iteration landing cold can orient itself, and enough freedom that it can adapt. Common sections — use what fits, skip what doesn't, add what's missing:
+
+```markdown
+## Desired State
+What the system looks like when it's done. Invariants, quality bar,
+done-conditions. Fence the scope — what to aim for AND what to leave alone.
+
+## Context
+File paths, existing patterns, architectural constraints. Things iterations
+need to *find* but not *achieve*.
+
+## Skills
+Which skills to activate before working.
+
+## Evidence
+How to check progress — commands, test suites, grep patterns. Pointers to
+the ground truth that iterations measure themselves against.
+
+## Open Questions
+Uncertainties the user should weigh in on. Iterations add to this; the user
+resolves between loops.
+```
+
+### Authoring principles
+
+- **Constitution, not plan.** Say what the system looks like when it's right. Never describe the current state — anything that becomes false or irrelevant as work progresses doesn't belong. If a section would be outdated after one iteration, it's a snapshot — replace it with a pointer.
+- **Pointers, not snapshots.** "Check `grep -r 'old_pattern'`" not "50 files remain." Snapshots go stale; pointers stay valid across iterations.
+- **Reshape, don't accrete.** When the desired state evolves, rewrite the affected sections so the body still reads as today's desired state. Don't tack on "Round 2" or an "Amendments" appendix. The chronology lives in commits and sibling notes; the body lives in *now*.
+- **Constraints need reasons.** Bare constraints get creatively circumvented. Include enough *why* that an iteration knows when it applies.
+- **Scope is a gift.** A clear fence — "only rename, don't refactor" — saves iterations from well-intentioned drift.
+
+### Authoring anti-patterns
+
+- **Checklists.** "1. Add X, 2. Add Y" — iterations race through without judgment.
+- **Vague done.** "Make it better" — when does iteration stop?
+- **Over-specification.** Prescribing *how* instead of *what*. Trust the agent's taste.
+- **Decision logs / amendment scaffolding.** "Resolved choices", "Round 2", "v2 deltas". Turns the constitution into a process journal. Fold answers into the narrative; let commits carry the chronology.
+
+---
+
+## Launching
+
+The launcher is a shell script bundled with this skill. Inside a project (after `lc init` copies the bundle), its path is:
+
+```
+.claude/skills/ralph/scripts/ralph
+```
+
+Usage:
+
+```
+.claude/skills/ralph/scripts/ralph <constitution.md> [--backend claude|codex] [-- extra-flags...]
+```
+
+- `<constitution.md>` is the constitution file. YAML frontmatter must carry `status: open` or `status: active`; the launcher refuses to start otherwise. Termination is automatic when an iteration flips `status:` to `closed`.
+- The launcher detaches into a tmux session named `ralph-<dirname>-<basename>` and returns immediately. Attach with `tmux attach -t <session>`. A second launch with the same constitution detects the existing session and prints the attach command instead of double-starting.
+
+### Backends
+
+- `claude` (default) — each iteration runs `claude --dangerously-skip-permissions --append-system-prompt <constitution>`, which appends the constitution to the system prompt.
+- `codex` — runs `codex --dangerously-bypass-approvals-and-sandbox --config developer_instructions=<constitution>`.
+
+Set with `--backend codex` or `RALPH_BACKEND=codex`.
+
+### Extra flags
+
+Anything after a literal `--` separator forwards to the backend unchanged. Common Claude-backend flags:
+
+- `--chrome` — Claude-in-Chrome integration for iterations that need live browser access.
+- `--model <id>` — override the backend model.
+
+### Examples
+
+```bash
+# Launch on a per-paper reproduction constitution
+.claude/skills/ralph/scripts/ralph constitution.md
+
+# Codex backend
+.claude/skills/ralph/scripts/ralph constitution.md --backend codex
+
+# Claude backend with Chrome integration and a model override
+.claude/skills/ralph/scripts/ralph constitution.md -- --chrome --model claude-opus-4-6
+```
+
+---
+
+## Loop
+
+1. **Survey** — Fresh eyes. Read the constitution and the workdir's `CLAUDE.md`. Check `git log`, glance at sub-fibers or notes the prior iteration left, look at what's actually in the workdir.
+2. **Work** — Stay and work from the vantage point the survey built. Make 1–3 substantial contributions; don't try to clear the queue in one iteration.
+3. **Update** — Before exiting: commit your work; update `CLAUDE.md`'s accumulators (Paper-vs-code disagreements, Open opportunities — whichever the project carries) if anything sharpened; sharpen the constitution body itself if a fact stable enough to belong in *Context* or *Desired State* landed.
+4. **Exit** — `kill $PPID`.
+
+### Earn the vantage point
+
+The survey is a fixed cost; exploit the warm world-model rather than rebuilding it next iteration. Exit when the next valuable move needs a different mental workspace — not when one task ends. If changes so far have been small and runway is plentiful, expand the workspace rather than exit.
+
+**Exit before context is half-full.** Don't wait for "filling" to feel pressing — the right moment is the next sub-task boundary after you cross half. Write the handoff (commits, accumulator updates, constitution sharpening) from full attention and exit; don't try to cram one more thing in. The marginal step you'd squeeze in costs the next iteration more than it saves you, because it pays for the degraded handoff.
+
+### Iteration rules
+
+**State, not checklist.** The constitution describes what "done" looks like. Survey reality, decide what's highest value, work on that.
+
+**Discoverable updates.** Commits, files in the workdir, `CLAUDE.md` accumulators — not progress notes scattered in the body. The next iteration finds what changed by inspecting the system.
+
+**Pointers, not snapshots.** If you learn something stable, update the constitution's *Context* or *Desired State*. Don't leave drive-by notes in the body.
+
+**You have authority.** Trust the constitution. Don't ask permission. Make substantial contributions. Don't avoid ambitious solutions just because they span multiple iterations — the loop continues; tweaks on the next iter are cheap.
+
+**File uncertain decisions** somewhere the user will see them. The convention varies by project: an `open-questions.md` file the constitution points at, an `Open Questions` section in the constitution itself, a `-t question` felt fiber when felt is in use. Don't sediment them in invisible places.
+
+### Long-running jobs
+
+If an iteration kicks off computation (snakemake, cluster jobs, container builds, dev servers), use the `Monitor` tool to stream events from the background process — each stdout line surfaces as a notification, so you'll get pinged when something happens without polling-with-sleep. For one-shot "wait until done," use Bash with `run_in_background` and you'll be notified on completion. Either way, shepherd computation to completion before exiting. Don't fire-and-forget.
+
+### Exit
+
+Closing the constitution (`status: closed` in frontmatter) stops the loop — no further iterations will run. So the closing decision is reserved for a cold survey that finds nothing left to do.
+
+**If you made any changes this iteration, you may not close the constitution.** Commit, update the workdir, `kill $PPID` — let the next iteration survey with fresh eyes and decide whether to close. This is the only hard rule on exit.
+
+Making changes does NOT mean you should exit early. Keep working while the context is warm — make as many changes as belong in this iteration. The rule only constrains *closing the constitution*, not the length of the iteration. See **Earn the vantage point** above.
+
+- **Made changes this iteration** → `kill $PPID` when the warm context is spent. Do not close the constitution.
+- **Survey found zero remaining work AND you made zero changes** → flip the constitution's frontmatter `status:` to `closed`, append a closing summary to the body or a sibling notes file recording what landed, then `kill $PPID`. The launcher's next check fails and the loop terminates.
+
+---
+
+## References
+
+- [`references/constitution.md`](references/constitution.md) — depth on drafting voice, sections, and the discipline that keeps a constitution from drifting into a plan.
+- [`references/crafting.md`](references/crafting.md) — two-diamonds rhythm, six stances, the funnel ledger, and the qualitative ambiguity self-check. Use this when the conversation has careful-thinking character — not every authoring session needs it, but the ones that do are the ones that benefit most.
+
+---
+
+Loop pattern adapted from [Ralph Wiggum](https://ghuntley.com/ralph/).
diff --git a/claude/lightcone/skills/ralph/references/constitution.md b/claude/lightcone/skills/ralph/references/constitution.md
new file mode 100644
index 00000000..39eb28b5
--- /dev/null
+++ b/claude/lightcone/skills/ralph/references/constitution.md
@@ -0,0 +1,133 @@
+# Constitution — depth reference
+
+Drafting a constitution. The SKILL body's **Authoring** section covers the procedural backbone (Study → Draft → Refine → Launch). This reference goes deeper on voice, sections, and the discipline that keeps a constitution from sliding into a plan.
+
+The constitution itself is just a markdown file with YAML frontmatter that a runner reads on each iteration. The bundled runner is `scripts/ralph` (next to this skill); other dispatchers can read the same markdown shape. The runner is interchangeable; the constitution is what matters.
+
+---
+
+## What a constitution is
+
+A constitution is a design document with trust built in. Like a governmental constitution, it lays out principles and aspirations — not specific laws, not the current state of affairs. It is designed to outlast any single iteration and remain valid as the world changes around it.
+
+**A good constitution never says "50 files remain"** — that is a snapshot that goes stale. It says `check "grep -r 'old_pattern'"` — that is a principle that stays true until the work is done.
+
+Constitutions do not prescribe steps. They describe what the system looks like when it is right — the desired state, in both senses of the word. Nothing in the constitution should become confusing or unnecessary as the desired state is reached. Whoever works from it surveys reality, reasons about the gap, and decides what is highest value. Each iteration of the work does this with fresh context.
+
+**Constitution, not plan.** Plans assume you know the path; constitutions trust the agent to find it — with taste, judgment, and fresh eyes each time. This matters most in science and exploratory work, where each decision is informed by the result just before it.
+
+**Separation of context: if you craft, you never do the work yourself.** The constitution is designed by one role; iterations are run by another.
+
+---
+
+## When to write a constitution
+
+- Work where adaptation matters more than a fixed plan: scientific investigation, exploratory refactoring, creative writing.
+- The desired state is clear (or can be made clear) but the path is not.
+- Iterations need to re-read with fresh context and make judgment calls.
+- A checklist would either be wrong after one step or race through without judgment.
+
+Don't write a constitution for: clearly-scoped atomic tasks, anything where a checklist or a plan is genuinely the right shape.
+
+---
+
+## Workflow (deeper)
+
+### 1. Study
+
+Read relevant files, understand existing patterns. This informs the **constitution**, not implementation — the goal is pointers that iterations will follow, not a head start on the work.
+
+### 2. Draft
+
+Create the constitution as a markdown file with `status: active` in YAML frontmatter (that's what the launcher checks). Some workflows expect a specific path so a runner picks it up — e.g. `/lc-from-paper` writes `constitution.md` at the reproduction workdir root. Otherwise put it wherever the work lives. The section block in the SKILL's "What goes in a constitution" is your starting shape; fill what fits, drop what doesn't.
+
+Use the crafting process from [`crafting.md`](crafting.md):
+
+- **Wonder → Ontology:** what IS the desired state? Name it precisely.
+- **Design → Delivery:** what sections does this constitution need? Which are pointers vs snapshots?
+
+Stances that help most during constitution drafting:
+
+- **Ontologist** for naming the desired state ("what IS 'done' here?")
+- **Simplifier** for fencing scope ("what are we explicitly leaving alone?")
+- **Contrarian** for pressure-testing whether the whole framing is right
+- **Architect** when the constitution is about refactoring structure
+
+### 3. Refine
+
+Show the draft, get feedback, revise. Use `AskUserQuestion` for structured choices. Apply the qualitative ambiguity self-check from [`crafting.md`](crafting.md) — goal, constraints, success — before launching.
+
+Repeat until it feels solid. It does not have to be complete; open questions belong in the Open Questions section.
+
+### 4. Launch
+
+When approved, hand to a runner. Bundled option: `.claude/skills/ralph/scripts/ralph my-constitution.md`. The runner re-reads the constitution each iteration, so refinements between iterations are normal.
+
+---
+
+## Constitutional sections
+
+A constitution needs enough structure that an iteration landing cold can orient itself, and enough freedom that it can adapt. Common sections — use what fits, skip what does not, add what is missing:
+
+```markdown
+## Desired State
+What the system looks like when it is done. Invariants, quality bar,
+done-conditions. Fence the scope — what to aim for AND what to leave alone.
+
+## Context
+File paths, existing patterns, architectural constraints. Things iterations
+need to *find* but not *achieve*.
+
+## Skills
+Which skills to activate before working.
+
+## Evidence
+How to check progress — commands, test suites, grep patterns. Pointers to
+ground truth that iterations measure themselves against.
+
+## Open Questions
+Uncertainties the user should weigh in on. Iterations add to this; the user
+resolves between loops.
+```
+
+---
+
+## Principles (deeper)
+
+**Pointers, not snapshots.** `check "grep -r 'old_pattern'"` not "50 files remain." Snapshots go stale; pointers stay valid across iterations. This is the constitutional principle: write what remains true until the work is done.
+
+**Reshape, don't accrete.** When the desired state evolves — testing surfaces a gap, a meeting changes the priority, a sibling decision lands — rewrite the affected sections so the body still reads as today's desired state. Don't tack on a "Round 2" section; don't add an "Amendments" appendix; don't keep the old framing alongside the new one as a sediment. A green-field constitution will change a lot as it matures, and a mature one will keep changing as reality does. The chronology lives in the runner's history surface (commits, sibling notes); the body lives in *now*.
+
+**Prefer existing systems.** Before designing anything new: can what is there handle this?
+
+**Constraints need reasons.** Bare constraints get creatively circumvented. Include enough *why* that an iteration knows when it applies.
+
+**Scope is a gift.** A clear fence — "only rename, don't refactor" — saves iterations from well-intentioned drift. Explicit scope frees the agent to work confidently within it.
+
+---
+
+## Constitutions that shape artifacts
+
+Some constitutions do not build code — they shape artifacts like documentation or research narratives. These have different rhythms:
+
+- **The desired state is comprehension, not correctness.** "A reviewer can follow the narrative cold" is harder to test than "all tests pass" — but it is the right bar. Evidence for progress: fewer redundant plots, clearer prose, more natural flow.
+- **The artifact continues to grow.** Unlike a refactoring (which finishes), a research narrative keeps acquiring nodes. The constitution shapes how growth presents itself, not when growth stops.
+
+---
+
+## Anti-patterns
+
+- **Checklists.** "1. Add X, 2. Add Y" — iterations race through without judgment.
+- **Vague done.** "Make it better" — when does iteration stop? What would a reader see?
+- **Over-specification.** Prescribing *how* instead of *what*. Trust the agent's taste.
+- **Snapshot language.** "Currently 50 files" — will be wrong after one iteration.
+- **Immutable seed.** Not our shape. The constitution is meant to be edited between iterations; do not treat it as frozen.
+- **Numerical convergence.** "Iteration stops when similarity ≥ 0.95" — wrong shape for science. Stop when the Evidence section says the desired state has been reached.
+- **Decision logs in the body.** "Resolved choices" / "Decisions made" / "Process notes" sections turn the constitution into a process journal. When a question gets answered (in conversation, via `AskUserQuestion`, in a review), fold the answer into the narrative where it is contextually relevant — into Invariants, Desired State, Context — and let the runner's chronological surface (commits, sibling notes) carry the chronology. The constitution describes *what is*, not *how we got here*; an "Open Questions" section that has been fully resolved should be deleted, not left as a victory log.
+- **Amendment scaffolding.** "Round 2", "v2 deltas", "Updates 2026-05-04 →", "Second round amendments". The same failure as a decision log, played out across edits: the body becomes a sediment of layered framings instead of the current desired state. When the desired state shifts, *reshape* the affected sections — rewrite headings, update prose, drop what no longer applies — so the document still reads as one coherent description of now. The story of how it got here is what commits and the outcome blurb are for.
+
+---
+
+## When crafting lands here
+
+The crafting rhythm in [`crafting.md`](crafting.md) applies to all careful interactive thinking; this reference kicks in when the target artifact is specifically a constitution. The diamonds do most of the work — the funnel mechanic used for open-ended exploration is not the primary move here, because there is already one specific artifact being produced. See the Workflow section above for which stances help most at each drafting phase.
diff --git a/claude/lightcone/skills/ralph/references/crafting.md b/claude/lightcone/skills/ralph/references/crafting.md
new file mode 100644
index 00000000..9bc44cc0
--- /dev/null
+++ b/claude/lightcone/skills/ralph/references/crafting.md
@@ -0,0 +1,181 @@
+# Crafting
+
+How to help the user think through something that hasn't crystallized, and turn the result into structured commitments — fields on an `astra.yaml` if you're inside an analysis (decisions with excluded options, evidence pointers, scoped findings), or inline structure in the constitution itself.
+
+Use it when the user is deciding something non-trivial, scoping a sub-analysis, drafting a living spec, or talking through an open question — any time careful interactive thinking is happening and the output can land in structured form.
+
+The rhythm is two diamonds: first understand what the thing IS, then decide what to DO about it. Each diamond diverges to explore and converges to commit. The ontological question — *what IS this, really?* — is the convergence point of the first diamond, and it is the most practical question you can ask.
+
+```
+    ◇ Wonder              ◇ Design
+   ╱  (diverge)          ╱  (diverge)
+  ╱    surface          ╱    alternatives
+ ╱     questions       ╱     trade-offs
+●─────────────────────●─────────────────────●
+ ╲                     ╲
+  ╲    crystallize      ╲    commit
+   ╲   the name          ╲   with reasons
+    ◇  (converge)         ◇  (converge)
+    Ontology              Delivery
+```
+
+Diamond 1 diverges into questions and converges on a name (*"this IS a decision about covariance estimation"*). Diamond 2 diverges into alternatives and converges on a commit (a default with `excluded_reason` for each rejection). The second diamond inherits the ontological commit from the first.
+
+---
+
+## The two diamonds
+
+### Diamond 1: Wonder → Ontology
+
+**Wonder (diverge).** What are we actually trying to figure out? Surface questions, assumptions, ambiguities. Do not propose answers yet. If the user is already pitching solutions, back them up to the question.
+
+**Ontology (converge).** What IS this, really? Crystallize into a claim, decision, or question specific enough to act on. The convergence is complete when you can **name** the thing precisely — "this is a decision about covariance estimation" or "this is a question about whether leakage matters below ℓ=100." A good name is often the entire output of Diamond 1.
+
+**Output of Diamond 1:** a stub with a real name and at least one structural placeholder — a decision label, an insight claim, or input/output IDs. Not a full block — just the hook that identifies what kind of thing this is.
+
+### Diamond 2: Design → Delivery
+
+**Design (diverge).** What are the real alternatives? For each, what would make it right or wrong? Trade-offs, excluded options, edge cases. This is where the Contrarian and Simplifier stances are most useful.
+
+**Delivery (converge).** Commit to a default, write the `excluded_reason` for each rejected option, identify inputs and outputs, stage the evidence. The structure is now formalizable.
+
+**Output of Diamond 2:** structured fields populated — `decisions` with options and default, `inputs`/`outputs` with IDs and types, `insights` with claim and evidence (in `astra.yaml` or in the constitution itself).
+
+The two diamonds are sequential but the boundary is soft. If you find yourself naming alternatives before the thing is clear, back up to the ontology convergence point. If you converge too early on "this is a decision" when it is actually a question, the Design phase will feel forced — that is the cue to re-enter Wonder.
+
+---
+
+## Stances
+
+Six lightweight lenses for when the conversation needs pressure. **Default is no stance** — straight conversation. Invoke a stance when pressure would help, announce it in one sentence, drop it when it has done its work. Do not stack or pipeline them.
+
+### Socratic — *"What are you assuming?"*
+
+Question-only. Never proposes answers. Surfaces the assumptions under the user's framing.
+
+- What are you assuming is true that might not be?
+- What would make option A right vs option B? What is the actual fork?
+- If you had to write the `excluded_reason` for the option you are about to reject, what would it say?
+
+**Use in Wonder and early Design.** When the user is about to commit to a path and you want the reasons made explicit.
+
+### Ontologist — *"What IS this, really?"*
+
+Pushes on definition before mechanism. Four questions:
+
+1. **Essence** — what is the true nature, stripping away accidental properties?
+2. **Root cause or symptom** — is this the fundamental issue or a surface effect?
+3. **Prerequisites** — what must exist first for this even to make sense?
+4. **Hidden assumptions** — what implicit beliefs is the framing resting on?
+
+**Use at the Ontology convergence point.** When a word is doing heavy lifting and may mean different things in different sentences.
+
+### Contrarian — *"What if the opposite were true?"*
+
+Challenges premises, not details.
+
+- What if the choice does not actually matter for your signal?
+- What if the constraint you are designing around is not real?
+- What if the simplest version is already good enough?
+
+**Use in Design.** When the conversation is burning effort on a distinction that may not matter, or a third option (do nothing, use the default) is being ignored.
+
+### Simplifier — *"Is this complexity earning its keep?"*
+
+YAGNI, concrete first, data over code.
+
+- What can we remove without losing the core value?
+- What is the simplest version that would work?
+- Can a data structure replace this logic?
+
+**Use in Design and early Delivery.** When the design is drifting toward over-engineering or a feature list is growing without anchoring reasons.
+
+### Researcher — *"What do we actually know?"*
+
+Evidence before interpretation. Especially useful for scientific work where a claim needs to be defensible.
+
+- What does the actual source say, not what we remember?
+- What would count as evidence here? What would falsify the claim?
+- What is the most specific claim we can make with the data in hand?
+
+**Use in Delivery.** When an insight needs a defensible claim, or when the user is about to write an outcome that is stronger than the evidence supports.
+
+### Architect — *"If we started over, would we build it this way?"*
+
+Structural root cause. The question behind the question when friction keeps recurring.
+
+- Is the same problem showing up in different forms?
+- Which abstraction does not match reality?
+- What assumption was wrong from the start?
+
+**Use when a debate keeps returning.** The user is circling a decision they have already made three times and cannot stick to — the real question is probably structural, not tactical.
+
+---
+
+## The funnel
+
+When the conversation is exploratory — no single topic, things are accumulating — keep a private running ledger of what is falling out, classified by destination:
+
+| Item kind | What it looks like | Destination |
+|-----------|--------------------|-------------|
+| **Decision** | A choice between real alternatives | `decisions` block in `astra.yaml` / spec |
+| **Finding** | A claim with at least the start of evidence | `findings` block in `astra.yaml` / spec |
+| **Sub-analysis** | "Compute X from Y" with identifiable inputs/outputs | New `astra.yaml` sub-analysis |
+| **Question** | An open thread worth tracking, not yet answered | "Open Questions" section of the constitution |
+| **CLAUDE.md change** | A pattern or gotcha that belongs in project memory | Edit CLAUDE.md |
+
+The ledger is your own working memory. **Do not surface it mid-conversation** unless the user asks or a flush cue fires.
+
+**Flush cues:**
+
+- User says "OK we should write this down" or similar
+- Three or more items have accumulated and the topic is about to shift
+- A natural pause after a decision or finding lands
+
+On flush, present the ledger grouped by destination, then file with the user's assent. If the user declines an item, discard it without argument.
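+
+The ledger has no prescribed format. As one way to picture a flush, the items could be grouped by destination like this; the entries and key names are hypothetical placeholders, not a required schema:
+
+```yaml
+# Hypothetical flush presentation, grouped by destination.
+# Nothing here is a required format; the ledger is working memory.
+astra.yaml:
+  decisions:
+    - "Covariance estimation: which estimator is the default?"
+  sub_analyses:
+    - "Compute X from Y"
+constitution:
+  open_questions:
+    - "Does leakage matter below ℓ=100?"
+CLAUDE.md:
+  - "placeholder: a gotcha worth keeping in project memory"
+```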
+
+---
+
+## Qualitative ambiguity self-check
+
+Before committing to a path — filing a decision, launching an iteration loop, sealing an outcome — check three things qualitatively. **No scoring, no thresholds.** If any feels fuzzy, resolve it with AskUserQuestion.
+
+1. **Goal.** Is what the user wants specific enough that two competent people would build the same thing from it? If not, what would pin it down?
+2. **Constraints.** Are the limits named? What cannot change, what must be preserved, what would break everything? Missing constraints tend to show up as "oh wait, we also need…" after the commit.
+3. **Success.** How will we know it is done or right? What is the evidence condition? Qualitative is fine ("a reviewer can follow the narrative cold"), but it has to be checkable.
+
+When one is fuzzy, use AskUserQuestion with concrete options rather than open prose questions. Iterate until the answer is "yeah, that's it." **Stop when the fuzziness resolves, not when a score crosses a threshold.** Scores on qualitative priors add false precision; the honest signal is whether the user knows what they want.
+
+This is a mirror, not a gate. If the user wants to file anyway with one dimension still fuzzy, file it — the fuzziness itself can live in an Open Questions section, and future iterations can refine it.
+
+---
+
+## Mapping outputs to structure
+
+What comes out of the diamonds maps onto wherever you keep structured commitments:
+
+| Diamond output | Destination |
+|----------------|-------------|
+| Wonder questions left open | "Open Questions" section in the constitution |
+| Ontology convergence — "this IS a decision about X" | A `decisions.<key>.label` entry — in `astra.yaml` or in the constitution body |
+| Design alternatives with trade-offs | `decisions.<key>.options`; rejected options get `excluded_reason` |
+| Delivery — the commit | `decisions.<key>.default` |
+| Finding at end of Delivery | `findings.<key>` with `claim` + `evidence` (or in `astra.yaml`) |
+| Sub-analysis scope | New sub-analysis in `astra.yaml` |
+| Process-level lesson that generalizes | Edit to project CLAUDE.md |
+
+The same shapes apply directly inline in `astra.yaml` or the constitution itself; no separate substrate is required.
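+
+As an illustration, the key paths in the table could land in `astra.yaml` roughly as follows. This is a sketch under assumptions: the keys (`covariance_estimation`, `leakage_floor`), the option IDs, and the exact nesting are hypothetical, and the ASTRA spec defines the real schema.
+
+```yaml
+# Hypothetical sketch; key names, option IDs, and nesting are assumptions.
+decisions:
+  covariance_estimation:            # hypothetical <key>
+    label: "Covariance estimation"  # the name from the Ontology convergence
+    options:                        # Design alternatives
+      analytic: {}                  # kept as the default below
+      jackknife:
+        excluded_reason: "placeholder: why this option was rejected"
+    default: analytic               # the Delivery commit
+findings:
+  leakage_floor:                    # hypothetical <key>
+    claim: "placeholder: the most specific claim the data supports"
+    evidence: "placeholder: pointer to the supporting output"
+```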
+
+---
+
+## Anti-patterns
+
+- **Ambiguity gates.** Do not withhold help until the user clarifies N dimensions. The self-check is a mirror, not a door.
+- **Numerical scoring.** Do not introduce 0–1 clarity scores with thresholds. The underlying signal is qualitative and the number adds false precision.
+- **Stance pipelines.** Do not run Socratic → Ontologist → Contrarian in sequence. Pick one when it helps; drop it when it has.
+- **Mandatory interview.** No prepared question list. Stances are responsive to the actual conversation.
+- **Surfacing the ledger too early.** A single item is not a flush. Wait for accumulation or a pause.
+- **Immutable outputs.** Nothing filed here is locked. Everything is editable; reversals are normal.
+- **Nine-minds overload.** Six stances is already generous. Add more only when a specific gap shows up, never preemptively.
+- **Interrogation without a ceiling.** Three questions is usually enough. If the user is getting irritated, stop asking and file what you have.
+- **Converging before the name is clear.** If Diamond 2 feels forced, Diamond 1 has not finished. Back up.
diff --git a/claude/lightcone/skills/ralph/scripts/ralph b/claude/lightcone/skills/ralph/scripts/ralph
new file mode 100755
index 00000000..9625e8d1
--- /dev/null
+++ b/claude/lightcone/skills/ralph/scripts/ralph
@@ -0,0 +1,145 @@
+#!/bin/bash
+# Run a ralph loop on a constitution file.
+#
+# Loops while the constitution's YAML frontmatter `status:` is `open` or
+# `active`. Each iteration starts a fresh Claude (or Codex) session with
+# the constitution injected as the system prompt; the worker surveys,
+# works, commits, and exits via `kill $PPID`. Termination is by an
+# iteration flipping `status:` to `closed` on a cold survey.
+#
+# Usage:
+#   ralph <constitution.md> [--backend claude|codex] [-- extra-flags...]
+#
+# Default backend: claude. Override with --backend codex or RALPH_BACKEND=codex.
+
+set -e
+
+SPEC_FILE="${1:?Usage: ralph <constitution.md> [--backend claude|codex] [-- extra-flags...]}"
+shift
+
+BACKEND="${RALPH_BACKEND:-claude}"
+if [[ "$1" == "--backend" ]]; then
+    BACKEND="$2"
+    shift 2
+fi
+
+EXTRA_FLAGS=""
+if [[ "$1" == "--" ]]; then
+    shift
+    EXTRA_FLAGS="$*"
+fi
+
+# Resolve to absolute path
+SPEC_FILE="$(cd "$(dirname "$SPEC_FILE")" && pwd)/$(basename "$SPEC_FILE")"
+
+if [[ ! -f "$SPEC_FILE" ]]; then
+    echo "Constitution file not found: $SPEC_FILE"
+    exit 1
+fi
+
+SESSION="ralph-$(basename "$(dirname "$SPEC_FILE")")-$(basename "$SPEC_FILE" .md)"
+WORK_DIR="$(dirname "$SPEC_FILE")"
+
+# Anchor status check to the YAML frontmatter so body prose describing
+# this very check ("status: open|active") can't self-match.
+check_status() {
+    head -50 "$SPEC_FILE" | awk '/^---$/ { if (++n == 2) exit; next } n == 1' | grep -qiE '^status:[[:space:]]*(open|active)'
+}
+
+if ! check_status; then
+    echo "Constitution $SPEC_FILE must have YAML frontmatter status: open or active."
+    echo "  Fix: add"
+    echo "         ---"
+    echo "         status: active"
+    echo "         ---"
+    echo "       at the top of the file."
+    exit 1
+fi
+
+# Refuse to double-launch
+if tmux has-session -t "$SESSION" 2>/dev/null; then
+    echo "Ralph already running: $SESSION"
+    echo "  Attach: tmux attach -t $SESSION"
+    exit 0
+fi
+
+# Write loop script to temp file (avoids heredoc quoting hell)
+LOOP_SCRIPT=$(mktemp "${TMPDIR:-/tmp}/ralph-loop.XXXXXX")
+cat > "$LOOP_SCRIPT" << 'LOOP'
+#!/bin/bash
+SPEC_FILE="$1"
+WORK_DIR="$2"
+BACKEND="$3"
+EXTRA_FLAGS="$4"
+
+iteration=0
+
+check_status() {
+    head -50 "$SPEC_FILE" | awk '/^---$/ { if (++n == 2) exit; next } n == 1' | grep -qiE '^status:[[:space:]]*(open|active)'
+}
+
+while check_status; do
+    cd "$WORK_DIR"
+    iteration=$((iteration + 1))
+    echo ""
+    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+    echo "Ralph iteration $iteration — $(date '+%H:%M:%S')"
+    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+
+    SPEC_CONTENT=$(cat "$SPEC_FILE")
+
+    SYSPROMPT_FILE=$(mktemp "${TMPDIR:-/tmp}/ralph-sys.XXXXXX")
+    PROMPT_FILE=$(mktemp "${TMPDIR:-/tmp}/ralph-prompt.XXXXXX")
+
+    cat > "$SYSPROMPT_FILE" << SYSEOF
+Ralph iteration $iteration. Constitution: $SPEC_FILE
+
+$SPEC_CONTENT
+SYSEOF
+
+    cat > "$PROMPT_FILE" << 'PROMPTEOF'
+You are inside a ralph loop — meditative iteration toward a desired state. Activate the ralph skill and follow its Loop protocol against the constitution above. The workdir's CLAUDE.md auto-loads; read it on entry.
+PROMPTEOF
+
+    PROMPT=$(cat "$PROMPT_FILE")
+
+    if [[ "$BACKEND" == "codex" ]]; then
+        codex --dangerously-bypass-approvals-and-sandbox \
+            --config "developer_instructions=$(cat "$SYSPROMPT_FILE")" \
+            $EXTRA_FLAGS \
+            "$PROMPT"
+    else
+        claude --dangerously-skip-permissions \
+            $EXTRA_FLAGS \
+            --append-system-prompt "$(cat "$SYSPROMPT_FILE")" \
+            <<< "$PROMPT"
+    fi
+
+    rm -f "$SYSPROMPT_FILE" "$PROMPT_FILE"
+
+    echo "--- Iteration complete ---"
+    sleep 2
+done
+
+echo ""
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+echo "Ralph complete — $iteration iterations"
+echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+echo ""
+echo "Session kept open for inspection. Type exit to close."
+exec bash -l
+LOOP
+
+chmod +x "$LOOP_SCRIPT"
+
+echo "Starting ralph on $SPEC_FILE"
+echo "  Backend:  $BACKEND"
+echo "  Work dir: $WORK_DIR"
+[[ -n "$EXTRA_FLAGS" ]] && echo "  Flags:    $EXTRA_FLAGS"
+
+# Launch tmux with a login shell running the loop script
+tmux new-session -d -s "$SESSION" -c "$WORK_DIR" \
+    bash -l "$LOOP_SCRIPT" "$SPEC_FILE" "$WORK_DIR" "$BACKEND" "$EXTRA_FLAGS"
+
+echo "  Session:  $SESSION"
+echo "  Attach:   tmux attach -t $SESSION"
diff --git a/claude/lightcone/templates/CLAUDE.md b/claude/lightcone/templates/CLAUDE.md
index 325f81e9..ecfd8428 100644
--- a/claude/lightcone/templates/CLAUDE.md
+++ b/claude/lightcone/templates/CLAUDE.md
@@ -2,10 +2,7 @@
 
 ASTRA analysis project, orchestrated by lightcone-cli.
 
-**Source of truth:**
-- `astra.yaml` — the analysis specification
-- `.claude/guides/astra-reference.md` — astra.yaml spec syntax
-- `.claude/guides/lightcone-cli-reference.md` — `lc` CLI commands, workflow, status, failures
+The single source of truth for this analysis is `astra.yaml`. Spec syntax and CLI workflow live in the `/astra` and `/lc-cli` reference skills (named in the session-start primer; invoke when you need depth).
 
 ### Quick Start
 
@@ -16,7 +13,7 @@ lc verify                 # check provenance integrity
 
 ### Keep astra.yaml and code in sync
 
-`astra.yaml` and the code must never diverge. When you change one, update the other in the same edit and run `astra validate astra.yaml`. See `lightcone-cli-reference.md` → "Spec-Code Invariant" for the full rules.
+`astra.yaml` and the code must never diverge. When you change one, update the other in the same edit and run `astra validate astra.yaml`.
 
 ---
 
diff --git a/docs/architecture.md b/docs/architecture.md
index 6122f3d1..d1858130 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -272,7 +272,7 @@ warnings.
 | `.lightcone/Snakefile` | Project (generated) | Auto-generated by `lc run`. Don't edit. |
 | `.lightcone/snakefile-config.json` | Project (generated) | Per-`(rule, universe)` config. |
 | `.lightcone/lightcone.yaml` | Project | Tiny scratchpad — currently writes only `target: local`. Not consumed by today's code. |
-| `~/.lightcone/config.yaml` | User | `container.runtime` (and historically `extraction_model`). |
+| `~/.lightcone/config.yaml` | User | `container.runtime`. |
 | `.claude/settings.json` | Project | Claude Code permissions. |
 
 The `dagster.yaml` and `~/.lightcone/targets/*.yaml` files referenced in
diff --git a/docs/cli/init.md b/docs/cli/init.md
index 4db4979f..045d7a50 100644
--- a/docs/cli/init.md
+++ b/docs/cli/init.md
@@ -23,7 +23,7 @@ CLAUDE.md                     # short note pointing future agents at the project
 results/                      # placeholder; populated by `lc run`
 universes/                    # placeholder; populate via `astra universe generate -n …`
 .claude/                      # bundled Claude Code plugin
-  skills/, agents/, hooks/, scripts/, guides/, templates/
+  skills/, agents/, hooks/, scripts/, templates/
   settings.json               # the chosen permission tier
 .venv/                        # Python venv (skipped with --no-venv)
 ```
@@ -41,7 +41,7 @@ universes/                    # placeholder; populate via `astra universe genera
 > The historical `--target`, `--existing-project`, and `--sub-analysis`
 > flags have been removed; today's `lc init` only knows the three flags
 > above. For migrating an existing project, run `lc init` in a fresh
-> directory and use the `/lc-migrate` skill from inside Claude Code.
+> directory and use the `/lc-from-code` skill from inside Claude Code.
 
 ## Permission tiers
 
@@ -70,7 +70,7 @@ lc init . --permissions yolo           # for autonomous loops you trust
 cd my-analysis
 claude           # open Claude Code
 # Inside Claude Code:
-/lc-new          # scope a research question into astra.yaml
-/lc-build        # implement and run it
-/lc-verify       # audit the result
+/lc-new  # scope a research question into astra.yaml
+# Then ask the agent to implement the spec.
+# It will run lc run, watch lc status, then validate and verify.
 ```
diff --git a/docs/index.md b/docs/index.md
index 0d3521d9..683a1302 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -72,11 +72,12 @@ src/lightcone/                  # PEP 420 namespace package — NO __init__.py
 src/snakemake_executor_plugin_dask/   # Snakemake executor → dask.distributed
 
 claude/lightcone/               # Claude Code plugin (force-included into the wheel)
-├── skills/                     # lc-new, lc-build, lc-verify, lc-migrate, lc-feedback
+├── skills/                     # lc-new, lc-from-code, lc-from-paper,
+│                                # lc-feedback, ralph (+ bundle siblings);
+│                                # reference skills: astra, lc-cli
 ├── agents/                     # lc-extractor (literature subagent)
-├── guides/                     # astra-reference, lightcone-cli-reference, ui-brand
 ├── templates/                  # project CLAUDE.md template
-└── scripts/                    # session hooks (bash): venv, validate-on-save, …
+└── scripts/                    # session hooks (bash): venv, validate-on-save, session-start primer
 
 tests/                          # pytest, mirrors src/
 pyproject.toml                  # hatchling + hatch-vcs; ASTRA + Snakemake as deps
@@ -145,5 +146,5 @@ just docs-serve     # live docs preview
 - [Architecture](architecture.md) — the full execution and integrity story
 - [CLI Reference](cli/index.md) — every command currently shipped
 - [Python API](api/index.md) — the engine modules
-- [Skills](skills/index.md) — what each `/lc-*` skill is supposed to do
+- [Skills](skills/index.md) — what each `/lc-*` skill does (including the `/lc-from-*` family)
 - [Contributing](contributing/setup.md) — getting the dev loop running
diff --git a/docs/skills/authoring.md b/docs/skills/authoring.md
index 1f4ecade..5fbfbf72 100644
--- a/docs/skills/authoring.md
+++ b/docs/skills/authoring.md
@@ -38,31 +38,31 @@ argument-hint: "[OPTIONAL ARG] [--flag VALUE]"
 
 ## Body conventions
 
-Follow [`claude/lightcone/guides/ui-brand.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/guides/ui-brand.md):
-
 - `##` for phase headings; lead with a "Stage banner" line that the
   skill prints to the chat.
-- `✓ / ○ / ✗` for status; never emojis except inside the agent's own
-  branded output.
-- Action prompts in bold sentences (`> "What are you trying to learn?"`).
+- `✓ / ○ / ✗` for status. Skip emoji elsewhere — they belong only
+  inside the agent's own branded banner output.
+- Action prompts in blockquotes (`> "What are you trying to learn?"`).
 - A `## Restrictions` (or `## Hard rules`) section at the end listing
   invariants Claude must not break.
 
-## Referencing guide files
+## Referencing reference skills
 
-Guides live alongside the skills:
+Spec and CLI reference content live in their own skills — `/astra` and
+`/lc-cli` — so any skill needing depth can invoke them directly:
 
 ```markdown
-Before starting, read `.claude/guides/astra-reference.md` for the
-spec, and `.claude/guides/lightcone-cli-reference.md` for the CLI.
+Invoke `/astra` and read the Decisions section before classifying
+candidate decisions, and `/lc-cli` for the Spec-Code Invariant rules.
 ```
 
-The plugin layout means these paths are stable across both bundled
-(installed-package) and dev (in-repo) modes.
+Both are named in the session-start primer so they're discoverable
+from the first turn; explicit invocation in a skill body is the right
+call when a specific section is load-bearing for that skill's work.
 
 ## Spawning subagents
 
-Use `Task` with `subagent_type` to delegate work. The
+Use `Agent` with `subagent_type` to delegate work. The
 `lc-extractor` subagent in `agents/` is the canonical example:
 
 ```python
@@ -79,9 +79,9 @@ Spawn agents in parallel by issuing them in a single tool-use block.
 
 The `evals/` tree has fixtures (currently `evals/tasks/snae/`) and the
 runner lives at `lightcone.eval.harness`. Eval CLI commands are defined
-in `lightcone.eval.cli` (`lc eval run|report|compare`), but **note that
-this group is currently not wired into the top-level `lc` CLI** — see
-the [maintainer summary](../index.md) for status. To run evals
+in `lightcone.eval.cli` and registered as `lc eval run|report|compare`
+when the optional `eval` extra is installed (the registration is
+gated on `ImportError` in `lightcone.cli.commands`). To run evals
 programmatically:
 
 ```python
@@ -92,19 +92,7 @@ from lightcone.eval.cli import run_cmd
 
 ## Installing changes into an existing project
 
-`lc init` copies the plugin once. To pull updated skills into an
-existing project after editing them:
-
-```bash
-python - <<'PY'
-import shutil
-from pathlib import Path
-from lightcone.cli.plugin import get_plugin_source_dir
-src = get_plugin_source_dir()
-dst = Path(".claude/skills")
-if dst.exists(): shutil.rmtree(dst)
-shutil.copytree(src / "skills", dst)
-PY
-```
-
-(See [`lc update`](../cli/update.md) for the longer story.)
+`lc init` copies the plugin once and refuses to run a second time on
+the same directory. See [Updating an existing project](../cli/update.md)
+for the Python heredoc that resyncs all the plugin subdirs (`skills`,
+`agents`, `scripts`, `templates`) into an existing project.
diff --git a/docs/skills/check-sentence-by-sentence.md b/docs/skills/check-sentence-by-sentence.md
new file mode 100644
index 00000000..4df74af0
--- /dev/null
+++ b/docs/skills/check-sentence-by-sentence.md
@@ -0,0 +1,117 @@
+# /check-sentence-by-sentence
+
+Sentence-by-sentence audit of a paper against an ASTRA project's code.
+For every claim about implementation or results in the methodology,
+results, discussion, and appendices, locate the corresponding code
+(`file:line`) or mark `NOT FOUND`. The agent does **not** run any
+code — this is a static reading audit.
+
+Source: [`claude/lightcone/skills/check-sentence-by-sentence/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/check-sentence-by-sentence/SKILL.md).
+
+Argument hint: `[path to paper source, e.g. work/reference/source/main.tex or work/reference/document.md]`.
+
+## Allowed tools
+
+```
+Read, Glob, Grep,
+Bash(ls:*), Bash(wc:*), Bash(grep:*), Bash(find:*),
+AskUserQuestion, Agent
+```
+
+Read-only over both the paper source and the project code. No
+execution.
+
+## Setup
+
+1. **Confirm project root.** `astra.yaml` in cwd, or ask the user to
+   `cd` to the ASTRA project.
+2. **Confirm paper source.** Resolve in order:
+   - A `.tex` argument → `tex` mode.
+   - A directory argument → look for `<dir>/source/` (TeX), then
+     `<dir>/document.md` (markdown).
+   - No argument → prefer the lc-from-paper layout:
+     `work/reference/source/<main>.tex` (Path A) or
+     `work/reference/document.md` (Path B, Docling/Pandoc fallback).
+   - Legacy `.tex` locations in cwd as a last resort.
+
+Don't audit PDFs directly — if only `work/reference/paper.pdf` exists,
+ask the user to run paper extraction first.
+
+## Section enumeration
+
+The main agent walks the source carefully to enumerate sections.
+
+- **`tex` mode** — build an ordered audit source list by following
+  local `\input{...}` / `\include{...}` from the main TeX file (one
+  level deep). For each file, `grep -n` for `^\\section`,
+  `^\\subsection`, and `^\\appendix`. Many arXiv papers keep prose
+  outside the main wrapper, so the included files carry most audit
+  units.
+- **`markdown` mode** — `grep -n` for `^#`, `^##`, etc. in
+  `document.md`. Heading depth maps to TeX section/subsection.
+
+Audit-relevant sections: methodology, results, discussion,
+appendices. Skip abstract, introduction, acknowledgements,
+references, author lists.
+
+Each leaf (sub)section becomes one sub-agent. A section with
+subsections spawns one sub-agent per subsection, plus optionally one
+more for any pre-subsection prose span. Issue them in a single
+tool-use block so they run in parallel.
+
+## Per-sub-agent output
+
+Each sub-agent reads its assigned line range, splits into sentences,
+keeps the claim-bearing ones, and returns:
+
+```
+[
+  {"quote": "...", "location": "scripts/foo.py:142", "note": "..."},
+  {"quote": "...", "location": "NOT FOUND", "note": "..."},
+  ...
+]
+```
+
+`note` is optional, under 10 words, used for nuance like "approximate
+match", "different constant", "value computed at runtime".
+
+## Aggregation: two filtering passes
+
+Sub-agents are deliberately generous about what they keep. The main
+agent then:
+
+1. **Drops non-computational sentences** — framing / motivation
+   ("the first step is..."), pure prose that doesn't correspond to
+   anything you'd expect in code.
+2. **Merges duplicates** — when the same claim is asserted in multiple
+   places, collapse to a single entry pointing at the canonical
+   location.
+
+The final report follows paper order: methodology → results →
+discussion → appendices, with each entry's `quote`, `location`, and `note`.
+
+## Hard rules
+
+- **No execution.** Numerical results can be located at the line that
+  computes them, but agreement isn't verifiable here. Use a note like
+  "value computed at runtime".
+- **Quote verbatim.** Trim to one sentence; long sentences may keep
+  just the claim-bearing clause.
+- **`file:line` is specific.** The function call, parameter assignment,
+  or computed value — not just a file.
+- **Read only the assigned line range.** Each sub-agent stays inside
+  its window.
+
+## When to invoke
+
+- From `/lc-from-paper`'s REVIEW close-out (opt-in).
+- Standalone, any time, to spot-check fidelity claim by claim.
+
+## Related
+
+- [`/lc-from-paper`](lc-from-paper.md) — invokes
+  `/check-sentence-by-sentence` during REVIEW (opt-in).
+- [`/figure-comparison`](figure-comparison.md) — the other REVIEW
+  close-out, artifact-vs-artifact rather than paper-vs-code.
+- [`/paper-extraction`](paper-extraction.md) — produces the paper
+  substrate this skill reads.
diff --git a/docs/skills/figure-comparison.md b/docs/skills/figure-comparison.md
new file mode 100644
index 00000000..3b2ad9df
--- /dev/null
+++ b/docs/skills/figure-comparison.md
@@ -0,0 +1,87 @@
+# /figure-comparison
+
+Build a self-contained HTML report (`.lightcone/comparison.html`) that
+places paper reference artifacts on the left and reproduced artifacts
+on the right, with red flags wherever a counterpart is missing. Images
+are base64-embedded so the HTML is portable. Run from a project folder
+containing `astra.yaml`.
+
+Source: [`claude/lightcone/skills/figure-comparison/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/figure-comparison/SKILL.md).
+
+Argument hint: `[path to paper reference dir, e.g. work/reference/]`.
+
+## Allowed tools
+
+```
+Read, Write, Glob, Grep,
+Bash(ls:*), Bash(wc:*), Bash(grep:*), Bash(find:*), Bash(file:*),
+Bash(python3:*), Bash(python:*), Bash(base64:*),
+AskUserQuestion, Agent
+```
+
+Read-only over the build artifacts. The skill never invokes the
+pipeline itself — if `results/<universe>/` is empty, it tells the user
+to run `lc run` first and stops.
+
+## Setup
+
+1. **Confirm project root.** Reads `astra.yaml` in the cwd. If missing,
+   asks the user to `cd` to the ASTRA project.
+2. **Confirm results exist.** Default universe is `baseline`, unless
+   `comparison-report.yaml` names another universe or the user
+   supplied one. Checks `ls results/<universe>/`.
+3. **Locate the paper reference substrate.** In order: a path passed as
+   an argument, then `work/reference/` from lc-from-paper's layout
+   (`source/` for arXiv TeX, `document.md` for the Docling fallback,
+   plus extracted `figures/` and `tables/`). Legacy locations are
+   tried only after lc-from-paper paths fail.
+
+## Scope resolution
+
+The skill picks its target set in priority order:
+
+1. **`comparison-report.yaml`** — the highest-priority scope when
+   lc-from-paper has run COMPARE. Records exactly what to compare,
+   including `type`, `priority`, paper/reproduced values, file paths,
+   and match status.
+2. **`targets/targets.md`** — the SPECIFY-phase scope ledger, used
+   when COMPARE hasn't run yet.
+3. **Default paper-driven flow** — when neither scope file exists,
+   builds a best-effort report from `astra.yaml`'s narrative and
+   findings plus `work/reference/`.
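+
+For orientation, an entry in `comparison-report.yaml` might look like the sketch below. Only the kinds of information (type, priority, paper and reproduced values, file paths, match status) come from the description above; every field name, ID, path, and value here is a hypothetical placeholder.
+
+```yaml
+# Hypothetical sketch; the real file layout may differ.
+universe: baseline
+targets:
+  - id: fig_example                  # hypothetical target ID
+    type: figure
+    priority: high
+    paper_path: work/reference/figures/fig_example.png
+    reproduced_path: results/baseline/fig_example.png
+    match: missing                   # e.g. no reproduced counterpart yet
+  - id: num_example                  # hypothetical target ID
+    type: numeric
+    priority: medium
+    paper_value: 1.0                 # placeholder, not a real result
+    reproduced_value: 1.0            # placeholder, not a real result
+    match: matched
+```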
+
+## Output
+
+A single `.lightcone/comparison.html` with paper artifacts on the left
+and reproduced artifacts on the right. Helper scripts and intermediate
+manifests also live under `.lightcone/` so they don't pollute the
+baseline results.
+
+The HTML embeds figure images as base64 — paste it into email, drop
+it on a shared drive, or send it through Slack without breaking links.
+
+## When to invoke
+
+- From `/lc-from-paper`'s REVIEW close-out (mandatory).
+- Standalone, any time after `lc run` succeeds, to see how the
+  reproduction stacks up against the paper.
+
+## Hard rules
+
+- **Read-only over build artifacts.** Never run the pipeline; if
+  outputs are missing, stop and ask the user to build first.
+- **Don't compare directly against a whole PDF.** When only
+  `work/reference/paper.pdf` exists, ask the user to run paper
+  extraction first.
+- **Preserve scope ordering.** `comparison-report.yaml` takes precedence
+  over `targets/targets.md`, which takes precedence over the default flow.
+
+## Related
+
+- [`/lc-from-paper`](lc-from-paper.md) — invokes `/figure-comparison`
+  during REVIEW (mandatory).
+- [`/paper-extraction`](paper-extraction.md) — produces the
+  `work/reference/` substrate this skill reads.
+- [`/check-sentence-by-sentence`](check-sentence-by-sentence.md) —
+  the other REVIEW close-out, paper-vs-code rather than
+  artifact-vs-artifact.
diff --git a/docs/skills/index.md b/docs/skills/index.md
index 5c933101..e1c352b7 100644
--- a/docs/skills/index.md
+++ b/docs/skills/index.md
@@ -1,22 +1,57 @@
 # Skills
 
 Skills are Claude Code slash commands bundled in the lightcone-cli
-plugin. They give the agent a structured, phase-by-phase workflow for
-the most common research operations.
+plugin. Each shapes the agent's workflow around a recurring research
+operation: scoping an analysis, wrapping existing code, reproducing
+a paper.
 
-If you're a researcher trying to *use* these, the
-[Claude Code Workflow](../user/agent-workflow.md) page in the user
-guide is the friendly version. This page is for maintainers.
+If you want to *use* these, start with
+[The Agentic Workflow](../user/agent-workflow.md) in the user guide.
+This page is for maintainers.
 
 ## Available skills
 
+The entry-point skills are parallel in what you start from: a question
+(`/lc-new`), code (`/lc-from-code`), or a paper (`/lc-from-paper`).
+`/lc-from-paper` is the entry point of a six-skill paper-reproduction
+bundle; the five siblings stand alone and are user-invokable directly.
+
+### Project lifecycle
+
 | Skill | Command | Purpose |
 |-------|---------|---------|
 | [lc-new](lc-new.md) | `/lc-new` | Scope a research question into an `astra.yaml`, with optional literature extraction. |
-| [lc-build](lc-build.md) | `/lc-build` | Plan + autonomous loop until all outputs in a universe materialize. |
-| [lc-verify](lc-verify.md) | `/lc-verify` | Read-only audit: spec validity, materialization status, decision-code alignment, result file shapes. |
-| [lc-migrate](lc-migrate.md) | `/lc-migrate` | Wrap an existing codebase in ASTRA: scan, generate spec, parameterize, run. |
+| [lc-from-code](lc-from-code.md) | `/lc-from-code` | Wrap an existing codebase in ASTRA: scan, generate spec, parameterize, run. |
+| [lc-from-paper](lc-from-paper.md) | `/lc-from-paper` | Reproduce a published paper in ASTRA — ORIENT-first driver that hands off to a ralph loop for the long middle. |
 | [lc-feedback](lc-feedback.md) | `/lc-feedback` | File a GitHub issue against the right Lightcone repo with auto-collected context. |
+| [ralph](ralph.md) | `/ralph` | Author a constitution and run a ralph loop against it. Used by `lc-from-paper` for the long middle; standalone for any other long-running work. |
+
+### Paper-reproduction bundle (sibling skills)
+
+Co-located with `lc-from-paper` so a single `lc init` brings the full
+toolkit. Each stands alone and is user-invokable; `lc-from-paper`
+dispatches them by role during the reproduction.
+
+| Skill | Command | Purpose |
+|-------|---------|---------|
+| [ralph](ralph.md) | `/ralph` | Loop substrate. `lc-from-paper`'s ORIENT invokes ralph's Authoring mode to draft the per-paper constitution; the loop launcher hands off after ORIENT lands; each iteration runs ralph's Loop protocol. Also user-invokable standalone (see the Project lifecycle row above). |
+| [paper-extraction](paper-extraction.md) | `/paper-extraction` | Turn an arXiv ID or DOI into a standardized `work/reference/` directory: substrate, figures, tables, citations (with resolved DOIs), and a stub `astra.yaml`. |
+| [narrative](narrative.md) | `/narrative` | Author the `narrative:` prose and decision `rationale:` against an existing `astra.yaml`, in paper-reproduction, retrofit, or co-drafting mode. |
+| [figure-comparison](figure-comparison.md) | `/figure-comparison` | Build a self-contained HTML side-by-side: paper figures, tables, and numerics vs reproduced artifacts. |
+| [check-sentence-by-sentence](check-sentence-by-sentence.md) | `/check-sentence-by-sentence` | Static audit of paper claims against code locations (`file:line` or `NOT FOUND`). |
+
+See the [bundle README](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/README.md) for the rationale behind co-location vs plugin install.
+
+### Reference skills (auto-primed via session-start)
+
+Not entry points. Other skills invoke them — or Claude does, when a deeper reference would help — to load reference content into the working session. The session-start hook names both in its primer, so Claude knows they exist from the first turn.
+
+| Skill | Command | Purpose |
+|-------|---------|---------|
+| `astra` | `/astra` | Reference for the `astra.yaml` spec: structure, decisions, options, prior insights, findings, evidence, sub-analyses, narrative anchors, composition mechanics. |
+| `lc-cli` | `/lc-cli` | Reference for `lc` workflow: commands, the Spec-Code Invariant, status interpretation, failure diagnosis, multiverse runs, publishing via WRROC. |
+
+These intentionally stay out of the top-level README. Researchers use the project-lifecycle skills directly; the reference skills are infrastructure.
 
 ## How a skill is wired
 
@@ -25,48 +60,53 @@ YAML frontmatter:
 
 ```yaml
 ---
-name: lc-build
+name: lc-new
 description: >
-  Build an ASTRA analysis from spec to materialized results...
-allowed-tools: Read, Write, Edit, Glob, Grep, Bash(astra:*), Bash(lc:*), ...
-argument-hint: "[DESCRIPTION] [--universe NAME] [--max-iterations N]"
+  Scope a new ASTRA analysis from a research question...
+allowed-tools: Read, Write(astra.yaml), Edit(astra.yaml), Glob, Grep, Bash(astra:*), ...
+argument-hint: "[DESCRIPTION]"
 ---
 ```
 
-The frontmatter configures Claude Code: which tools the skill may
-invoke, and what the slash command's argument hint looks like. The
-body is the prompt — phase definitions, rules, references to guide
-files, anti-patterns. The skill bundles its own helper scripts under
-`scripts/` and its loop prompt template under `assets/` when relevant.
+The frontmatter tells Claude Code which tools the skill may invoke
+and what the slash command's argument hint looks like. The body is the
+prompt itself: phase definitions, rules, pointers to reference files,
+anti-patterns. Skills bundle their own helper scripts under `scripts/`
+and supporting material under `references/` or `templates/` when relevant.
 
 ## Plugin layout
 
 ```text
 claude/lightcone/
 ├── skills/
-│   ├── lc-new/SKILL.md
-│   ├── lc-build/{SKILL.md, assets/loop-prompt.md, scripts/setup-lc-build.sh}
-│   ├── lc-verify/SKILL.md
-│   ├── lc-migrate/SKILL.md
-│   └── lc-feedback/SKILL.md
-├── agents/lc-extractor.md             # subagent definition
-├── guides/                            # reference docs loaded by skills
+│   ├── lc-new/{SKILL.md, references/*.md}
+│   ├── lc-from-code/SKILL.md
+│   ├── lc-from-paper/{SKILL.md, references/*.md, templates/{constitution.md, CLAUDE.md}}
+│   ├── lc-feedback/SKILL.md
+│   ├── ralph/{SKILL.md, references/*.md, scripts/ralph}
+│   ├── paper-extraction/{SKILL.md, scripts/*.py}
+│   ├── narrative/{SKILL.md, references/*.md}
+│   ├── figure-comparison/{SKILL.md, scripts/*.py}
+│   ├── check-sentence-by-sentence/SKILL.md
+│   ├── astra/SKILL.md                  # reference: astra.yaml spec
+│   └── lc-cli/SKILL.md                 # reference: lc workflow
+├── agents/lc-extractor.md             # literature subagent for /lc-new
 ├── templates/CLAUDE.md                # the project CLAUDE.md template
-└── scripts/*.sh                       # session lifecycle hooks
+└── scripts/*.sh                       # session lifecycle hooks (incl. session-start primer)
 ```
 
 The plugin is force-included into the wheel via
 `pyproject.toml::tool.hatch.build.targets.wheel.force-include`, so
 `lc init` finds it whether you're running from source or PyPI.
 
-## Reference guides loaded by skills
+## Other plugin files
+
+The two reference *skills* (`/astra` and `/lc-cli`) live under `skills/` and are listed in the [Reference skills](#reference-skills-auto-primed-via-session-start) section above. Remaining plugin files:
 
 | File | Purpose |
 |------|---------|
-| `claude/lightcone/guides/astra-reference.md` | Full `astra.yaml` schema reference. Loaded by `lc-new`, `lc-build`, `lc-migrate`. |
-| `claude/lightcone/guides/lightcone-cli-reference.md` | CLI commands, status interpretation, failure diagnosis. Loaded by build/verify skills. |
-| `claude/lightcone/guides/ui-brand.md` | Visual formatting conventions for skill output. |
 | `claude/lightcone/agents/lc-extractor.md` | Literature extraction subagent invoked by `/lc-new`. |
+| `claude/lightcone/scripts/session-start.sh` | Session-start hook — surfaces validation + materialization status and primes Claude with the substrate CLIs and reference skill names. |
 
 ## Authoring a new skill
 
diff --git a/docs/skills/lc-build.md b/docs/skills/lc-build.md
deleted file mode 100644
index 53c36526..00000000
--- a/docs/skills/lc-build.md
+++ /dev/null
@@ -1,83 +0,0 @@
-# /lc-build
-
-Build an ASTRA analysis from spec to materialized results. Plans
-interactively, then loops autonomously via the ralph-wiggum stop hook
-until all outputs are materialized or `--max-iterations` is reached.
-
-Source: [`claude/lightcone/skills/lc-build/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/lc-build/SKILL.md).
-
-Argument hint: `[DESCRIPTION] [--universe NAME] [--max-iterations N]`.
-Defaults: universe `baseline`, max-iterations `25`.
-
-## Allowed tools
-
-```text
-Read, Write, Edit, Glob, Grep,
-Bash(astra:*), Bash(lc:*), Bash(python:*), Bash(git:*), Bash(pip:*), Bash(mkdir:*),
-Bash(setup-lc-build:*),
-Agent, AskUserQuestion
-```
-
-## Phases
-
-### Phase 0 — Resume an interrupted loop
-
-If `.claude/ralph-loop.local.md` exists, ask the user via
-`AskUserQuestion` whether to resume or start fresh. Resume runs
-`setup-lc-build.sh --resume`; fresh deletes the state file.
-
-### Phase 1 — Plan (interactive)
-
-1. **Validate prerequisites** via `setup-lc-build.sh --validate
-   --universe <U> --max-iterations <N>`. Bails out with actionable
-   error messages if `astra.yaml`, the universe file, or required
-   tools are missing.
-2. **Read context** — `astra.yaml`, `CLAUDE.md`,
-   `.claude/guides/astra-reference.md`,
-   `.claude/guides/lightcone-cli-reference.md`,
-   `universes/<U>.yaml`, any existing `scripts/`.
-3. **Produce a plan** at `.lightcone/plans/build-plan-<U>.md` with:
-   analysis overview; dependency graph; decision selections; ordered
-   build checklist with per-output script / decisions / dependencies /
-   estimated cost; verification checklist.
-4. **Get approval** via `AskUserQuestion`: "Approve and start building"
-   vs "Let me edit the plan first."
-
-**Rule:** Phase 1 is read-only exploration. No code, no spec edits
-until the user approves.
-
-### Phase 2 — Loop (autonomous)
-
-Once approved, `setup-lc-build.sh --activate` writes
-`.claude/ralph-loop.local.md`. The Claude Code stop hook intercepts
-session exits and re-injects the loop prompt
-([`assets/loop-prompt.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/lc-build/assets/loop-prompt.md))
-until the agent emits `<promise>BUILD_COMPLETE</promise>` or
-max-iterations is hit.
-
-Each iteration: survey state, decide what to do next, work, commit,
-exit. The plan file persists across crashes for easy resumption and
-is deleted on successful completion.
-
-## State files
-
-| File | Purpose |
-|------|---------|
-| `.lightcone/plans/build-plan-<universe>.md` | The user-approved plan. Persists across crashes. Deleted on completion. |
-| `.claude/ralph-loop.local.md` | Loop state: iteration count, max iterations, session id, universe. Used by the session-start hook to detect interruptions. |
-
-## Cancellation
-
-Mid-loop: `/cancel-ralph` (provided by the ralph-loop plugin).
-
-## Dependency on the ralph-loop plugin
-
-The loop machinery (the stop hook, `/cancel-ralph`) ships in a
-separate Claude Code plugin. `setup-lc-build.sh` will attempt to
-install it on demand from the marketplace; if installation fails it
-errors out and cleans up.
-
-## Related
-
-- [`/lc-verify`](lc-verify.md) — read-only audit, run after a successful build.
-- [`claude/lightcone/guides/lightcone-cli-reference.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/guides/lightcone-cli-reference.md) — CLI and execution reference loaded by the skill.
diff --git a/docs/skills/lc-feedback.md b/docs/skills/lc-feedback.md
index f1dab913..ab79f8c1 100644
--- a/docs/skills/lc-feedback.md
+++ b/docs/skills/lc-feedback.md
@@ -69,11 +69,11 @@ Sections that don't apply are dropped.
 
 ## Hard rules
 
-- Be fast — minimize back-and-forth, one confirmation then file.
-- Read-only on the project.
-- Trim aggressively — only the relevant portion of errors.
-- No sensitive data — strip absolute paths, credentials, tokens.
-- Don't editorialize — report what happened.
+- **Be fast.** Minimize back-and-forth: one confirmation, then file.
+- **Read-only on the project.**
+- **Trim aggressively.** Only the relevant portion of errors.
+- **No sensitive data.** Strip absolute paths, credentials, tokens.
+- **Don't editorialize.** Report what happened.
 
 ## Notes for the maintainer who's looking
 
diff --git a/docs/skills/lc-from-code.md b/docs/skills/lc-from-code.md
new file mode 100644
index 00000000..f0e2a31f
--- /dev/null
+++ b/docs/skills/lc-from-code.md
@@ -0,0 +1,84 @@
+# /lc-from-code
+
+Import an existing codebase into ASTRA. The skill scans the project,
+drafts `astra.yaml` against what the code already does, parameterizes
+its hardcoded analytical choices, and runs until outputs materialize.
+Existing logic stays intact; the edits are minimal parameter plumbing.
+
+Source: [`claude/lightcone/skills/lc-from-code/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/lc-from-code/SKILL.md).
+
+## Allowed tools
+
+```text
+Read, Write, Edit, Glob, Grep,
+Bash(astra:*), Bash(lc:*), Bash(python:*), Bash(pip:*), Bash(git:*), Bash(mkdir:*), Bash(ls:*),
+Agent, AskUserQuestion
+```
+
+## Phases
+
+### Phase 1 — Scan & spec
+
+The skill spawns an `Explore` subagent (Claude Code's general-purpose
+search agent) with `/astra`'s Decisions criteria inlined into the
+prompt. The subagent returns a structured inventory:
+
+- **Per script/notebook**: file path, what it does, files it reads
+  and writes, hardcoded analytical choices (with `file:line`, current
+  value, what it controls), how it's invoked.
+- **Project-level**: dependency files, data files, any existing
+  container setup.
+
+The main agent keeps only the genuinely analytical choices (most
+hardcoded values are implementation details), drafts `astra.yaml` with
+`recipe:` blocks pointing at the existing scripts, and generates
+`universes/baseline.yaml` with defaults matching the current hardcoded
+values — so the first run reproduces existing behavior. `astra validate
+astra.yaml` then checks the spec, and the user reviews before Phase 2.
+
+### Phase 2 — Implement (parameterize)
+
+The approach depends on the shape of each script:
+
+- **Script with hardcoded values.** Add or extend `argparse`; replace
+  the hardcoded values with parsed args.
+- **Notebook.** Move the `.ipynb` to `notebooks/` (kept as reference)
+  and create a `.py` script that does the parameterized version. The
+  recipe points at the new script.
+- **Config-file-driven project.** Write a thin wrapper that accepts
+  ASTRA decision args, writes the config, and calls the original
+  entry point. The user's config-driven code stays untouched.
+
+Hard conventions enforced by the prompt:
+
+- Decision IDs use underscores (`outlier_sigma`), and lightcone-cli
+  passes them as `--outlier_sigma`. Argument parsing must match.
+- Each output is a *directory*, `results/{universe}/{output_id}/`. The
+  recipe receives `{output}` as that directory; scripts write artifacts
+  inside it (`{output}/data.parquet`).
+- Don't refactor, restructure, or "improve" existing code — parameter
+  plumbing only.
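+
+The first two conventions can be sketched as a parameterized script. This is
+a minimal illustration, not the skill's actual output: the `--output` flag
+name, the default of `3.0`, and the `summary.txt` artifact are assumptions
+made for the example. Only `--outlier_sigma` and the directory-shaped output
+come from the conventions above.
+
+```python
+import argparse
+from pathlib import Path
+
+
+def parse_args(argv=None):
+    parser = argparse.ArgumentParser(description="Hypothetical outlier-filter step")
+    # Decision ID as written in astra.yaml: underscores, so the flag is
+    # --outlier_sigma (not --outlier-sigma).
+    parser.add_argument("--outlier_sigma", type=float, default=3.0)
+    # Assumed flag name for the directory the recipe receives as {output}.
+    parser.add_argument("--output", type=Path, required=True)
+    return parser.parse_args(argv)
+
+
+def main(argv=None):
+    args = parse_args(argv)
+    # {output} is a directory: create it, then write artifacts inside it.
+    args.output.mkdir(parents=True, exist_ok=True)
+    artifact = args.output / "summary.txt"
+    artifact.write_text(f"outlier_sigma={args.outlier_sigma}\n")
+    return artifact
+```
+
+In a real script, `main()` would be called under the usual
+`if __name__ == "__main__":` guard, and the default would be the value that
+was previously hardcoded, so the baseline run reproduces existing behavior.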
+
+### Phase 3 — Run & debug
+
+Run `lc run --universe baseline`, then iterate fixes until `lc status`
+shows every output `ok`. If the scan surfaced existing results
+elsewhere in the project, compare them against the new
+`results/baseline/<output_id>/` to confirm the migration preserved
+behavior. Re-validate with `astra validate astra.yaml` and present
+the summary.
+
+## Hard rules
+
+- **Minimal changes.** No refactor, no rename, no reorganize.
+- **Never guess.** Read every script before claiming what it does.
+- **Filter decisions aggressively.** Most hardcoded values are
+  implementation details, not decisions.
+- **Preserve behavior.** The baseline universe, with default values,
+  must reproduce the original exactly.
+
+## Related
+
+- [`/lc-new`](lc-new.md) — for greenfield analyses.
+- After migration, run `lc verify` to confirm the spec is valid and
+  the provenance chain is intact.
diff --git a/docs/skills/lc-from-paper.md b/docs/skills/lc-from-paper.md
new file mode 100644
index 00000000..89f3c53f
--- /dev/null
+++ b/docs/skills/lc-from-paper.md
@@ -0,0 +1,173 @@
+# /lc-from-paper
+
+Reproduce a published scientific paper as a complete ASTRA project. The
+skill is **ORIENT-first** and **ralph-driven**. ORIENT runs in the
+user's main session — figuring out what the user wants, standing up the
+paper and code substrate, and drafting the per-paper constitution. A
+ralph loop then carries the long middle — ARCHITECT → SPECIFY →
+LITERATURE → IMPLEMENT → RUN → COMPARE — across many iterations against
+the same constitution. REVIEW returns to the user's main session once
+the loop closes.
+
+`/lc-from-paper` is the entry point of the paper-reproduction bundle.
+Sibling skills ([`ralph`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/ralph/SKILL.md)
+for the loop, [`paper-extraction`](paper-extraction.md),
+[`narrative`](narrative.md), [`figure-comparison`](figure-comparison.md),
+[`check-sentence-by-sentence`](check-sentence-by-sentence.md)) live in
+the same plugin and are invoked by role across the phases.
+
+Source: [`claude/lightcone/skills/lc-from-paper/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/lc-from-paper/SKILL.md).
+
+## Architecture
+
+Two pieces.
+
+1. **Interactive bookends in the user's main session.** ORIENT and
+   REVIEW are conversations with the user. ORIENT runs in stages —
+   ask for the paper, run `/paper-extraction` inline, interview
+   (grounded in the paper), clone the code and run `/lc-from-code`
+   scan-only (if a repo exists), optionally follow up, then draft
+   `constitution.md` + `CLAUDE.md` from the full paper-plus-code
+   context for user review.
+
+2. **A ralph loop for the long middle.** Once ORIENT lands —
+   `constitution.md` drafted, paper and code substrate on disk —
+   `/lc-from-paper` launches a ralph loop against the constitution.
+   Each iteration starts a fresh tmux-detached Claude session with
+   the constitution loaded into its system prompt, surveys the
+   workdir, picks the next valuable move (typically one phase's
+   worth of work), does it, commits, and exits. Iteration N+1 reads
+   N's work cold, so per-phase review collapses into "the next
+   iteration is the review." Parallel fan-out (LITERATURE Haiku
+   quote-finders, SPECIFY per-sub-analysis work, IMPLEMENT per-output
+   work) happens *inside* an iteration, one level deep from the
+   iteration's main session.
+
+## Phases
+
+Eight phases, zero-indexed. ORIENT + REVIEW run in the user's main
+session; phases 1–6 run as ralph iterations.
+
+| # | Phase | Where | Primary outputs |
+|---|-------|-------|------------------|
+| 0 | ORIENT | user's main session | per-paper `constitution.md` + `CLAUDE.md` + paper substrate at `work/reference/{paper.pdf, source/ or document.md, figures/, tables/, index.json, astra.yaml}` (from inline `/paper-extraction`) + code substrate at `work/reference/{code/, code-status.yaml, code-index.md}` (from inline `/lc-from-code` scan-only, when a repo exists) |
+| 1 | ARCHITECT | ralph iteration | stub `astra.yaml` (sub-analyses, inputs, outputs, narrative) |
+| 2 | SPECIFY | ralph iteration | filled `astra.yaml` (`decisions:`, `findings:`, `prior_insights:` placeholders, anchored narrative); `targets/targets.md`; `implementation-notes.md`; `universes/baseline.yaml` |
+| 3 | LITERATURE | ralph iteration | `prior_insights:` Evidence entries each carry resolved `quote:` + `location:` selectors; per-paper PDFs cached via `astra paper add` |
+| 4 | IMPLEMENT | ralph iteration | `scripts/`, `requirements.txt`, recipes in `astra.yaml` |
+| 5 | RUN | ralph iteration | `results/<universe>/<output>/` |
+| 6 | COMPARE | ralph iteration | `comparison-report.{yaml,md}` plus an opportunity assessment graded against the user's fidelity intent |
+| 7 | REVIEW | user's main session | `REPRODUCTION-SUMMARY.md`, `/figure-comparison` HTML, resolved `open-questions.md`, finalized reproduction outcome |
+
+## ORIENT stages
+
+ORIENT is one phase executed in seven stages, each grounded in what
+the earlier stages produced:
+
+1. **Ask for the paper** in prose (the answer is free-form: arXiv ID,
+   DOI, or PDF path). No `AskUserQuestion` here — it's the wrong
+   shape for a free-form string.
+2. **Run `/paper-extraction <id>` inline** and read the substrate
+   it produced: `index.json`, abstract, conclusions, data/code
+   availability, acknowledgements. This grounds every subsequent
+   question.
+3. **Interview the user** with `AskUserQuestion` for scope, fidelity
+   intent, code repo confirmation, paper-specific conventions, prior
+   familiarity, and external context — each question referencing the
+   paper's actual figures, claims, and structure.
+4. **Clone the reference code and run `/lc-from-code` scan-only**
+   (skip cleanly when no public code repo exists). The scan produces
+   `code-index.md` — the iterations' code surface.
+5. **Optional follow-up questions** if the code-index surfaced
+   something that affects scope or constitution shape. Usually
+   skipped.
+6. **Draft `constitution.md` + `CLAUDE.md`** — both files now
+   informed by paper *and* code substrate. The constitution's Scope
+   and sub-analysis decomposition can lean on the actual pipeline.
+7. **User reviews drafts → refine → single first commit (constitution
+   + CLAUDE + paper substrate + code substrate) → launch the ralph
+   loop.**
+
+## Per-paper substrate: constitution + CLAUDE.md
+
+ORIENT drafts two files in the reproduction workdir; every iteration
+picks them up on launch.
+
+- **`constitution.md`** — the ralph loop's driving document, *task-bound*.
+  YAML frontmatter declares `status: active`. The body's sections are
+  Goal (carrying the **fidelity intent**: the user's own "what do you
+  want out of this stretch, given what you have to spend on it"),
+  Scope (in/out), Quality bar, Evidence (paper DOI, arXiv ID, code
+  repo URL), and Open dimensions (decisions worth user ratification,
+  updated each iteration). The body sharpens slowly, and the file is
+  archivable once the reproduction closes.
+- **`CLAUDE.md`** — the auto-loading walk-up, *durable*. Paper identity
+  at the top; Rules (code-as-canonical, no blocking on `AskUserQuestion`
+  mid-iteration, arXiv-LaTeX-first, `astra validate --verify-evidence`
+  as the fidelity gate); Disagreements log (running); Open opportunities
+  (gaps that future work could tighten); Pointers. Stays useful for any
+  follow-on work in this directory.
+
+Both files hold pointers to the substrate, not snapshots of it.
+
+## Disciplines
+
+- **Workdir is the state.** File existence, `git log`, and `astra
+  validate` answer "what phase am I on" deterministically — no
+  separate state machine.
+- **Constitution is task-bound; CLAUDE.md is durable.** The constitution
+  carries what *this reproduction* is trying to achieve and how it's
+  progressing — archivable once the reproduction closes. CLAUDE.md carries
+  what stays useful past the reproduction: paper identity, rules,
+  paper-vs-code disagreements, pointers to substrate. Keep both current
+  so the next cold survey reads them as fact.
+- **Code-as-canonical, with disagreements recorded.** Where paper
+  and code disagree on something material, code wins for numerics,
+  but the disagreement is preserved as a decision option and noted
+  in CLAUDE.md.
+- **Rigor is a trajectory toward the user's intent.** Fidelity
+  intent is partly aesthetic ("how good does this need to be?") and
+  partly pragmatic ("what's feasible given the compute, tokens, and
+  wall-clock available?"). The honest meta-conversation lives in
+  ORIENT. There's no explicit review state machine: every iteration
+  reads the most recent artifact critically as part of its survey,
+  then fixes what needs fixing or advances if nothing does. Because
+  each iteration starts with fresh context, the next iteration is the
+  review. Where the intent asks for more than the loop has time to
+  deliver, the gap becomes an Open opportunity in CLAUDE.md for a
+  future loop.
+- **arXiv LaTeX first.** PDF + Docling is the non-arXiv fallback only.
+- **No synthetic data.** Unless the paper itself uses synthetic data,
+  every input must be real.
+- **Open questions for autonomous iteration.** Iterations run detached
+  in tmux, so `AskUserQuestion` isn't available. Questions go to
+  `open-questions.md` with the iteration's best-judgment default
+  applied; the user resolves them at REVIEW close-out.
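+
+The "workdir is the state" discipline can be illustrated with a plain
+file-existence check. This is a rough sketch under stated assumptions: the
+marker paths are taken from the Primary outputs column of the phase table and
+are assumed to sit at the workdir root, and `last_completed_phase` is a
+hypothetical helper, not part of the skill. The skill itself also consults
+`git log` and `astra validate`, which this sketch ignores.
+
+```python
+from pathlib import Path
+
+# (phase name, marker path relative to the workdir), in pipeline order.
+# LITERATURE is omitted: its outputs live inside astra.yaml rather than in
+# a separate file, so file existence alone cannot detect it.
+PHASE_MARKERS = [
+    ("ORIENT", "constitution.md"),
+    ("ARCHITECT", "astra.yaml"),
+    ("SPECIFY", "universes/baseline.yaml"),
+    ("IMPLEMENT", "scripts"),
+    ("RUN", "results"),
+    ("COMPARE", "comparison-report.yaml"),
+    ("REVIEW", "REPRODUCTION-SUMMARY.md"),
+]
+
+
+def last_completed_phase(workdir):
+    """Return the latest phase whose marker exists, or None for a fresh workdir."""
+    root = Path(workdir)
+    completed = None
+    for phase, marker in PHASE_MARKERS:
+        if not (root / marker).exists():
+            break  # stop at the first missing marker
+        completed = phase
+    return completed
+```
+
+Stopping at the first missing marker keeps the answer conservative: a stray
+`results/` directory does not count as RUN if the spec files before it are
+absent.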
+
+## Anti-patterns
+
+- Doing the long middle in the user's main session instead of launching
+  the loop. ORIENT and REVIEW belong in the main session; ARCHITECT
+  through COMPARE belong in iterations.
+- Asking an iteration to use `AskUserQuestion` — iterations are
+  detached.
+- Re-implementing what `astra` already does (`astra validate`, `astra
+  paper add`).
+- Bundling phases into one iteration — defeats fresh-context review.
+- Accreting amendment sections in `constitution.md` — reshape, don't
+  append.
+
+## Related
+
+- [Bundle README](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/README.md)
+  — why the bundle is co-located rather than a separate plugin install.
+- [`/ralph`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/ralph/SKILL.md)
+  — the loop substrate (authoring + launching + iterating).
+- [`/paper-extraction`](paper-extraction.md) — ORIENT Stage 2's
+  acquisition path; also invoked per cited paper by LITERATURE.
+- [`/narrative`](narrative.md) — ARCHITECT's structural narrative and
+  SPECIFY's anchored content narrative.
+- [`/figure-comparison`](figure-comparison.md) — REVIEW (mandatory) and
+  also user-invokable.
+- [`/check-sentence-by-sentence`](check-sentence-by-sentence.md) —
+  REVIEW (opt-in) and also user-invokable.
diff --git a/docs/skills/lc-migrate.md b/docs/skills/lc-migrate.md
deleted file mode 100644
index 230fcc07..00000000
--- a/docs/skills/lc-migrate.md
+++ /dev/null
@@ -1,85 +0,0 @@
-# /lc-migrate
-
-Migrate an existing project into ASTRA / lightcone-cli. Scans the
-code, generates `astra.yaml`, parameterizes hardcoded analytical
-choices, and runs until outputs materialize. Existing logic stays
-intact — changes should be minimal.
-
-Source: [`claude/lightcone/skills/lc-migrate/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/lc-migrate/SKILL.md).
-
-## Allowed tools
-
-```text
-Read, Write, Edit, Glob, Grep,
-Bash(astra:*), Bash(lc:*), Bash(python:*), Bash(pip:*), Bash(git:*), Bash(mkdir:*), Bash(ls:*),
-Agent, AskUserQuestion
-```
-
-## Phases
-
-### Phase 1 — Scan & spec
-
-The skill spawns an `Explore` subagent (Claude Code's general-purpose
-search agent) with the decision criteria from `astra-reference.md`
-inlined into the prompt. The subagent returns a structured inventory:
-
-- Per script/notebook: file path, what it does, files it reads & writes,
-  hardcoded analytical choices (with file:line, current value, what it
-  controls), how it's invoked.
-- Project-level: dependency files, data files, existing container
-  setup.
-
-The main agent filters the candidate decisions down to true analytical
-choices (most hardcoded values are implementation details, not
-decisions), drafts `astra.yaml` with `recipe:` blocks pointing at the
-existing scripts, and generates `universes/baseline.yaml` with all
-defaults matching the current hardcoded values — so the first run
-reproduces existing behavior. Spec is then validated with
-`astra validate astra.yaml`.
-
-The user is asked to review before Phase 2.
-
-### Phase 2 — Implement (parameterize)
-
-The skill picks an approach per script type:
-
-- **Script with hardcoded values** — add (or extend) argparse, replace
-  hardcoded values with parsed args.
-- **Notebook** — move the `.ipynb` to `notebooks/` (preserved as
-  reference), create a `.py` script that does the parameterized
-  version. The recipe points at the new script.
-- **Config-file-driven project** — write a thin wrapper script that
-  accepts ASTRA decision args, writes the config, then calls the
-  original entry point. The user's config-driven code stays untouched.
-
-Hard conventions enforced by the prompt:
-
-- Decision IDs use underscores in `astra.yaml` (`outlier_sigma`).
-  lightcone-cli passes `--outlier_sigma`. Argument parsing must match.
-- Output paths follow `results/{universe}/{output_id}.ext` (the
-  per-output convention).
-- Don't refactor, restructure, or "improve" existing code — only
-  parameter plumbing.
-
-### Phase 3 — Run & debug
-
-`lc run --universe baseline`. Iterate fixes until `lc status` shows all
-outputs `ok`. If the scan turned up existing results elsewhere in the
-project, compare them against the new `results/baseline/` to verify
-the migration preserved behavior. Then `astra validate astra.yaml` and
-present the summary.
-
-## Hard rules
-
-- Minimal changes — no refactor, rename, reorganize.
-- Never guess — read every script before claiming what it does.
-- Filter decisions aggressively — most hardcoded values are
-  implementation details.
-- Preserve behavior — the baseline universe with default values must
-  reproduce the original behavior exactly.
-
-## Related
-
-- [`/lc-new`](lc-new.md) — for greenfield analyses.
-- [`/lc-verify`](lc-verify.md) — run after migration to confirm
-  spec-code-results alignment.
diff --git a/docs/skills/lc-new.md b/docs/skills/lc-new.md
index 89fa234d..038caace 100644
--- a/docs/skills/lc-new.md
+++ b/docs/skills/lc-new.md
@@ -1,8 +1,9 @@
 # /lc-new
 
-Scope a new ASTRA analysis through conversation. Produces a complete
-`astra.yaml` (and optionally a literature evidence trail) with no code
-written.
+Scope a new ASTRA analysis from a research question, through
+conversation. The output is a complete `astra.yaml` and (optionally) a
+literature evidence trail. Implementation comes later — `/lc-new`
+writes spec, not code.
 
 Source: [`claude/lightcone/skills/lc-new/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/lc-new/SKILL.md).
 
@@ -11,59 +12,61 @@ Source: [`claude/lightcone/skills/lc-new/SKILL.md`](https://github.com/Lightcone
 ```text
 Read, Write(astra.yaml), Write(universes/*), Write(CLAUDE.md),
 Edit(astra.yaml), Edit(universes/*), Edit(CLAUDE.md),
-Glob, Grep, Bash(astra:*), Bash(lc:*), Bash(mkdir:*), Bash(echo:*),
-WebSearch, WebFetch, AskUserQuestion, Task
+Glob, Grep, Bash(astra:*), Bash(lc:*),
+WebSearch, WebFetch, AskUserQuestion, Agent
 ```
 
-The skill is locked to spec-only writes — it cannot write Python, R, or
-arbitrary files. The lc-extractor subagent is invoked via `Task`.
+Writes are locked to the spec surface — no Python, no R, no arbitrary
+files. The `lc-extractor` subagent is dispatched via `Agent`.
 
 ## Phases
 
-1. **Research question** — sharpen the question, write `version`, `name`,
-   `description` to `astra.yaml` immediately so the user sees progress.
-2. **Analysis structure** — walk through inputs, outputs, sub-analyses.
-   One output per output: a single metric, a single plot, a single
-   artifact. Updates `astra.yaml` with `inputs:` and `outputs:`.
-3. **Deep dive** (per section) — optional literature pass. Collect paper
-   candidates; for each approved paper, spawn one `lc-extractor`
-   subagent (parallel, via `Task`). Each subagent reads the PDF, pulls
-   verbatim quotes, runs `astra paper verify-quotes` to machine-verify
-   the quotes against the source, and returns extracted prior insights.
-   Then identify decisions informed by the conversation + literature
-   and write them to `astra.yaml`.
-4. **Finalize** — `astra validate astra.yaml`, `astra validate
-   --verify-evidence` if quotes exist, `astra universe generate -n
-   baseline`, populate the `narrative:` block in `astra.yaml` (`summary`,
-   `methods`, `inputs`, `outputs` — `findings` stays TODO until results
-   exist), then populate the `## Working Notes` section of `CLAUDE.md`
-   with conversational context not captured in the spec.
+1. **Research question.** Sharpen the question, then write `version`,
+   `name`, and `description` to `astra.yaml` so the user has something
+   visible to react to from the first turn.
+2. **Analysis structure.** Walk through inputs, outputs, and any
+   sub-analyses. Each output is a single thing: one metric, one plot,
+   or one artifact. `inputs:` and `outputs:` land in `astra.yaml` as they
+   crystallize.
+3. **Deep dive (per section).** An optional literature pass. Collect
+   paper candidates with the user; for each approved paper, dispatch
+   one `lc-extractor` subagent in parallel. Each subagent reads the
+   PDF, pulls verbatim quotes, runs `astra paper verify-quotes` to
+   machine-verify the quotes against the source, and returns prior
+   insights. Decisions then fall out of the conversation and the
+   literature together.
+4. **Finalize.** `astra validate astra.yaml`; `astra validate
+   --verify-evidence` if quotes exist; `astra universe generate -n
+   baseline`. Populate the `narrative:` block (`summary`, `methods`,
+   `inputs`, `outputs` — `findings` stays TODO until results exist),
+   then fill the `## Working Notes` section of `CLAUDE.md` with
+   conversational context the spec doesn't carry.
 
-The skill writes to `astra.yaml` after each phase rather than in bulk
-at the end so the user has something visible to review at every step.
+Writes happen at the end of each phase, not in bulk — the user always
+has something visible to review.
 
 ## Hard restrictions (from the SKILL.md)
 
-- Specification agent only — cannot write Python, R, or other
-  implementation code.
-- Files it may touch: `astra.yaml`, `universes/*.yaml`, `CLAUDE.md`
+- Specification agent only. No Python, no R, no implementation code.
+- Touchable files: `astra.yaml`, `universes/*.yaml`, and `CLAUDE.md`
   (Finalize only).
-- Never fabricates quotes — all evidence must pass
+- Quotes are never fabricated; every evidence entry must pass
   `astra validate --verify-evidence`.
-- PDFs are read by lc-extractor subagents only; the main agent never
-  pulls a PDF into its own context.
+- PDFs stay inside `lc-extractor` subagents — the main agent never
+  pulls one into its own context.
 
 ## Anti-patterns called out in the prompt
 
 - Bulk-writing decisions at the end instead of after each crystallizes.
-- Accepting vague goals like "analyze this data" without sharpening.
-- Method-only decisions; the prompt actively probes for data
-  exclusion, variable operationalization, inference criteria.
+- Letting vague goals like "analyze this data" pass without sharpening.
+- Method-only decisions. The prompt actively probes data exclusion,
+  variable operationalization, and inference criteria.
 - Reading PDFs in the main agent context.
 - Skipping `astra validate --verify-evidence`.
 
 ## Related
 
-- [`/lc-build`](lc-build.md) — the next step after `/lc-new`.
-- [`claude/lightcone/guides/astra-reference.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/guides/astra-reference.md) — `astra.yaml` schema, decision criteria, prior insights / findings, universe management.
+- After `/lc-new`, ask the agent to implement the spec through the
+  normal Claude Code workflow.
+- [`/astra`](index.md#reference-skills-auto-primed-via-session-start) — `astra.yaml` schema, decision criteria, prior insights / findings, universe management.
 - [`claude/lightcone/agents/lc-extractor.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/agents/lc-extractor.md) — the literature extraction subagent definition.
diff --git a/docs/skills/lc-verify.md b/docs/skills/lc-verify.md
deleted file mode 100644
index 08105967..00000000
--- a/docs/skills/lc-verify.md
+++ /dev/null
@@ -1,59 +0,0 @@
-# /lc-verify
-
-Read-only audit. Checks that `astra.yaml`, the code, and the
-materialized results all agree.
-
-Source: [`claude/lightcone/skills/lc-verify/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/lc-verify/SKILL.md).
-
-## Allowed tools
-
-```text
-Read, Glob, Grep,
-Bash(astra:*), Bash(lc:*), Bash(python:*), Bash(ls:*),
-AskUserQuestion
-```
-
-No `Write`, no `Edit`. The skill cannot modify the project.
-
-## What it checks (per universe; default `baseline`)
-
-1. **Spec validation** — `astra validate astra.yaml`. Fix and iterate
-   until clean.
-2. **Materialization status** — `lc status --universe <U>`. Every
-   output should be `ok`. Anything `stale`, `missing`, or `alias`
-   that's not expected gets flagged.
-3. **Decision-code alignment** — *the core value*. For every decision
-   in `astra.yaml`, confirm the code accepts it as a parameter rather
-   than hardcoding the value. Cross-checks `astra info --decisions`
-   against argparse usage in `scripts/`.
-4. **Results match spec** — for every output, verify the result files
-   exist and look well-formed. For `type: metric` outputs, check that
-   each JSON file parses and contains a `{"value": …}` entry.
-
-## Report format
-
-```text
-| Check                    | Status |
-|--------------------------|--------|
-| Spec validation          | ✓/✗    |
-| Materialization (N/N)    | ✓/✗    |
-| Decision-code alignment  | ✓/⚠/✗  |
-| Results match spec (N/N) | ✓/✗    |
-```
-
-The skill lists each finding with file paths and line numbers, and
-suggests concrete fixes when something fails.
-
-## Hard rules
-
-- Read-only — never modifies files.
-- One universe at a time.
-- Never skips the decision-code alignment check.
-- Always reads actual result files; never infers from code.
-
-## Related
-
-- [`/lc-build`](lc-build.md) — fix anything `/lc-verify` flags.
-- [`lc verify`](../cli/verify.md) — the deeper, hash-based audit on the
-  CLI side. They complement each other: the skill checks
-  spec-vs-code-vs-results alignment; the CLI checks data integrity.
diff --git a/docs/skills/narrative.md b/docs/skills/narrative.md
new file mode 100644
index 00000000..5a13e69f
--- /dev/null
+++ b/docs/skills/narrative.md
@@ -0,0 +1,107 @@
+# /narrative
+
+Author the reader-facing prose in an `astra.yaml`: analysis-level
+`narrative:` blocks (`summary`, `inputs`, `methods`, `findings`,
+`outputs`), decision `rationale:` fields, and shorter `description:` /
+`notes:` on individual entities. Always written against an existing
+spec — the structure must exist when the prose lands.
+
+Source: [`claude/lightcone/skills/narrative/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/narrative/SKILL.md).
+
+## Modes
+
+The skill draws on the spec plus a **second source**. Three modes,
+distinguished by what that second source is:
+
+| Mode | Second source | Status |
+|---|---|---|
+| **Paper reproduction** | An authoritative text (paper, thesis, technical report) | Ready |
+| **Retrofit** | Project artifacts — code, notebooks, fibers, commit history | Stub |
+| **Co-drafting** | The user, in conversation | Stub |
+
+If the second source isn't obvious, the skill asks. Hybrid is allowed
+(reproduction with co-drafted extensions; retrofit with co-drafted
+gap-filling).
+
+`/lc-from-paper` invokes `/narrative` during SPECIFY (paper-reproduction
+mode); users can invoke it directly in any mode.
+
+## Allowed surfaces
+
+The five-key analysis narrative:
+
+| Key | What it carries | Required when |
+|---|---|---|
+| `summary` | Question, scope, headline shape | optional, but should always exist |
+| `inputs` | Provenance — the data the analysis rests on | `Analysis.inputs` non-empty |
+| `methods` | Pipeline walk; cite each decision and sub-analysis by anchor | `Analysis.decisions` or `Analysis.analyses` non-empty |
+| `findings` | Synthesis of declared findings, each cited by anchor | `Analysis.findings` non-empty |
+| `outputs` | Which artifacts were promoted, and where they go downstream | `Analysis.outputs` non-empty |
+
+A decision's `rationale:` is its own one-paragraph slot: what was
+decided, the insight that motivated it (cite by anchor), and the
+load-bearing alternative and why it lost. Per-entity prose
+(`description`, `notes`) is shorter and lives on individual entries.
+
+## Anchors
+
+Markdown link syntax with `#`-target, **tree-path-first** — same
+grammar as decision `from:` references.
+
+| Target | Anchor |
+|---|---|
+| Input | `#inputs.<id>` |
+| Output | `#outputs.<id>` |
+| Decision | `#decisions.<id>` |
+| Option | `#decisions.<id>.options.<opt>` |
+| Finding | `#findings.<id>` |
+| Prior insight | `#prior_insights.<id>` |
+| Sub-analysis | `#analyses.<sub>` |
+| Element inside a sub-analysis | `#<sub>.<category>.<id>` |
+| Parent scope from a sub-analysis | `#../decisions.<id>` |
+
+Anchor text is **authored prose**, never the raw id. One reference per
+idea — stacking three on a sentence means the sentence carries too
+much.
+
+## Length and modularity
+
+1–3 paragraphs per key, at any level. Length is the mechanism that
+keeps analyses modular: **if references don't fit in three paragraphs,
+the analysis is too big — split it.** The narrative is a compressor;
+if it won't compress, split the thing being compressed.
+
+## Validation
+
+```sh
+astra validate astra.yaml
+```
+
+- **Broken references** → error.
+- **Uncited declared elements** → warning. Every declared finding,
+  decision, output, and sub-analysis should be cited somewhere in the
+  narrative tree.
+- **Conditional coverage** (required-when rules above) → error.
+
+## Anti-patterns
+
+- **Wiki-style what-is framing.** A wiki summarizes; an ASTRA narrative
+  points into reasoning.
+- **Decision-list paragraph.** "We made the following decisions: A, B,
+  C." Cite each where it shapes the pipeline.
+- **`summary` as primer.** Teaching what the field is. Readers arrive
+  with context.
+- **Drafting `findings` on a sub-analysis with no declared findings.**
+  Skip the key.
+- **Narrative-per-element.** The five-key analysis narrative is the
+  only home; per-element prose is `description` / `rationale` /
+  `notes`.
+
+Mode-specific anti-patterns live in each mode's reference under
+`claude/lightcone/skills/narrative/references/`.
+
+## Related
+
+- [`/lc-from-paper`](lc-from-paper.md) — invokes `/narrative` during
+  SPECIFY in paper-reproduction mode.
+- [`/astra`](index.md#reference-skills-auto-primed-via-session-start) — full schema reference.
diff --git a/docs/skills/paper-extraction.md b/docs/skills/paper-extraction.md
new file mode 100644
index 00000000..7c7cc05f
--- /dev/null
+++ b/docs/skills/paper-extraction.md
@@ -0,0 +1,131 @@
+# /paper-extraction
+
+Turn an arXiv ID or DOI into a standardized, indexed `work/reference/`
+directory: substrate (arXiv LaTeX source preferred, PDF + Docling
+fallback), copied figures, per-table `.tex` files, a section outline
+with line numbers, deduplicated citation keys with resolved DOIs, the
+abstract, and a stub `astra.yaml` that treats the paper as an ASTRA
+artifact.
+
+Source: [`claude/lightcone/skills/paper-extraction/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/paper-extraction/SKILL.md).
+
+Argument hint: `<arxiv-id-or-doi>` — invoked as `/paper-extraction
+2503.19441` or `/paper-extraction 10.48550/arXiv.2503.19441`.
+
+## Allowed tools
+
+```text
+Read, Write, Edit, Bash, Grep, Glob, WebFetch, WebSearch
+```
+
+The agent runs `scripts/extract-paper-substrate.py` for the
+deterministic structural pass, then walks any warnings and (optionally)
+fills `findings:`.
+
+## Outputs
+
+Under `work/reference/` (idempotent — re-runs skip what's already done):
+
+```text
+work/reference/
+├── index.json                # structural index — figures, tables, outline, citations (with DOIs), paths
+├── astra.yaml                # semantic — the paper as an ASTRA artifact (findings populated in Step 5)
+├── paper.pdf                 # always
+├── paper.tex                 # Path A — symlink to the main .tex file
+│   (or)
+├── document.md               # Path B — Docling-extracted markdown
+├── source/                   # Path A — extracted arXiv tarball
+├── figures/                  # copied figure files
+├── tables/                   # one .tex file per `\begin{table}` block (Path A)
+├── bibliography-source.bib   # Path A — copied from source
+├── bibliography-source.bbl   # Path A — copied from source
+└── .doi-cache.json           # Crossref/ADS lookup cache for idempotency
+```
+
+The skill produces only the paper's own reading materials. Code
+repositories and supplementary datasets are out of scope; the caller
+handles those.
+
+## Two surfaces
+
+**`index.json` is structural and machine-friendly.** Everything the
+script mechanically extracts: figures, tables, section outline with
+line numbers, citation keys (with every location *plus* the cited
+paper's full citation text and resolved DOI), abstract, paths. Read
+this when you want "what's in this paper, where do I find it." DOI
+resolution covers ~96% of typical-paper bibliographies.
+
+**`astra.yaml` is semantic and ASTRA-validating.** Treats the paper as
+an ASTRA artifact: `id`, `name`, `narrative.summary`, and `findings:`
+carrying the paper's claimed numerical results in the Insight +
+Evidence shape. The verbosity of the shape *is* the back-pressure
+against hallucinated claims — the agent has to find and quote actual
+text.
+
+## Workflow
+
+1. **Survey.** `ls work/reference/`; read `index.json` if present. Skip
+   any work already done.
+2. **Acquire substrate.** Path A (arXiv → LaTeX source) or Path B
+   (journal-only DOI → PDF + Docling).
+3. **Run the extraction script.** `extract-paper-substrate.py` does
+   the deterministic structural pass: figure copying, per-table `.tex`
+   extraction, outline, citation resolution, `astra.yaml` stub.
+4. **Review warnings and fix structural gaps.** Unresolved figures,
+   missing captions, unresolved citation DOIs, Path B caveats.
+5. **(Optional) Walk the paper for findings.** Append the paper's
+   central numerical claims to `astra.yaml`'s `findings:` map with
+   verbatim `quote.exact` evidence. Skip unless a downstream consumer
+   needs it.
+
+Path A is preferred whenever the paper is on arXiv — equations,
+ligatures, captions, and tables come through clean. Path B is for
+non-arXiv only.
+
+## Citation DOI resolution
+
+The resolver tries, in order: the entry's `doi:` field → an
+`eprint:`-derived arXiv DOI → Crossref bibliographic query (free, no
+API key) → ADS title search (only if `ADS_API_TOKEN` env var or
+`~/.ads/dev_key` is present — graceful skip otherwise). Title hits
+from Crossref are gated by a similarity check against the queried
+title.
+
+## Findings as Insight + Evidence
+
+When Step 5 runs, each finding carries `claim:` plus a verbatim
+`quote.exact` anchored to the paper's DOI:
+
+```yaml
+findings:
+  s8_constraint:
+    claim: "S_8 = sigma_8 (Omega_m / 0.3)^0.5 = 0.795 ± 0.014 ..."
+    created_at: "2026-04-04T00:00:00Z"
+    evidence:
+      - doi: "10.48550/arXiv.2604.03227"
+        version: 1
+        quote:
+          exact: "we find $S_8 = 0.795 \\pm 0.014$"
+```
+
+`astra validate --verify-evidence` searches for `quote.exact` in the
+cached PDF — paraphrasing breaks the gate.
+
+## Discipline
+
+- **Quote verbatim.** Copy LaTeX as it appears in `paper.tex`. Don't
+  paraphrase, expand macros, or normalize math.
+- **Every evidence entry carries `doi:` and `version:`** (the arXiv
+  version, e.g. `1`, `2`).
+- **Read abstract and conclusions first.** Most central findings sit
+  in one of those two surfaces.
+- **Re-runs are safe.** The script preserves agent edits to
+  `astra.yaml` once the stub exists.
+
+## Related
+
+- [`/lc-from-paper`](lc-from-paper.md) — invokes `/paper-extraction`
+  during ORIENT Stage 2 for the target paper, and again from inside a
+  ralph iteration for each cited paper during LITERATURE; each iteration
+  reads `index.json` and the substrate directly.
+- [`/astra`](index.md#reference-skills-auto-primed-via-session-start) — Insight + Evidence shape, `quote.exact` rules.
diff --git a/docs/skills/ralph.md b/docs/skills/ralph.md
new file mode 100644
index 00000000..91a911d7
--- /dev/null
+++ b/docs/skills/ralph.md
@@ -0,0 +1,102 @@
+# /ralph
+
+Author a constitution — a markdown document describing a desired state
+for autonomous iteration — and run a ralph loop against it. The loop
+is a detached tmux session that respawns a fresh worker per iteration,
+with the constitution injected as system prompt. Iterations terminate
+when one of them, after a cold survey, flips the constitution's
+frontmatter `status:` to `closed`.
+
+Used by [`/lc-from-paper`](lc-from-paper.md) for the long middle of a
+reproduction (ARCHITECT → SPECIFY → LITERATURE → IMPLEMENT → RUN →
+COMPARE). Standalone for any other long-running work where adaptation
+matters more than a fixed plan: refactors, exploratory analyses,
+research narratives that keep growing.
+
+Source: [`claude/lightcone/skills/ralph/SKILL.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/ralph/SKILL.md).
+
+## Three modes
+
+One mode applies at a time.
+
+- **Authoring** — drafting a constitution from scratch (Study → Draft
+  → Refine → Launch). Reference depth in
+  [`references/constitution.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/ralph/references/constitution.md)
+  and the careful-thinking rhythm in
+  [`references/crafting.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/ralph/references/crafting.md).
+- **Launching** — outside any active loop, invoking the bundled script
+  to start one on an existing constitution.
+- **Inside a loop** — the constitution is in the system prompt; the
+  worker follows the Loop protocol (Survey → Work → Update → Exit).
+
+## Launching
+
+After `lc init` copies the bundle into a project, the launcher lives at
+`.claude/skills/ralph/scripts/ralph`:
+
+```bash
+.claude/skills/ralph/scripts/ralph <constitution.md> [--backend claude|codex] [-- extra-flags...]
+```
+
+The constitution must have `status: open` or `status: active` in YAML
+frontmatter; the launcher refuses to start otherwise. Termination is
+automatic when an iteration flips `status:` to `closed`.
+
+The session detaches as `ralph-<dirname>-<basename>`. Attach with
+`tmux attach -t <session>`. A second launch with the same constitution
+detects the existing session and prints the attach command instead of
+double-starting.
+
+## What goes in a constitution
+
+A constitution describes what the system looks like when it's right —
+the desired state. It outlasts any single iteration; nothing in it
+goes stale as the work progresses. The constitutional principle:
+write what stays true until the work is done.
+
+Common sections — use what fits, skip what doesn't:
+
+- **Desired State** — what "done" looks like. Invariants, quality bar,
+  done-conditions. Fence the scope.
+- **Context** — file paths, existing patterns, architectural constraints.
+- **Skills** — which skills to activate before working.
+- **Evidence** — how to check progress (commands, test suites, grep
+  patterns).
+- **Open Questions** — uncertainties the user weighs in on between
+  loops.
+
+See the *What goes in a constitution* section of `SKILL.md` and
+[`references/constitution.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/ralph/references/constitution.md)
+for the discipline that keeps a constitution from sliding into a plan.
+
+## Authoring principles
+
+- **Constitution, not plan.** Say what the system looks like when it's
+  right. Never describe the current state.
+- **Pointers, not snapshots.** "Check `grep -r 'old_pattern'`", not
+  "50 files remain." Snapshots go stale; pointers stay valid.
+- **Reshape, don't accrete.** When the desired state evolves, rewrite
+  the affected sections — don't tack on "Round 2" or "Amendments."
+- **Constraints need reasons.** Bare constraints get circumvented.
+- **Scope is a gift.** A clear fence frees iterations to work
+  confidently inside it.
+
+## Loop discipline
+
+Each iteration: Survey → Work → Update → Exit (`kill $PPID`). The
+survey is a fixed cost; exit when the next valuable move needs a
+different mental workspace, not when one task ends. Exit before context
+is half-full — the handoff matters more than the marginal step you'd
+squeeze in.
+
+**Closing the constitution is reserved for cold surveys that find
+nothing left to do.** If an iteration made any changes, it may not flip
+`status:` to `closed`; that decision waits for the next fresh-eyes
+iteration. This adds at least one cold review pass on every closing
+decision.
+
+## Related
+
+- [`/lc-from-paper`](lc-from-paper.md) — uses `/ralph` for the long
+  middle of a reproduction.
+- [Bundle README](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/README.md).
diff --git a/docs/user/agent-workflow.md b/docs/user/agent-workflow.md
index de317c45..9ebf6b63 100644
--- a/docs/user/agent-workflow.md
+++ b/docs/user/agent-workflow.md
@@ -1,9 +1,16 @@
 # The Agentic Workflow
 
-The agentic surface is five slash commands. Each one is a structured
-prompt — the agent follows a specific phased flow, not free-form chat.
-This page walks through each of them in the order you'd naturally hit
-them.
+The agentic surface is three entry slash commands plus feedback. The
+entry commands differ by what you start from: a question (`/lc-new`),
+code (`/lc-from-code`), or a paper (`/lc-from-paper`). `/lc-feedback`
+handles bug reports. Each one is a structured prompt: the agent follows
+a specific phased flow, not free-form chat. This page walks through
+each of them in the order you'd naturally hit them.
+
+The skills are structured entry points; they aren't requirements. Once
+you're inside a project, you can also just describe what you're working
+on to Claude — `astra.yaml` and the `lc` CLI keep things tracked
+whether you go through a skill or not.
 
 > The bracketed `→ astra.yaml` etc. notes show what each phase actually
 > writes to disk. You stay in charge of approving everything; the agent
@@ -40,61 +47,19 @@ The skill walks you through four phases:
    gets the conversational context that wouldn't otherwise survive a
    `/clear`.
 
-You don't write any code or YAML during `/lc-new`. By the time it
-finishes, you have a precise specification. The agent enforces this:
-the skill is *only allowed* to edit `astra.yaml`, files in
+You don't write any code or YAML during `/lc-new`. By the
+time it finishes, you have a precise specification. The agent enforces
+this: the skill is *only allowed* to edit `astra.yaml`, files in
 `universes/`, and `CLAUDE.md`.
 
-## `/lc-build` — implement and run
-
-**You have a scoped `astra.yaml`. You end with materialized outputs.**
-
-This is the longest-running skill. It has two phases.
-
-**Phase 1: plan.** The agent reads the spec, the universe file, and
-your existing scripts (if any), and writes a plan to
-`.lightcone/plans/build-plan-<universe>.md`. The plan covers
-dependencies, decision selections, ordered build checklist, and
-verification steps. It asks you to approve before doing anything else.
-
-**Phase 2: loop.** Once you approve, the skill activates an
-*autonomous loop*: the agent works through the plan, writes scripts,
-runs `lc run` to materialize outputs, fixes failures, and commits as
-it goes. The loop keeps going until either every output is
-materialized or it hits its iteration limit (default 25).
-
-You can interrupt the loop at any time. If you do, the next time you
-run `/lc-build` it asks whether to resume or start fresh.
-
-The plan file persists across crashes; only successful completion
-deletes it.
-
-## `/lc-verify` — audit a finished build
-
-**You have materialized outputs. You end with a verification report.**
-
-Read-only. Four checks:
-
-1. `astra validate astra.yaml` passes.
-2. `lc status` shows every output `ok` for the universe in question.
-3. **Decision-code alignment** (the most important check). For every
-   decision in the spec, the agent verifies the code accepts that
-   decision as a parameter — i.e. the value isn't silently hardcoded.
-4. Result files exist and look well-formed (a `type: metric` output
-   should be parseable JSON, etc.).
-
-The skill never modifies anything. If it finds a discrepancy, it
-suggests concrete fixes; you re-run `/lc-build` (or fix by hand) and
-re-verify.
-
-## `/lc-migrate` — wrap existing code
+## `/lc-from-code` — wrap existing code
 
 **You have a folder of scripts. You end with an ASTRA project around
 them.**
 
 When you have an existing analysis (a notebook, a folder of `.py`
-files, a config-driven pipeline), `/lc-migrate` does the wrapping for
-you. Three phases:
+files, a config-driven pipeline), `/lc-from-code` does the wrapping
+for you. Three phases:
 
 1. **Scan.** A subagent reads every script and notebook and returns a
    structured inventory: what each script reads, writes, and contains
@@ -107,9 +72,60 @@ you. Three phases:
    identified decisions, leaves the actual analytical logic alone, and
    iterates on `lc run` until everything materializes.
 
-The hard rule of `/lc-migrate` is **minimal changes**: the skill never
-refactors, renames, or "improves" your code. It only adds the parameter
-plumbing.
+The hard rule of `/lc-from-code` is **minimal changes**: the skill
+never refactors, renames, or "improves" your code. It only adds the
+parameter plumbing.
+
+## `/lc-from-paper` — reproduce a published paper
+
+**You have a DOI or arXiv ID. You end with a reproduction project
+driven by an ORIENT-first agent that hands off to a long-running
+ralph loop for the heavy middle.**
+
+`/lc-from-paper` is the entry point of the paper-reproduction bundle.
+It opens with **ORIENT** — one pre-loop phase in your main session
+that runs in seven stages: ask for the paper, run `/paper-extraction`
+inline (so subsequent questions are grounded in the actual paper),
+interview you (scope, fidelity intent — your prose answer to "when is
+this good enough" — code repo confirmation, paper-specific
+conventions, prior familiarity, external context), clone the
+reference code and run `/lc-from-code` scan-only (when a repo exists),
+optionally follow up, then draft **two files** at the workdir root:
+`constitution.md` (the ralph loop's driving document — Goal, fidelity
+intent, scope, quality bar, evidence) and `CLAUDE.md` (the auto-loading
+walk-up with rules, the paper-vs-code disagreements log, open
+opportunities). You review the drafts, then a single first commit
+captures `constitution.md` + `CLAUDE.md` + the full `work/reference/`
+substrate.
+
+After ORIENT lands, the skill launches a **ralph loop** in a detached
+tmux session against `constitution.md`. Each iteration starts a fresh
+worker that surveys the workdir, picks the next valuable move
+(typically one of ARCHITECT → SPECIFY → LITERATURE → IMPLEMENT → RUN
+→ COMPARE), does it, commits, exits. The fresh-context property
+between iterations is what makes per-phase review work: iteration N
+writes, iteration N+1 reads N's work without bias. You attach to the
+loop with `tmux attach` to watch or steer; iterations are detached so
+they can't ask you questions interactively — they log open questions
+to `open-questions.md` with a best-judgment default and the loop
+keeps moving.
+
+When the loop closes (constitution `status: closed` after COMPARE
+returns `pass` and a cold-survey iteration finds nothing left to
+improve), come back and the agent runs **REVIEW close-out** in your
+session: `/figure-comparison` against the targets, optional
+`/check-sentence-by-sentence`, a walk through the accumulated open
+questions, a `REPRODUCTION-SUMMARY.md`. COMPARE's opportunity
+assessment — where the gaps are, how much they likely matter, and how
+they sit relative to your fidelity intent — propagates into
+CLAUDE.md's *Open opportunities* list as the trajectory of what could
+be tightened on a return visit.
+
+The bundle composes sibling skills: `ralph` (the loop substrate),
+`paper-extraction`, `narrative`, `figure-comparison`, and
+`check-sentence-by-sentence`. See
+[`claude/lightcone/skills/README.md`](https://github.com/LightconeResearch/lightcone-cli/blob/main/claude/lightcone/skills/README.md)
+for the full bundle map.
 
 ## `/lc-feedback` — file an issue without context-switching
 
@@ -133,5 +149,6 @@ interruptible — every phase writes to disk so a `/clear` (which frees
 up context) doesn't lose your work.
 
 If a skill seems stuck, a quick `/clear` followed by reinvoking the
-slash command is often the right move: the spec, plan, and universe
-files are all on disk, so the agent picks up exactly where it left off.
+slash command is often the right move: the spec, universe files, and
+written work products are all on disk, so the agent can pick up where
+it left off.
diff --git a/docs/user/getting-started.md b/docs/user/getting-started.md
index fc9ccaf4..e2968e9c 100644
--- a/docs/user/getting-started.md
+++ b/docs/user/getting-started.md
@@ -55,18 +55,25 @@ can edit it by hand whenever you want.
 That opens an interactive session inside `my-analysis/`. Claude Code
 reads `astra.yaml` and `CLAUDE.md` so it has context.
 
-## 4. The five slash commands
+## 4. The slash commands
 
-Inside Claude Code:
+Inside Claude Code, the entry commands differ by what you start from:
+a question, code, or a paper. `/lc-feedback` handles bug reports
+without leaving the session.
 
 | Command | Use it when… |
 |---------|--------------|
 | `/lc-new` | You're starting from a research question and an empty `astra.yaml`. |
-| `/lc-build` | You have a scoped `astra.yaml` and you want the analysis implemented and run. |
-| `/lc-verify` | You finished a build and want a read-only audit. |
-| `/lc-migrate` | You have an existing codebase you want wrapped in ASTRA. |
+| `/lc-from-code` | You have an existing codebase you want wrapped in ASTRA. |
+| `/lc-from-paper` | You have a published paper (DOI / arXiv ID) you want to reproduce. |
 | `/lc-feedback` | Something broke and you want to file a GitHub issue without leaving the session. |
 
+These are structured entry points for common starting situations. You
+don't have to use them — once you're inside a project, you can also
+just describe what you're trying to do to Claude. `astra.yaml`,
+`lc run`, and `lc verify` keep things tracked regardless of how you
+got there.
+
 The next page, [The Agentic Workflow](agent-workflow.md),
 explains each of these in more detail.
 
diff --git a/docs/user/glossary.md b/docs/user/glossary.md
index be072f4c..529bfd1d 100644
--- a/docs/user/glossary.md
+++ b/docs/user/glossary.md
@@ -139,18 +139,20 @@ node launched via `srun`.
 
 ## Skill
 
-A Claude Code slash command bundled with the lightcone-cli plugin
-(`/lc-new`, `/lc-build`, `/lc-verify`, `/lc-migrate`,
-`/lc-feedback`). Each one is a structured prompt that drives the
-agent through a specific phased workflow.
+A Claude Code slash command bundled with the lightcone-cli plugin.
+The entry skills differ by what you start from: a question
+(`/lc-new`), code (`/lc-from-code`), or a paper
+(`/lc-from-paper`). `/lc-feedback` files upstream issues from inside
+the session. Each one is a structured prompt that drives the agent
+through a specific phased workflow.
 
 ## Subagent
 
 A Claude Code agent invoked by another agent via the `Task` tool. The
 `lc-extractor` subagent reads PDFs and pulls verifiable quotes; it's
-spawned by `/lc-new` during the literature deep-dive phase. Subagents
-have isolated context, which is why `/lc-new` uses one per paper —
-PDFs are big.
+spawned by `/lc-new` during the literature deep-dive phase.
+Subagents have isolated context, which is why `/lc-new` uses
+one per paper — PDFs are big.
 
 ## Prior insight
 
@@ -190,12 +192,14 @@ The three labels `lc verify` produces when something's wrong:
 
 ## Ralph loop
 
-The autonomous build loop driven by `/lc-build`. Each iteration:
-survey state, decide what to do next, write/run code, commit, exit.
-The Claude Code stop hook re-injects the loop prompt until the agent
-emits `BUILD_COMPLETE` or hits its iteration limit. State persists
-across crashes in `.claude/ralph-loop.local.md`. Cancel with
-`/cancel-ralph`.
+A reusable autonomous iteration pattern for long-running agent work.
+Each iteration surveys state, decides what to do next, writes or runs
+code, commits, and exits. A bundled tmux runner spawns a fresh worker
+per iteration with the *constitution* — a markdown file describing what
+"done" looks like — as system prompt; the constitution stays editable
+across iterations. Stop the loop by setting `status: closed` in the
+constitution's frontmatter (the next iteration sees it and exits) or by
+killing the tmux session.
 
 ## Permission tier
 
diff --git a/docs/user/index.md b/docs/user/index.md
index a8612837..24751b7a 100644
--- a/docs/user/index.md
+++ b/docs/user/index.md
@@ -16,7 +16,7 @@ implementation; **you stay in charge of the scientific choices**.
 - [Getting Started](getting-started.md) — your first `lc init` and
   what every directory means.
 - [The Agentic Workflow](agent-workflow.md) — `/lc-new`,
-  `/lc-build`, `/lc-verify`, `/lc-migrate`, `/lc-feedback` — what each
+  `/lc-from-code`, `/lc-from-paper`, and `/lc-feedback` — what each
   one does and when to reach for it.
 - [Tutorial: Your First Analysis](tutorial.md) — an end-to-end worked
   example, written so you can read it without running anything.
diff --git a/docs/user/install.md b/docs/user/install.md
index 66e69228..66820742 100644
--- a/docs/user/install.md
+++ b/docs/user/install.md
@@ -78,8 +78,9 @@ Open a project in your terminal or editor (see [Getting Started](getting-started
 
     claude
 
-Inside Claude Code you will have dedicated lightcone CLI slash commands available like `/lc-new` and
-`/lc-build` — see [The Agentic Workflow](agent-workflow.md).
+Inside Claude Code you'll type slash commands like `/lc-new`,
+`/lc-from-code`, and `/lc-from-paper` — see
+[The Agentic Workflow](agent-workflow.md).
 
 ## 5. (Optional) Docker or Podman
 
diff --git a/docs/user/nersc.md b/docs/user/nersc.md
index d35a2c05..60331c6b 100644
--- a/docs/user/nersc.md
+++ b/docs/user/nersc.md
@@ -158,9 +158,9 @@ Once your agent CLI is open (Claude Code in this guide's examples), drive everyt
     /lc-new Please sample a standard Gaussian distribution using numpy.
     ```
 
-=== "Migrate existing code"
+=== "Port existing code"
     ```text
-    /lc-migrate I have code that samples a standard Gaussian distribution using numpy at @../gaussian_sampling. Please create an analysis based on it.
+    /lc-from-code I have code that samples a standard Gaussian distribution using numpy at @../gaussian_sampling. Please create an analysis based on it.
     ```
 
 After that, just keep talking to the agent in plain English about what you want to build next.
diff --git a/docs/user/troubleshooting.md b/docs/user/troubleshooting.md
index 194d8dd3..b7978b3a 100644
--- a/docs/user/troubleshooting.md
+++ b/docs/user/troubleshooting.md
@@ -110,30 +110,6 @@ no longer exists. Usually caused by:
 
 Fix: `lc run` the downstream output. The chain will re-anchor.
 
-## "Active lc-build loop detected"
-
-You're picking up a session where a previous `/lc-build` was
-interrupted. The session-start hook prints this in the banner. To
-resume the loop, run `/lc-build --universe <name>`. To cancel it,
-`/cancel-ralph`.
-
-## The build loop runs forever / never says complete
-
-`/lc-build` defaults to a 25-iteration cap. If it's not making
-progress, that's a sign the analysis hit a real problem the agent
-can't resolve on its own — typically a missing dependency, an
-unparseable error, or a step that needs a human decision.
-
-What helps:
-
-- Read the last few iterations carefully — the agent usually
-  describes the blocker.
-- If there's an "open question" the agent flagged, answer it and
-  reinvoke `/lc-build`. The plan file persists; the loop picks up
-  where it left off.
-- A `/clear` followed by `/lc-build` doesn't lose state — only
-  context.
-
 ## Claude Code says it can't write a file
 
 The default permission tier (`recommended`) blocks edits to a few
diff --git a/docs/user/tutorial.md b/docs/user/tutorial.md
index 049b4bf2..c5e8d67e 100644
--- a/docs/user/tutorial.md
+++ b/docs/user/tutorial.md
@@ -101,70 +101,59 @@ Phase 4 (**FINALIZE**) runs `astra validate astra.yaml`, writes
 back a short summary table — two outputs, one decision, zero prior
 insights.
 
-The agent suggests `/clear` to free up context, then `/lc-build`. Take
-its advice.
+The agent may suggest `/clear` to free up context. If it does, take its
+advice, then ask Claude Code to implement the spec.
 
-## 3. Build it with `/lc-build`
+## 3. Build it
 
 ```text
 /clear
-/lc-build
+Implement this analysis from astra.yaml. Write the scripts, run the baseline universe, and verify the result.
 ```
 
-**Phase 1: plan.** The agent reads everything (spec, universe file,
-empty `scripts/` dir, the references in `.claude/guides/`) and writes a
-build plan to `.lightcone/plans/build-plan-baseline.md`. It might look
-like this:
+The agent reads everything (spec, universe file, empty `scripts/` dir,
+plus the `/astra` and `/lc-cli` reference skills primed at session
+start) and makes an implementation checklist. It might look like this:
 
 ```text
 1. Add Python deps (scikit-learn, matplotlib) to requirements.txt
 2. Write Containerfile if missing
 3. scripts/fit.py — accepts --standardize {standardized,raw}, writes r2.json
 4. scripts/plot.py — reads r2_dir, writes fit_plot.png
-5. lc build to build the container
-6. lc run --universe baseline
-7. /lc-verify
+5. lc run --universe baseline
+6. lc status
+7. astra validate astra.yaml
+8. lc verify
 ```
 
-It asks you to approve. Pick "Approve and start building."
+It works through the checklist one item at a time. You'll see commands
+like:
 
-**Phase 2: loop.** The agent works through the plan one item at a
-time. You'll see lines like:
-
-```text
-▶ scripts/fit.py — writing
-▶ lc build — building image lc-r2-decision-demo-9a1f3...
-▶ lc run accuracy --universe baseline
-▶ ✓ ok    r2
-▶ ▶ scripts/plot.py — writing
-▶ ✓ ok    fit_plot
-✓ build complete
+```bash
+lc run --universe baseline
+lc status
+astra validate astra.yaml
+lc verify
 ```
 
-The agent commits after each successful output, so your `git log` is a
-clean record of the build.
+Expected `lc status` output:
 
-## 4. Verify it with `/lc-verify`
-
-```text
-/lc-verify
 ```
-
-Read-only audit:
-
-```text
-| Check                    | Status |
-|--------------------------|--------|
-| Spec validation          | ✓      |
-| Materialization (2/2)    | ✓      |
-| Decision-code alignment  | ✓      |
-| Results match spec (2/2) | ✓      |
+Universe baseline
+  ✓ ok    r2
+  ✓ ok    fit_plot
 ```
 
-If anything fails, the agent suggests a fix. Re-run `/lc-build` or fix
-by hand.
+Expected validation and verification output is boring in the best way:
+`astra validate astra.yaml` exits cleanly, and `lc verify` reports no
+tampering, broken provenance chain, or missing manifests. If anything
+fails, ask the agent to fix the concrete error and rerun the same
+commands.
+
+The agent commits after each successful output, so your `git log` is a
+clean record of the build.
 
-## 5. Add the second universe
+## 4. Add the second universe
 
 The whole point of decisions is to sweep them. Drop out of Claude
 Code (`Ctrl+D` or `/exit`) and create the second universe:
@@ -195,7 +184,7 @@ Universe raw
 Each universe has its own `results/<universe>/` tree. The two `r2.json`
 files are the comparison your paper figure needs.
 
-## 6. Verify integrity
+## 5. Verify integrity
 
 ```bash
 lc verify
diff --git a/src/lightcone/cli/commands.py b/src/lightcone/cli/commands.py
index 8cd9b1da..c1870711 100644
--- a/src/lightcone/cli/commands.py
+++ b/src/lightcone/cli/commands.py
@@ -300,7 +300,11 @@ def init(
     console.print("\nNext steps:")
     console.print(f"  • Go to the newly created directory [cyan]cd {directory}[/cyan]")
     console.print("  • Start [cyan]claude[/cyan]")
-    console.print("  • Run [cyan]/lc-new[/cyan] to get started on a new analysis")
+    console.print(
+        "  • Run [cyan]/lc-new[/cyan] to scope a new analysis, "
+        "[cyan]/lc-from-code[/cyan] to port existing code, "
+        "or [cyan]/lc-from-paper[/cyan] to reproduce a paper"
+    )
 
 
 _CONTAINERFILE = """\
@@ -335,9 +339,17 @@ def init(
 
 _PROJECT_CLAUDE_MD = """# Project Notes for Claude
 
-This is an ASTRA project orchestrated by `lightcone-cli`.
+This is an ASTRA project orchestrated by `lightcone-cli`. It was just
+scaffolded by `lc init` and has not been scoped yet — `astra.yaml` holds
+the placeholder example, not real science.
+
+The three entry skills cover the common starting points:
+
+- `/lc-new` — scope from a research question (placeholder `astra.yaml`).
+- `/lc-from-code` — wrap an existing codebase in ASTRA.
+- `/lc-from-paper` — reproduce a published paper end-to-end.
 
-To materialize outputs declared in `astra.yaml`:
+Once scoped, the `lc` CLI keeps the substrate in sync:
 
 ```
 lc run                    # all outputs in the default universe
diff --git a/src/lightcone/eval/harness.py b/src/lightcone/eval/harness.py
index e0343f5e..a9f1b58a 100644
--- a/src/lightcone/eval/harness.py
+++ b/src/lightcone/eval/harness.py
@@ -31,9 +31,9 @@
 DEFAULT_LOOP_PROMPT = """\
 Build the analysis specified in `astra.yaml` for universe `{{UNIVERSE}}`.
 
-Read `.claude/guides/lightcone-cli-reference.md` for the workflow and \
-`.claude/guides/astra-reference.md` for spec syntax. Then for each output \
-that needs materializing:
+Invoke the `/lc-cli` skill for the lc workflow (spec-code invariant, status \
+interpretation, failure diagnosis) and `/astra` for spec syntax (decisions, \
+inputs/outputs, sub-analyses). Then for each output that needs materializing:
 
 1. Read the recipe's `command` to see what script and arguments it expects.
 2. Write the script under `src/`, parameterizing every decision via argparse \
diff --git a/tests/test_paper_extraction_caption.py b/tests/test_paper_extraction_caption.py
new file mode 100644
index 00000000..579f8d69
--- /dev/null
+++ b/tests/test_paper_extraction_caption.py
@@ -0,0 +1,24 @@
+"""Tests for paper-extraction caption parsing."""
+from __future__ import annotations
+
+from importlib.util import module_from_spec, spec_from_file_location
+from pathlib import Path
+
+SCRIPT_PATH = (
+    Path(__file__).resolve().parents[1]
+    / "claude"
+    / "lightcone"
+    / "skills"
+    / "paper-extraction"
+    / "scripts"
+    / "extract-paper-substrate.py"
+)
+_SPEC = spec_from_file_location("paper_extraction_extract_script", SCRIPT_PATH)
+assert _SPEC is not None and _SPEC.loader is not None
+_SCRIPT = module_from_spec(_SPEC)
+_SPEC.loader.exec_module(_SCRIPT)
+
+
+def test_extract_caption_handles_nested_braces_and_last_nonempty_caption() -> None:
+    text = r"\caption{}\caption{X $A^{\mathrm{Y}}$ Z}"
+    assert _SCRIPT.extract_caption(text, {}) == r"X $A^{\mathrm{Y}}$ Z"
diff --git a/zensical.toml b/zensical.toml
index 019583f6..97b20912 100644
--- a/zensical.toml
+++ b/zensical.toml
@@ -47,10 +47,14 @@ nav = [
     {"Skills" = [
       {"Overview" = "skills/index.md"},
       {"lc-new" = "skills/lc-new.md"},
-      {"lc-build" = "skills/lc-build.md"},
-      {"lc-verify" = "skills/lc-verify.md"},
-      {"lc-migrate" = "skills/lc-migrate.md"},
+      {"lc-from-code" = "skills/lc-from-code.md"},
+      {"lc-from-paper" = "skills/lc-from-paper.md"},
       {"lc-feedback" = "skills/lc-feedback.md"},
+      {"ralph" = "skills/ralph.md"},
+      {"paper-extraction" = "skills/paper-extraction.md"},
+      {"narrative" = "skills/narrative.md"},
+      {"figure-comparison" = "skills/figure-comparison.md"},
+      {"check-sentence-by-sentence" = "skills/check-sentence-by-sentence.md"},
       {"Authoring Skills" = "skills/authoring.md"},
     ]},
     {"Contributing" = [