Background
The paper2astra → lightcone-cli skill migration (lightcone-cli#86) preserves all functionality but surfaced an ingestion gap: the original parse_paper.py produced structured figure/table metadata that the new skill doesn't reproduce, and there's no input-side mechanism for the COMPARE phase to verify reproduction outputs against the paper's claimed numerical results.
This is complementary to #81 (substrate convergence). #81 picks the parser; this issue specifies what comes out of it.
Proposal
astra paper add <DOI> produces, under work/reference/:
paper.pdf       # original
paper.tex       # main source (when arXiv source available)
figures/        # files copied from LaTeX source where available
                # metadata.json: caption, label, page, source-path
                # Docling render-and-crop only as PDF-only fallback (per #81)
tables/         # \begin{table} blocks extracted by label, with caption
findings.yaml   # the paper's own numerical findings (see below)
code/           # cloned reference repo when available (Zenodo / GitHub link)
Every consumer (paper2astra, standalone reproductions, commentary tools) trusts these artifacts blindly. paper2astra stops caring how the paper info was obtained.
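If consumers are going to trust the layout without re-deriving it, a cheap existence check at the end of ingestion may be worth having. The following is a minimal sketch, not the actual `astra paper add` behavior: the function name `missing_artifacts` and the split into required versus optional artifacts are assumptions (the proposal only marks `paper.tex` and `code/` as "when available").

```python
from pathlib import Path

# Assumed split: the proposal marks paper.tex and code/ as "when available",
# so they are treated as optional here. Everything else is treated as required.
REQUIRED = ("paper.pdf", "figures", "figures/metadata.json", "tables", "findings.yaml")
OPTIONAL = ("paper.tex", "code")


def missing_artifacts(reference_dir):
    """Return the required artifacts that are absent under work/reference/."""
    root = Path(reference_dir)
    return [name for name in REQUIRED if not (root / name).exists()]
```

A consumer could call `missing_artifacts("work/reference")` once and refuse to start if the list is non-empty, which keeps the "trust blindly" contract honest.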
The findings artifact — input-side back-pressure
The most interesting piece. Today, COMPARE phases of reproductions rely on agents eyeballing figures and reading prose to decide whether their numerical results match the paper's. Without a structured surface to diff against, agents drift toward plausible-but-wrong values. (Nolan's figure-comparison skill in lightcone-cli#86 helps for figures; check-sentence-by-sentence helps for prose. There's no structured numerical surface yet.)
A structured ledger of the paper's published values would be a complementary forcing function. Example shape:
ASTRA semantics: these are the paper's own findings: (what that paper claims). When a reproduction's astra.yaml references them, it treats them as prior_insights: from its perspective. Same data, different roles — no tautology because the data lives in the ingestion artifact, not in the reproduction's spec.
COMPARE then iterates against a structured ledger:
for finding in findings.yaml:
    locate matching value in reproduction outputs
    diff and log to comparison-report.md
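The loop above could be sketched in Python roughly as follows. This is an illustration under assumptions, not the COMPARE implementation: the `Finding` fields follow the flat schema floated under Open questions, and the `n_sigma` and `rel_tol` thresholds are placeholder values rather than agreed policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """One published value (hypothetical shape, loosely following the flat schema)."""
    name: str
    value: float
    error: Optional[float] = None  # uncertainty quoted by the paper, if any


def compare(findings, outputs, n_sigma=2.0, rel_tol=0.01):
    """Return one (name, status, detail) row per finding.

    `outputs` maps finding names to the reproduction's values.
    """
    rows = []
    for f in findings:
        if f.name not in outputs:
            rows.append((f.name, "missing", "no matching value in reproduction outputs"))
            continue
        got = outputs[f.name]
        diff = abs(got - f.value)
        if f.error:  # use the paper's quoted uncertainty when there is one
            ok = diff <= n_sigma * f.error
        else:  # otherwise fall back to a relative tolerance
            ok = diff <= rel_tol * abs(f.value)
        rows.append((f.name, "match" if ok else "MISMATCH", f"paper={f.value} repro={got}"))
    return rows


def render_report(rows):
    """Format rows as a Markdown table suitable for comparison-report.md."""
    lines = ["| finding | status | detail |", "|---|---|---|"]
    lines += [f"| {n} | {s} | {d} |" for n, s, d in rows]
    return "\n".join(lines)
```

Reporting "missing" as its own status matters here: a reproduction that never computes a published value should not be able to pass by omission.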
This mirrors the output-side LaTeX-macro pattern (\newcommand{\Omegam}{0.315} so the rendered paper can't drift from the analysis): an input-side analogue where the paper's claims are extracted into a structured ledger and the reproduction can't claim convergence without matching them.
Two extraction paths
Author-defined \newcommand macros — regex over the LaTeX source. Free, scriptable. Realistic estimate: most papers don't define these; we'll build our own variable set per-paper.
Inline values like $\Omega_m = 0.315 \pm 0.007$ — agent-driven during ACQUIRE. Imperfect but tractable; iterating on the prompt and validation rules will get most of the way.
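Both paths can be sketched with regular expressions. This is a rough illustration under assumptions: the helper names `extract_numeric_macros` and `parse_inline` are hypothetical, the patterns only cover plain numeric macro bodies and the simple `symbol = value \pm error` inline form, and commented-out LaTeX lines are not filtered. For the inline path the regex is meant as one possible validation rule on an agent-supplied quote, not as a replacement for the agent.

```python
import re

# Matches \newcommand{\Name}{<number>} and the brace-less form \newcommand\Name{<number>}.
# Macros with arguments, or with text or math bodies, are deliberately skipped.
MACRO_RE = re.compile(
    r"\\(?:re)?newcommand\*?\s*\{?\s*\\([A-Za-z]+)\s*\}?\s*"
    r"\{\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*\}"
)

# Matches inline math of the form $<symbol> = <value> \pm <error>$ (error optional).
INLINE_RE = re.compile(
    r"\$\s*([^$=]+?)\s*=\s*([-+]?\d+(?:\.\d+)?)\s*(?:\\pm\s*(\d+(?:\.\d+)?))?\s*\$"
)


def extract_numeric_macros(tex):
    """Path 1: map macro names to their numeric values."""
    return {name: float(value) for name, value in MACRO_RE.findall(tex)}


def parse_inline(quote):
    """Path 2 check: return (symbol, value, error) from a quoted snippet, or None."""
    m = INLINE_RE.search(quote)
    if m is None:
        return None
    symbol, value, error = m.groups()
    return symbol, float(value), float(error) if error else None
```

A validation rule built on `parse_inline` could reject any agent-extracted entry whose stated value does not reappear in its own quote, which catches transcription slips cheaply.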
Open questions
Format for findings. ASTRA-shape ({claim, evidence} entries under findings:) gives downstream consistency: astra paper add only ever emits valid ASTRA YAML, and MySTRA renders paper findings with the same machinery as reproduction findings. A bespoke flat schema ([{name, value, error, source, quote}]) is more concise for COMPARE iteration. Agent-side back-pressure is roughly equivalent either way, so the choice comes down to downstream consistency versus conciseness.
Tables — is metadata enough? A \begin{table} block with caption, by label, is enough for an agent to read. No need for extracted values as separate JSON; the LaTeX source already has the values.
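To make the format trade-off concrete, here is a small sketch that wraps flat entries into {claim, evidence} items. Only the key names come from this issue; how ASTRA actually structures the contents of `claim` and `evidence` is not specified here, so the string-valued claim and the `source`/`quote` evidence dict below are assumptions, as is the function name `to_astra_shape`.

```python
def to_astra_shape(flat_entries):
    """Wrap flat {name, value, error, source, quote} entries as {claim, evidence} items.

    The internal layout of `claim` and `evidence` is assumed for illustration.
    """
    findings = []
    for e in flat_entries:
        claim = f"{e['name']} = {e['value']}"
        if e.get("error") is not None:
            claim += f" +/- {e['error']}"
        findings.append({
            "claim": claim,
            "evidence": {"source": e["source"], "quote": e["quote"]},
        })
    return {"findings": findings}
```

The conversion is lossless in one direction only: going from a prose claim back to a numeric value would need parsing, which is the practical cost of ASTRA-shape for COMPARE iteration.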
Cross-references
lightcone-cli#86: the migration that surfaced this gap; includes Nolan's figure-comparison and check-sentence-by-sentence skills as adjacent forcing functions on different surfaces.
Paper2ASTRA#10: the design doc that motivated the migration.
Suggested labels
area:paper-management, enhancement, discuss-before-doing
Claude on behalf of Cail