Reading-intake friction fixes (6 real-usage items) #7

Open
hele211 wants to merge 9 commits into main from feat/reading-intake-friction-fixes

hele211 commented Jun 14, 2026

Why

Six friction points surfaced while analysing a paper end-to-end with CrickNote. Each was traced to its source, worked through into a design, and fixed test-first. Design spec: docs/superpowers/specs/2026-06-13-reading-intake-friction-fixes-design.md.

What's fixed

| # | Friction | Fix |
| --- | --- | --- |
| #3 | Logs polluted stdout, breaking `json.load` | All log levels → stderr; stdout is now a pure data channel |
| #1 | A 29k-token paper was cut off mid-Results | Token caps 10k/30k → 50k, single source can use the full budget, PDF page limit 20 → 80 |
| #4b | Missing slug reported as `Invalid slug format` | Distinct `slug is required` vs `Invalid slug format: …` |
| #2 | Documented Zotero path made an un-compilable (sourceless) note | Skill points at `ingest_reading_bundle`; `create_reading_note` auto-discovers an existing bundle; `discoverBundle` extracted to a shared module |
| #6 | No way to locate figures in compiled text | `--- page N ---` markers between PDF pages (no content removed) |
| #5 | Had to reproduce full frontmatter just to save analysis | New `vault_write_body`: preserves frontmatter verbatim, agent supplies only the body |

Plus a latent path-coupling bug: zotero_prepare_bundle writes to config.zotero.vault_pdf_dir, but discover/ingest/compile hardcoded Reading/attachments. They only aligned on the default. Now threaded through consistently (resolveAttachmentsDir in build-registry), backward-compatible, with unit + end-to-end integration coverage.

Deferred (see spec §4)

  • Pagination (offset/max_tokens) — the 50k cap fits whole papers; only needed for >50k monsters.
  • zotero_intake one-call wrapper — seams are now painless; it's the biggest/riskiest build for pure convenience.

Testing

  • Test-first throughout (RED→GREEN), atomic commits per item.
  • Full suite 509 → 525 tests, typecheck + build clean at every step.
  • Verified through the compiled CLI: node dist/cli.js tools 2>/dev/null parses as clean JSON; vault_write_body + ingest_reading_bundle registered.

🤖 Generated with Claude Code

hele211 and others added 9 commits June 14, 2026 07:05
Six real-usage friction points in the reading-intake pipeline, verified
against source and resolved through brainstorming:

1. compile cap too low (10k/source, 30k/session) + hidden 20-page limit
2. documented Zotero path creates an un-compilable (sourceless) note
3. logs contaminate stdout JSON
4. zotero_prepare_bundle ergonomics + missing-vs-malformed slug error
5. no body-only / section-level write
6. no page markers to locate figures during Figure Map drafting

Decisions: fix the seams + add a thin zotero_intake wrapper; raise caps
with offset/max_tokens escape hatch; page markers now, noise-stripping
deferred; body-only write; defensive create_reading_note.

Includes a critical-review section: logger test rewrite scope,
discoverBundle extraction, page-marker feasibility, Phase 4 testability,
and a latent vault_pdf_dir path-coupling bug (guarded in Phase 1d).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Non-error logs were written to stdout — the same stream the CLI uses for
its JSON result — so every captured tool call had an `INFO [component]`
line prepended, breaking json.load. Route every level to stderr; stdout
is now exclusively the result channel. The optional log file still
captures all levels.

Tests rewritten to assert stderr for all levels and that stdout is never
touched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
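
The stream split described in the logging commit above can be sketched roughly as follows. This is a minimal illustration, not the project's logger: `createLogger`, its `write` parameter, and the exact line layout are assumptions, with only the `INFO [component]` prefix taken from the commit message.

```typescript
type Level = "debug" | "info" | "warn" | "error";

// Hypothetical line layout, modelled on the `INFO [component]` prefix.
function formatLine(level: Level, component: string, message: string): string {
  return `${level.toUpperCase()} [${component}] ${message}`;
}

// Every level goes through the same writer. In Node, console.error writes to
// stderr, so stdout stays free for the JSON result. The writer is injectable
// so a test can capture lines instead of printing them.
function createLogger(
  component: string,
  write: (line: string) => void = (line) => console.error(line),
) {
  const emit = (level: Level) => (message: string): void => {
    write(formatLine(level, component, message));
  };
  return {
    debug: emit("debug"),
    info: emit("info"),
    warn: emit("warn"),
    error: emit("error"),
  };
}

const log = createLogger("cli");
log.info("listing tools"); // goes to stderr
console.log(JSON.stringify({ ok: true })); // stdout carries only the result
```

With this split, a caller that discards stderr still receives parseable JSON on stdout.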
A 29k-token paper was cut off mid-Results: the per-source cap was 10k
tokens while the session cap was 30k, and pdf-parse only read the first
20 pages. Raise the session budget to 50k, let a single source draw the
full budget (no per-source ceiling below it), and lift the PDF page limit
to 80. Papers that exceed even this are left to pagination (Phase 2).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
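
To make the budget change concrete, here is a small sketch of how a session budget might be shared across sources. `allocateTokens` and its parameters are hypothetical names, and the greedy in-order allocation is an assumption; only the numbers (10k per source, 30k and 50k per session, a 29k-token paper) come from the commit message.

```typescript
// Assumed constant: the session budget after this change.
const SESSION_TOKEN_BUDGET = 50000;

// Given each source's token count in load order, return how many tokens of
// each source are kept. With no per-source cap, one source may take all of
// the remaining session budget.
function allocateTokens(
  sourceTokens: number[],
  sessionBudget: number = SESSION_TOKEN_BUDGET,
  perSourceCap: number = Infinity,
): number[] {
  let remaining = sessionBudget;
  return sourceTokens.map((tokens) => {
    const granted = Math.min(tokens, perSourceCap, remaining);
    remaining -= granted;
    return granted;
  });
}

// Old limits: the 29k-token paper is truncated at the per-source cap.
console.log(allocateTokens([29000], 30000, 10000)); // [ 10000 ]
// New limits: the same paper fits whole.
console.log(allocateTokens([29000])); // [ 29000 ]
```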
zotero_prepare_bundle (and zotero_cleanup_bundle) returned "Invalid slug
format" for a missing slug, which reads as malformed rather than absent.
Add a shared validateSlugArg helper: "slug is required." when empty,
"Invalid slug format: ..." with the offending value otherwise.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
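
A helper along these lines would separate the two failure modes. This is a sketch only: the accepted slug pattern is an assumption (the PR does not state it), and the text after the colon in the second message is a guess at how the offending value is reported.

```typescript
// Assumed slug shape: lowercase alphanumerics separated by single hyphens.
const SLUG_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Returns an error message, or null when the slug is acceptable.
function validateSlugArg(slug: unknown): string | null {
  // Absent, non-string, or blank: the argument is missing, not malformed.
  if (typeof slug !== "string" || slug.trim() === "") {
    return "slug is required.";
  }
  // Present but malformed: echo the offending value back to the caller.
  if (!SLUG_PATTERN.test(slug)) {
    return `Invalid slug format: ${slug}`;
  }
  return null;
}

console.log(validateSlugArg(undefined)); // slug is required.
console.log(validateSlugArg("Bad Slug")); // Invalid slug format: Bad Slug
console.log(validateSlugArg("smith-2024-crispr")); // null
```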
The documented Zotero path (fetch → prepare_bundle → create_reading_note)
produced a note with no `sources:` because create_reading_note only sets
sources when explicitly passed — so compile_reading_note returned
sources_missing. Repoint both the Zotero and local-files paths in the
skill to ingest_reading_bundle, which auto-discovers the bundle and
registers sources.

Also harden create_reading_note: when `sources` is omitted and a bundle
folder exists for the slug, auto-discover its files. Its no-bundle
placeholder capability (a note before any files exist) is preserved.

Extract discoverBundle into src/knowledge/reading-bundle.ts so it is
shared by reading-intake and templates instead of duplicated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
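
The hardened fallback in `create_reading_note` might look roughly like this. All names and types here are hypothetical (`resolveNoteSources`, the `DiscoverBundle` signature, the `origin` field), and the vault is replaced by an in-memory map so the example runs on its own.

```typescript
// Hypothetical signature: file names in the bundle folder for `slug`,
// or null when no such folder exists.
type DiscoverBundle = (slug: string) => string[] | null;

interface NoteSources {
  sources: string[];
  origin: "explicit" | "discovered" | "placeholder";
}

function resolveNoteSources(
  slug: string,
  explicitSources: string[] | undefined,
  discoverBundle: DiscoverBundle,
): NoteSources {
  // Explicitly passed sources always win.
  if (explicitSources !== undefined) {
    return { sources: explicitSources, origin: "explicit" };
  }
  // Sources omitted: fall back to whatever the bundle folder holds.
  const discovered = discoverBundle(slug);
  if (discovered !== null && discovered.length > 0) {
    return { sources: discovered, origin: "discovered" };
  }
  // No bundle yet: keep the placeholder capability (a note before any files).
  return { sources: [], origin: "placeholder" };
}

// In-memory stand-in for the vault, for illustration only.
const bundles: Record<string, string[]> = {
  "smith-2024": ["paper.pdf", "supplement.pdf"],
};
const lookup: DiscoverBundle = (slug) => bundles[slug] ?? null;

console.log(resolveNoteSources("smith-2024", undefined, lookup).origin); // discovered
console.log(resolveNoteSources("no-files-yet", undefined, lookup).origin); // placeholder
```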
Extract PDFs per page (custom pagerender mirroring pdf-parse's default
item-join) and insert `--- page N ---` markers between pages. No content
is removed — this just helps locate figures when drafting the Figure Map.
Skill notes the markers so the agent records which page each figure is on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
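
As an illustration of the per-page extraction, the sketch below shows an item join and a marker join as pure functions. It is not the project's code: `joinItems` follows what pdf-parse's default renderer is understood to do (start a new line when the y position changes), and placing a marker before every page, including the first, is an assumption. In a real setup a function like `joinItems` would be called from a custom `pagerender` callback.

```typescript
// Subset of the text item shape that pdf.js returns from getTextContent().
interface TextItem {
  str: string;
  transform: number[]; // transform[5] holds the item's y position
}

// Concatenate items on the same baseline; a change in y starts a new line.
function joinItems(items: TextItem[]): string {
  let lastY: number | undefined;
  let text = "";
  for (const item of items) {
    const y = item.transform[5];
    text += lastY === undefined || lastY === y ? item.str : "\n" + item.str;
    lastY = y;
  }
  return text;
}

// Prefix each page's text with a 1-based marker. Nothing is removed.
function joinPagesWithMarkers(pages: string[]): string {
  return pages
    .map((text, index) => `--- page ${index + 1} ---\n${text}`)
    .join("\n\n");
}

const page1 = joinItems([
  { str: "Fig", transform: [1, 0, 0, 1, 72, 700] },
  { str: "ure 1", transform: [1, 0, 0, 1, 90, 700] },
  { str: "Caption text", transform: [1, 0, 0, 1, 72, 680] },
]);
console.log(joinPagesWithMarkers([page1, "Results continue here"]));
```

Because the markers are plain text lines, an agent can search for the nearest preceding marker to record which page a figure appears on.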
Filling in a reading note's analysis previously required vault_write with
the entire file, forcing the agent to reproduce the folded-YAML title,
author list, and sources block exactly. vault_write_body matches the
leading frontmatter block verbatim (no re-serialization) and replaces
only the body, so the agent supplies just the sections. Errors if the
file is missing or has no frontmatter. Skill updated to use it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
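
The body-only write can be pictured as a pure string operation. The following is a sketch under assumptions, not the tool's implementation: `replaceBody`, the regular expression, and the single blank line between frontmatter and body are all illustrative choices.

```typescript
// Leading frontmatter: an opening `---` line, any content, a closing `---` line.
const FRONTMATTER = /^---\r?\n[\s\S]*?\r?\n---[ \t]*(?:\r?\n|$)/;

// Keep the frontmatter block byte-for-byte and swap in a new body.
function replaceBody(fileContent: string, newBody: string): string {
  const match = FRONTMATTER.exec(fileContent);
  if (match === null) {
    throw new Error("File has no frontmatter block");
  }
  // No YAML parse and re-serialize, so folded scalars and lists survive as-is.
  const frontmatter = match[0].endsWith("\n") ? match[0] : match[0] + "\n";
  return `${frontmatter}\n${newBody.replace(/^\n+/, "")}`;
}

const note = [
  "---",
  "title: >-",
  "  A long title folded",
  "  across two lines",
  "sources:",
  "  - paper.pdf",
  "---",
  "",
  "Old placeholder body",
  "",
].join("\n");

console.log(replaceBody(note, "## Summary\nNew analysis.\n"));
```

The caller passes only the sections it wrote; the folded title and sources block never pass through the agent.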
Phases 1, 2b, 3 shipped. Pagination (2a) deferred — the 50k cap now fits
whole papers. zotero_intake (4) deferred — seams are fixed so the manual
chain is painless; it's the largest/riskiest build.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
zotero_prepare_bundle writes PDFs to config.zotero.vault_pdf_dir/<slug>,
but discover/ingest/compile hardcoded Reading/attachments/<slug>. They
aligned only because vault_pdf_dir defaults to Reading/attachments, so a
non-default config silently produced sourceless notes.

Thread an attachmentsDir (resolved from config.zotero.vault_pdf_dir in
build-registry) through discoverBundle, loadSources, createKbTools,
createReadingIntakeTools, and createTemplateTools. zotero_prepare_bundle
error messages now reference the actual bundle path. Backward-compatible:
the default dir produces identical paths and messages.

Adds reading-bundle unit tests, a source-loader custom-dir test, and an
end-to-end integration test (ingest discovers + compile loads from a
non-default dir).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
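
The path-coupling fix amounts to resolving the directory once and handing the same value to every tool. A rough sketch, with the caveat that the `Config` shape, the trailing-slash handling, and `createTools` are assumptions; only `resolveAttachmentsDir`, `config.zotero.vault_pdf_dir`, and the `Reading/attachments` default come from the commit message.

```typescript
const DEFAULT_ATTACHMENTS_DIR = "Reading/attachments";

// Assumed config shape; only zotero.vault_pdf_dir is named in the PR.
interface Config {
  zotero?: { vault_pdf_dir?: string };
}

// Resolve once, falling back to the default when nothing is configured.
function resolveAttachmentsDir(config: Config): string {
  const configured = config.zotero?.vault_pdf_dir?.trim();
  return configured ? configured.replace(/\/+$/, "") : DEFAULT_ATTACHMENTS_DIR;
}

function bundleDir(attachmentsDir: string, slug: string): string {
  return `${attachmentsDir}/${slug}`;
}

// Hypothetical registry: writer and readers share one resolved directory,
// so they can no longer drift apart under a non-default config.
function createTools(config: Config) {
  const attachmentsDir = resolveAttachmentsDir(config);
  return {
    prepareBundlePath: (slug: string) => bundleDir(attachmentsDir, slug),
    discoverBundlePath: (slug: string) => bundleDir(attachmentsDir, slug),
  };
}

const tools = createTools({ zotero: { vault_pdf_dir: "Library/pdfs/" } });
console.log(tools.prepareBundlePath("smith-2024")); // Library/pdfs/smith-2024
console.log(tools.discoverBundlePath("smith-2024")); // Library/pdfs/smith-2024
```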