CLI: extend --only {files,sessions[,both?]} to csb list and csb scan #30

@djdarcy

Extend --only {files,sessions[,both?]} flag to csb list and csb scan

Problem

v0.3.5 consolidated csb search's output-mode flags from --files-only + --sessions-only into a single --only {files,sessions} choice. The vocabulary is cleaner and the mutex with --json is now explicit in argparse. But the flag only exists on csb search: csb list and csb scan still produce session-shaped output with no equivalent "collapse to file paths" mode, even though that's a real and useful operation.

Concretely, a user who learns this:

csb search "needle" --only files
# C:\...\.convo_<session-name>_<uuid>_<user>.log
# C:\...\.convo_<other-session>_..._<user>.log

…can't apply the same trick to scan or list:

csb scan -d ./foo --only files          # ERROR: unrecognized arguments: --only
csb list --only files                    # ERROR: unrecognized arguments: --only

So they fall back to ad-hoc shell pipelines or --json | jq parsing to extract paths — workable, but it violates the "learn the vocabulary once, apply it anywhere" principle that CLI design should aim for.

Proposed solution

Extend --only to csb list and csb scan so the same flag does the same thing across all three list-producing commands. Three design options below; the user's "both" idea is option C and is the new contribution worth thinking through.

Option A — Strict vocabulary parity:

csb list --only sessions    # no-op, accepted for vocabulary parity (default behavior)
csb list --only files       # one-line-per-transcript-path output
csb scan -d ./foo --only files

Same flag everywhere. sessions is a no-op on list/scan since their default IS sessions, but it's accepted so users don't have to remember "this command supports the flag, that one doesn't."

Option B — Narrow:

csb list --only files       # one-line-per-transcript-path output
csb list --only sessions    # ERROR: redundant; list is already session-shaped

Only --only files exists on list/scan; the no-op sessions choice is dropped. Every flag does something, but the vocabulary is asymmetric (search has 2 choices, list/scan have 1).

Option C — Add both for session-to-file correlation:

csb list --only both --sort messages
# CLAUDE-SESSION-BACKUP__... 7250ddce-...  (proj)  3073 messages  ->  C:\...\.convo_..._<user>.log
# big-other-session ...                    ->  C:\...\.convo_...
csb scan -d ./foo --only both
csb search "needle" --only both

Each session row is annotated with its transcript path, which is useful for awk/paste-style correlation or for "show me which file backs each session." This answers the question "which file on disk is this session?", which currently requires csb show <uuid> per session, or --json | jq '.[]|"\(.session_id)\t\(.path)"'.

Use cases the extension enables

# Back up soon-to-purge sessions
csb list --only files --sort expiration > tar-me.txt
tar czf backup.tar.gz -T tar-me.txt

# Grep the transcripts of sessions that worked in ./foo
csb scan -d ./foo --only files | xargs grep -l "pattern"

# Find every session whose transcript lives at a weird path (audit naming)
csb list --only files --shortid

# (C only) Correlate session identity with file on disk in one shot
csb list --only both --sort messages

Design considerations

Vocabulary consistency principle. A user who learns --only files on one command should expect the same behavior elsewhere. Today's surface violates this — only csb search has --only. Option A reinforces the principle most cleanly; B compromises slightly; C extends it productively.

Transcript path resolution. Hit.transcript_path and _best_transcript_path() were added in v0.3.5 (preference: convo > sesslog > jsonl, with absolute-path fallback resolved against claude_dir). The same helper is reusable for list/scan — just call it per session row before rendering.
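The preference order described above could be sketched roughly as follows. This is not the actual `_best_transcript_path()` from search.py, whose signature is not shown in this issue: the function name `pick_transcript_path` and the `sources` mapping shape are hypothetical, and the sketch assumes "resolved against claude_dir" means relative paths are joined onto `claude_dir` while absolute paths are kept as-is.

```python
from pathlib import Path
from typing import Mapping, Optional

# Preference order stated in this issue: convo > sesslog > jsonl.
PREFERENCE = ("convo", "sesslog", "jsonl")


def pick_transcript_path(sources: Mapping[str, str], claude_dir: Path) -> Optional[Path]:
    """Return the most preferred transcript path, or None if no source is known.

    `sources` maps a source kind to a path string (hypothetical shape).
    Relative paths are joined onto `claude_dir`; absolute paths are returned unchanged.
    """
    for kind in PREFERENCE:
        raw = sources.get(kind)
        if not raw:
            continue  # this kind is missing for the session, try the next one
        path = Path(raw)
        return path if path.is_absolute() else claude_dir / path
    return None
```

Calling a helper of this shape once per session row, before rendering, would keep list/scan on the same resolution logic as search.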

--sort interaction. --only files on csb list should honor --sort — the output order is whatever csb list would have shown. Same for scan.

--json mutex. csb list already has --json. csb scan does not (today). Whichever of Options A, B, or C is chosen, we'd want a mutex group (--json xor --only) on list, and we'd need to decide whether scan needs --json for symmetry (out of scope for this issue but worth tracking).
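A minimal sketch of the `--json` xor `--only` group on `csb list`, using argparse's mutually exclusive groups. The parser layout here (a `build_parser` function and a bare `list` subparser) is an assumption for illustration; the real cli.py will differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy `csb` parser with only the `list` subcommand (illustrative)."""
    parser = argparse.ArgumentParser(prog="csb")
    sub = parser.add_subparsers(dest="command", required=True)

    p_list = sub.add_parser("list")
    # --json and --only select competing output modes, so argparse
    # rejects the combination with a usage error (exit status 2).
    mode = p_list.add_mutually_exclusive_group()
    mode.add_argument("--json", action="store_true")
    mode.add_argument("--only", choices=("files", "sessions"))
    return parser
```

With this shape, `csb list --json --only files` fails at parse time, so cmd_list never has to handle the conflicting combination itself.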

No-op --only sessions on list/scan (Option A only). Two sub-options for the no-op case:

  • Silently accept and produce normal output (vocabulary parity wins)
  • Raise "redundant flag" error (every-flag-does-something wins)

Silently accepting feels less hostile.

Output format for both (Option C only). Three candidates:

  • Plain TSV: <session_name>\t<session_id>\t<transcript_path> — pipeable, awk-friendly
  • Two-line per row: <existing session header>\n -> <transcript_path> — human-readable
  • Aligned columns: tabular layout with explicit Path column

Probably the TSV form for pipelines + a --shortid-shaped tweak for the human-readable variant. Needs spec work.
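To make the plain-TSV candidate concrete, here is a small sketch of a row formatter. The function name `format_both_row` is hypothetical, and replacing embedded tabs or newlines with spaces is an assumption of this sketch, not something the issue specifies.

```python
def format_both_row(session_name: str, session_id: str, transcript_path: str) -> str:
    """Render one session as three tab-separated fields: name, id, transcript path."""
    fields = (session_name, session_id, transcript_path)
    # A tab or newline inside a field would shift columns for awk/cut,
    # so this sketch flattens them to spaces (assumed policy).
    cleaned = [f.replace("\t", " ").replace("\n", " ") for f in fields]
    return "\t".join(cleaned)
```

Output in this form can be consumed with `awk -F'\t' '{print $3}'` or `cut -f3` to pull out just the path column.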

Performance. Each session lookup adds one session_sources SELECT. For csb list -n 20 that's 20 queries — fine. For large list outputs (--all or -n 10000) we'd want a single JOIN-based query or per-batch lookup. Worth measuring.
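One way the per-batch lookup might look, assuming a SQLite backend and a `session_sources(session_id, kind, path)` table. Both the backend and the column names are assumptions; only the table name appears in this issue.

```python
import sqlite3
from collections import defaultdict
from typing import Dict, Iterable, List


def fetch_sources(conn: sqlite3.Connection, session_ids: Iterable[str],
                  batch_size: int = 500) -> Dict[str, Dict[str, str]]:
    """Map session_id -> {kind: path} with one query per batch, not per session."""
    ids: List[str] = list(session_ids)
    found: Dict[str, Dict[str, str]] = defaultdict(dict)
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        # One "?" placeholder per id keeps the query parameterized.
        placeholders = ",".join("?" for _ in chunk)
        query = (
            "SELECT session_id, kind, path FROM session_sources "
            f"WHERE session_id IN ({placeholders})"
        )
        for session_id, kind, path in conn.execute(query, chunk):
            found[session_id][kind] = path
    return dict(found)
```

For -n 10000 this would issue 20 queries at the default batch size instead of 10000. Whether that beats a single JOIN in the main listing query is still worth measuring.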

Implementation approach

Phased:

Phase 1 — Option A (strict parity), no both:

  • cli.py: add --only {files,sessions} to p_list and p_scan, mutex with existing --json where present.
  • commands.py: cmd_list and cmd_scan check args.only; if files, resolve transcript_path per row and bypass the table renderer.
  • timeline.py and scan's renderer: add a "files only" output mode that prints transcript_path one per line.
  • Reuse _best_transcript_path from search.py (consider hoisting it to a shared module).
  • Tests: parser-level (flag accepted, mutex enforced) + integration (output is one path per session, order matches --sort).
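The "files only" output mode from the list above could be as small as the following sketch. The name `iter_file_lines` and the `(session_id, transcript_path)` row shape are hypothetical, and skipping sessions with no resolvable transcript is an assumption; the issue does not say how those should be handled.

```python
from typing import Iterable, Iterator, Optional, Tuple


def iter_file_lines(rows: Iterable[Tuple[str, Optional[str]]]) -> Iterator[str]:
    """Yield one transcript path per session row, preserving the incoming order.

    `rows` is assumed to be already ordered by whatever --sort requested;
    each item is (session_id, transcript_path or None).
    """
    for _session_id, transcript_path in rows:
        if transcript_path is None:
            # No resolvable transcript for this session: skipped in this sketch.
            continue
        yield transcript_path
```

A caller such as cmd_list would then do `for line in iter_file_lines(rows): print(line)` in place of the table renderer. Because the function never reorders its input, the --sort interaction falls out for free.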

Phase 2 — Option C (both mode):

  • Add both to the choices list for --only on all three commands.
  • Design the output format. Recommend TSV by default, with a --tabular modifier (--only both --tabular) if humans need it pretty-printed.
  • Update the existing renderers for the human and sessions modes in search; write new rendering code for list/scan.
  • Tests including --sort interaction.

Phase 1 ships the consistency win cheaply. Phase 2 ships the new feature once we've used Phase 1 enough to know what both's output should feel like.

Acceptance criteria

  • csb list --only files produces one transcript path per session, in --sort order
  • csb scan -d <path> --only files produces one transcript path per matching session
  • csb list --only files and csb list --json are mutually exclusive (argparse error if both)
  • --only files reuses _best_transcript_path (no duplicate resolution logic)
  • csb list --only sessions either silently equals the default OR raises a clear error (decision logged in PR)
  • Existing csb search --only {files,sessions} behavior unchanged
  • Tests cover argparse parsing, output format, --sort interaction, and the mutex
  • Hand-runnable checklist documents the new modes for all three commands
  • CHANGELOG entry describes the consistency change + use cases
  • (Phase 2 only) --only both output format documented in csb <cmd> --help and in a hand checklist

Related issues

Analysis

Deferred from the v0.3.5 patch on 2026-05-20 to avoid scope creep. See the v0.3.5 commit and v0.3.5__Feature__search-directory-scope-and-min-strength.md checklist for the existing --only shape on csb search. The Hit.transcript_path resolution helper (_best_transcript_path) in search.py is the reusable piece for any extension.
