roadmap A–I: logic + agent tools + GUI + firing (TDD, CI-green) by caezium · Pull Request #80 · caezium/Burrow

caezium · 2026-06-15T16:35:43Z

Tracks A, B, C, D, F, H, I of feature-roadmap-2026-06-10, TDD'd, every commit CI-green. Logic + agent surface complete; most features have a working GUI pane and/or live firing; a small surgical/native/gated tail remains (listed last).

Done end-to-end (logic + agent + GUI/firing, all green)

A.2 anomaly — rules + AnomalyScan over history, surfaced as a Changes section in the report.
A.3 forecast — DiskForecast + burrow_disk_forecast + the forecast in the report; growth-attribution diff (FolderGrowth).
A.4 report — composer + ReportComposer + burrow_report + Report Home section.
B.5 audit — mutating MCP tools write burrow.agent_audit rows.
B.7 prometheus — /metrics?format=prometheus.
B.8 diff — burrow_diff v1 (processes + disk).
C.9 dev hygiene — pane (stage-1 scan + stage-2 Clear-to-Trash).
C.10 ports — native PortEnumerator + burrow_ports + Ports Tool pane (kill).
C.11 git sweep — GitRepoStatus parser + GitSweep repo-walk + bounded git runner.
D.12 alerts — new-LaunchAgent watcher fires; CPU/mem threshold alerts fire off the live snapshot stream.
D.13 restore — RestorePlan + deletion-log parse + Restore Tool pane (~/.Trash put-back).
D.14 backup — BackupStatus/tmutil → Backups Doctor check.
F Tune-Up (#77, "Tune-Up: persistent one-tap run-all dashboard (deferred from 0.7.2)") — Tune-Up Tool pane (safe set + run).
I doctor — Doctor + burrow_doctor + Doctor Home section.

Remaining tail (surgery on shipped views / native / gated)

A.1 spike drag-select — Swift Charts gesture surgery on HistoryView (regression risk).
A.3 disk-tile annotation — needs db threaded into DiskCard (StatusView surgery).
B.6 SSE /events — gated on query-server auth (security).
B.8 full inventory diff — apps/login/port inventory persistence.
C.11 purge badges — MoInteractive checklist surgery.
D.14 SMART — IOKit NVMe (fragile native, real-hardware only).
H brew streaming — UpdatesView surgery; busy badge gated on a mo signal that doesn't exist.

Each remaining item has its tested seam already in (logic done); what's left is the view-surgery/native/gated shell.

…s (B7) Render the latest snapshot as Prometheus text exposition so a dev with Grafana can scrape their own Mac in minutes. New pure MetricsPrometheus.exposition: gauges for CPU/memory/health/uptime/battery, per-disk and per-interface labeled series, GPU omitted when unavailable (-1 on Apple Silicon), and Prometheus label-value escaping. QueryServer.route now returns (body, contentType): every route stays JSON except the new endpoint, which serves text/plain; version=0.0.4 (the de-facto scrape content type). A missing or undecodable snapshot yields a comment line rather than error JSON, so scrapers tolerate an empty target. Tests: 4 formatter cases + 2 route cases; existing route tests migrated to the (body, contentType) tuple.
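
A minimal sketch of the label escaping and gauge rendering described above. It is not the project's MetricsPrometheus code: the function names, the metric names (`burrow_cpu_percent`, `burrow_disk_free_bytes`), and the input shape are assumptions made for illustration. Only the escaping rules, the content type, and the comment-line fallback come from the description.

```swift
// Content type named in the commit message.
let prometheusContentType = "text/plain; version=0.0.4"

/// Escapes a label value for the Prometheus text format:
/// backslash, double quote, and newline are escaped.
func escapeLabelValue(_ value: String) -> String {
    var out = ""
    for scalar in value.unicodeScalars {
        switch scalar {
        case "\\": out += "\\\\"
        case "\"": out += "\\\""
        case "\n": out += "\\n"
        default: out.unicodeScalars.append(scalar)
        }
    }
    return out
}

/// Renders one sample line; labels are sorted so output is stable.
func gaugeLine(name: String, labels: [String: String] = [:], value: Double) -> String {
    guard !labels.isEmpty else { return "\(name) \(value)" }
    let rendered = labels
        .sorted { $0.key < $1.key }
        .map { pair in "\(pair.key)=\"\(escapeLabelValue(pair.value))\"" }
        .joined(separator: ",")
    return "\(name){\(rendered)} \(value)"
}

/// Hypothetical entry point: a missing snapshot (nil CPU) yields a
/// comment line rather than an error body.
func exposition(cpuPercent: Double?, diskFreeBytes: [String: Double]) -> String {
    guard let cpu = cpuPercent else { return "# no snapshot available\n" }
    var lines = ["# TYPE burrow_cpu_percent gauge",
                 gaugeLine(name: "burrow_cpu_percent", value: cpu),
                 "# TYPE burrow_disk_free_bytes gauge"]
    for mount in diskFreeBytes.keys.sorted() {
        lines.append(gaugeLine(name: "burrow_disk_free_bytes",
                               labels: ["mount": mount],
                               value: diskFreeBytes[mount]!))
    }
    return lines.joined(separator: "\n") + "\n"
}
```

Sorting labels and mounts keeps the output deterministic, which makes a pure formatter like this straightforward to unit-test.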

Pure (ts, free-bytes) series → projection. Honest by construction: no date under a week of history, none when free space is flat or growing, and none past a 5-year horizon. Robust to single-sample cliffs (a temp file briefly eating space) via a Theil–Sen median slope rather than least-squares, which one outlier can drag arbitrarily far. Tracer-bullet core for roadmap A.3; the MCP tool + Home/History tile wiring follow as their own slices.
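
The projection rules above can be sketched roughly as follows. This is an illustration, not DiskForecast itself: the function names and the tuple shape are assumptions, and the input is assumed to be sorted by time with timestamps in seconds.

```swift
/// Median of a non-empty array.
func median(_ xs: [Double]) -> Double {
    let s = xs.sorted()
    let mid = s.count / 2
    return s.count % 2 == 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2
}

/// Theil–Sen estimator: the median of the slopes over all point pairs.
/// A single outlier only affects the pairs it takes part in, so it
/// cannot drag the median the way it drags a least-squares fit.
func theilSenSlope(_ points: [(t: Double, free: Double)]) -> Double? {
    var slopes: [Double] = []
    for i in points.indices {
        for j in points.indices where j > i {
            let dt = points[j].t - points[i].t
            if dt != 0 { slopes.append((points[j].free - points[i].free) / dt) }
        }
    }
    return slopes.isEmpty ? nil : median(slopes)
}

/// Seconds until free space reaches zero, or nil when no honest date
/// exists: under a week of history, flat or growing free space, or a
/// projection past the 5-year horizon.
func secondsUntilFull(_ points: [(t: Double, free: Double)]) -> Double? {
    let week = 7.0 * 86_400
    let horizon = 5.0 * 365 * 86_400
    guard let first = points.first, let last = points.last,
          last.t - first.t >= week,
          let slope = theilSenSlope(points), slope < 0 else { return nil }
    let eta = last.free / -slope
    return eta <= horizon ? eta : nil
}
```

The all-pairs slope list is quadratic in the number of samples, which is acceptable for a bounded history window but would need sampling for very long series.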

Pure stats: percentile/median primitives, a sustained-CPU-exceedance rule (recent median clears baseline p95 + a minimum effect size so near-idle noise can't trip it), and battery-drain regression (OLS %/hr per session, flagged when a recent discharge runs >=factor x the baseline median). The Maintenance pass writing burrow.findings + the Home card are integration.
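
As a rough sketch of the sustained-CPU rule described above: the recent median has to clear the baseline p95 plus a minimum effect size. The nearest-rank percentile, the function names, and the parameter names are assumptions for illustration. The actual primitives may interpolate differently.

```swift
/// Nearest-rank percentile of a non-empty sample, p in 0...100.
/// (An assumed definition; for even counts the 50th percentile is the
/// lower of the two middle values.)
func percentile(_ xs: [Double], _ p: Double) -> Double {
    let s = xs.sorted()
    let rank = Int((p * Double(s.count) / 100).rounded(.up))
    return s[min(max(rank, 1), s.count) - 1]
}

/// True when the recent window sits clearly above the baseline:
/// recent median > baseline p95 + minEffect.
func sustainedExceedance(recent: [Double], baseline: [Double], minEffect: Double) -> Bool {
    guard !recent.isEmpty, !baseline.isEmpty else { return false }
    return percentile(recent, 50) > percentile(baseline, 95) + minEffect
}
```

The `minEffect` term is what stops a near-idle process from tripping the rule: going from 0.1% to 0.3% CPU clears the p95 but not the effect-size floor.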

Parses dirty/ahead/upstream/detached into a needs-attention verdict so the purge checklist can badge repos with uncommitted or unpushed work. Conservative: untracked files count, and a never-pushed branch is unpushed. Running git (timeout + concurrency cap) and badging the UI are integration.

Pure value-folding step(): fire on crossing high, re-arm only after recovery below low, and a cooldown caps fires across episodes — one alert per episode, not per sample. Shared by notifications (D.12), SSE (B.6), the report (A.4). Rule thresholds, Sampler wiring, and UNUserNotification delivery are integration.
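
One way to picture the fold described above, as a sketch rather than the real AlertEngine: the `AlertState` fields, the `AlertRule` type, and the `step` signature here are assumed shapes, and timestamps are plain seconds.

```swift
/// Per-rule state carried between samples (assumed shape).
struct AlertState: Equatable {
    var armed = true
    var lastFired: Double? = nil
}

/// Hysteresis band plus a cooldown, in seconds (assumed shape).
struct AlertRule {
    let high: Double
    let low: Double
    let cooldown: Double
}

/// Pure step: returns the next state and whether to fire for this sample.
func step(_ state: AlertState, value: Double, at now: Double,
          rule: AlertRule) -> (state: AlertState, fire: Bool) {
    var next = state
    if state.armed, value >= rule.high {
        // Crossing high starts an episode; disarm until recovery.
        next.armed = false
        if let last = state.lastFired, now - last < rule.cooldown {
            return (next, false)  // episode swallowed by the cooldown
        }
        next.lastFired = now
        return (next, true)
    }
    // Re-arm only after the value recovers below the low threshold.
    if !state.armed, value < rule.low { next.armed = true }
    return (next, false)
}
```

Because the function takes the prior state and returns the next one, the same fold can back notifications, the event stream, and the report without any shared mutable state.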

Pure facts -> markdown (cleanup freed, disk forecast phrased in weeks, top energy, battery delta, new-startup-items security note). Same artifact for the Home card and a burrow_report tool. Reuses Fmt.bytes; only names a fill date when DiskForecast was willing to. DB gathering is integration.

Pure: latest-backup + local-snapshot date tokens from tmutil, token->UTC Date. Backs 'Last backup: N days ago' in Clean/Purge sheets + the purgeable note. SMART/IOKit disk-health half of D.14 lands separately; tmutil invocation is integration.
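
A hedged sketch of the token-to-UTC-Date step. It assumes the date token has the shape `YYYY-MM-DD-HHMMSS` and should be read as UTC. The function names are hypothetical and the real parser may accept other shapes.

```swift
import Foundation

/// Pulls the first YYYY-MM-DD-HHMMSS token out of a line of output.
func firstBackupToken(in line: String) -> String? {
    guard let range = line.range(of: #"\d{4}-\d{2}-\d{2}-\d{6}"#,
                                 options: .regularExpression) else { return nil }
    return String(line[range])
}

/// Converts a token to a Date, interpreting it as UTC.
func backupDate(fromToken token: String) -> Date? {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")  // fixed-format parsing
    formatter.timeZone = TimeZone(identifier: "UTC")
    formatter.dateFormat = "yyyy-MM-dd-HHmmss"
    return formatter.date(from: token)
}

/// Whole days between the backup and `now`, for a "N days ago" label.
func daysAgo(_ backup: Date, now: Date) -> Int {
    Int(now.timeIntervalSince(backup) / 86_400)
}
```

Pinning the locale to `en_US_POSIX` keeps the fixed format from being reinterpreted under the user's regional settings.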

…(A.3) Adds MetricsStore.diskFreeSeries (total-used per snapshot for a mount, or the largest volume) and a burrow_disk_forecast tool that runs DiskForecast over it — agents get an honest 'fills in ~N days' (or null) without the GUI. Integration test seeds declining/flat disks through the real dispatch path.

Diffs top-process membership (via InventoryDiff) and free-space delta between the snapshot nearest `since` and the latest. v1 scope: app/login-item/port inventories aren't persisted across time yet, so they're out — noted in the reply. Integration-tested through the dispatch path.

Composes WeeklyReport from real snapshot data (disk forecast + top energy by CPU-seconds); cleanup/battery/login sections are nil='unavailable' until their sources land, never faked. spaceReclaimedBytes is now optional so 'unknown' is distinct from a genuine zero. Returns markdown as the tool text.

Composes Doctor.report from on-hand signals: mo presence (MoEngine.availability), Full Disk Access (Privacy), and memory-pressure + disk headroom from the latest snapshot, plus decode-drift count. Returns ok/warn/fail checks as JSON. Integration-tested on the snapshot-derived (deterministic) verdicts.

Lists each ecosystem's (DevHygiene.catalog) existing cache roots with size (off-main recursive scan), biggest first, + reveal-in-Finder. Stage-2 per-item delete is a follow-up. zh added. Compile-verified; needs hand-test. NOTE: Home now has 6 segments — nav may want restructuring (your call).

Per-pid fd scan → socket fdinfo → TCP-listen + bound-UDP ports, feeding PortInspector (sort + kill-safety). No lsof, no elevation for own processes. NOTE: native C-interop, runtime-unverifiable in CI; hand-test vs lsof -i -P.

StartupWatcher.check folds a scanLive() against a persisted baseline (Store.startupBaselineJSON), first-run-suppressing. BurrowNotifier.checkStartupWatcher runs it on the existing reminder timer (default-on Store.watchStartupItems, independent of smart reminders) and posts once per appearance. The fold is unit-tested; firing is compile-verified (hand-test the notification).

MetricsStore.processCPUSamples (name → CPU%/snapshot) feeds AnomalyScan.cpuFindings, which applies the Anomaly rule per process (recent vs baseline window) and ranks regressions. Pure + tested; the Maintenance pass writing burrow.findings + the Home Changes card are integration.

ThresholdAlerts folds CPU/memory through AlertEngine (hysteresis + cooldown, one fire per episode) given prior per-rule states. Pure + tested. Per-sample evaluation (Sampler), state persistence, and posting are integration; disk-low stays in ReminderRules.

ThresholdMonitor evaluates ThresholdAlerts on each LiveFeed.applySnapshot, holding per-rule AlertState and posting via BurrowNotifier.thresholdAlert. Off-by-default (Store.thresholdAlertsEnabled), inert under XCTest. Completes D.12's threshold half (watcher half already fires).

call() now dispatches then records a burrow.agent_audit row for the mutating tools (clean/optimize/uninstall/purge/installer): tool, dry-run flag, duration, ok, summary, args. Read tools aren't audited. db.insert is serialized + busy-timed, safe alongside the GUI writer. Tested through dispatch.

…eport (A.2) ThresholdMonitor posts via Task { @MainActor } (thresholdAlert is main-actor isolated). AnomalyScan.scan pulls recent-24h vs prior-14d per-process CPU; ReportComposer folds findings into the report's new Changes section. Tested.

findRangeSampled buckets by (ts-since)/stride; 60s-apart rows collapsed to one sample, below the 5-sample minimum. Spread baseline 2h / recent 10min apart so 8 survive each window. Feature was correct; the fixture clustered.
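
The fixture problem above is easy to reproduce with a small sketch of stride bucketing. This is not findRangeSampled itself: the function name, the keep-first-per-bucket choice, and the one-hour stride are assumptions for illustration.

```swift
/// Keeps the first row in each (ts - since) / stride bucket.
func sampled(_ timestamps: [Int], since: Int, stride: Int) -> [Int] {
    var seenBuckets = Set<Int>()
    return timestamps.sorted().filter { ts in
        ts >= since && seenBuckets.insert((ts - since) / stride).inserted
    }
}

// Eight rows 60 s apart all land in bucket 0 and collapse to one sample.
let clustered = (0..<8).map { 1_000 + $0 * 60 }
let clusteredCount = sampled(clustered, since: 1_000, stride: 3_600).count

// Eight rows 2 h apart land in distinct buckets and all survive.
let spread = (0..<8).map { 1_000 + $0 * 7_200 }
let spreadCount = sampled(spread, since: 1_000, stride: 3_600).count
```

With a clustered fixture the sampler returns fewer rows than a minimum-sample guard requires, so the rule under test never runs even though it is correct.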

RestorePlan.parseLog turns Mole's deletion log into restore candidates; RestoreView builds the plan and offers a ~/.Trash put-back per Trash-based item (caches shown locked, permanent). New .restore Tool. parseLog tested; the Trash move is compile-verified, needs hand-test.

…(C.11) FolderGrowth.diff ranks per-folder size growth between two scans. GitSweep walks up to the containing repo and runs git status (bounded) → GitRepoStatus verdict. Diff + walk-up tested; the mo-analyze scan, git subprocess, and purge-checklist badges are integration.

BackupStatus runs tmutil latestbackup → days-ago via the tested TimeMachine parser; Doctor gains a Backups check (ok / stale-warn / no-backup-warn), surfaced in the Doctor pane + burrow_doctor. SMART/IOKit half is separate.

Additive chartOverlay on the line chart maps a drag to a [t0,t1] window and opens SpikeSheet, which ranks processes via the tested MetricsStore.processWindow. Overlay-only — doesn't touch the chart's marks/axes. zh added. Compile-verified.

DiskHealth reads the NVMe SMART verdict from system_profiler (no private API); Doctor gains a Disk-health check (verified=ok / failing=fail / unreadable=ok). Surfaced in the Doctor pane + burrow_doctor. wear%/temp still need IOKit.

EventHub holds open event-stream connections and fans out SSE frames (dead ones self-evict). QueryServer serves GET /events?token=… (Store.queryAuthToken; loopback-only server), exempting the stream from the idle-cancel timeout. Threshold + new-startup-item alerts broadcast onto it. Token parse unit-tested.
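
For reference, a Server-Sent Events frame is a few `field: value` lines ended by a blank line. The sketch below shows that wire format only. The function name and the event names are hypothetical, and this is not the project's frame type.

```swift
/// Builds one SSE frame. Each line of `data` gets its own "data:" field,
/// and the trailing blank line dispatches the event to the client.
func sseFrame(event: String, data: String) -> String {
    var frame = "event: \(event)\n"
    for line in data.split(separator: "\n", omittingEmptySubsequences: false) {
        frame += "data: \(line)\n"
    }
    return frame + "\n"
}
```

A client can then follow the stream with something like `curl -N` against the `/events` endpoint, passing the token query parameter described above.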

caezium added 20 commits June 15, 2026 09:34

feat(diff): set-membership inventory diff (B.8 + D.12)

d8fc3e6

Shared kernel for burrow_diff(since:) and the new-LaunchAgent watcher: sorted, de-duplicated added/removed over two identifier lists (apps, login items, LaunchAgents, ports, top-process membership).
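
A set-membership diff of this kind can be sketched in a few lines. The type and function names below are illustrative, not the project's InventoryDiff API.

```swift
/// Result of comparing two identifier lists.
struct InventoryChange: Equatable {
    let added: [String]
    let removed: [String]
}

/// Sorted, de-duplicated added/removed between two identifier lists.
func inventoryDiff(previous: [String], current: [String]) -> InventoryChange {
    let before = Set(previous)
    let after = Set(current)
    return InventoryChange(added: after.subtracting(before).sorted(),
                           removed: before.subtracting(after).sorted())
}
```

Going through `Set` handles de-duplication, and sorting the output makes the result stable enough to compare directly in tests.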

feat(doctor): diagnostics verdict composer (roadmap I)

5088ef6

System facts -> ranked ok/warn/fail checks (engine, Full Disk Access, memory pressure, disk headroom, recent errors). Level.rawValue orders worst-first. Gathering the facts + the Help-menu sheet are integration.
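
A small sketch of worst-first ranking driven by a raw value. The specific raw values, the `Check` shape, and the name tie-break are assumptions. The commit only states that Level.rawValue orders worst-first.

```swift
/// Severity with raw values chosen so that sorting ascending puts the
/// worst level first (assumed numbering).
enum Level: Int, Comparable {
    case fail = 0, warn = 1, ok = 2
    static func < (a: Level, b: Level) -> Bool { a.rawValue < b.rawValue }
}

struct Check {
    let name: String
    let level: Level
}

/// Worst checks first; ties broken by name for stable output.
func ranked(_ checks: [Check]) -> [Check] {
    checks.sorted { ($0.level, $0.name) < ($1.level, $1.name) }
}
```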

feat(restore): restore-last-cleanup planning (D.13)

37deee0

Pure: recorded removals + a filesystem probe -> per-item restorable verdict. Only Trash-based removals are recoverable; a re-created origin is a collision, not an overwrite. Finder put-back + UI are integration.
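
The verdict logic reads roughly like the sketch below. The enum cases and parameter names are assumptions for illustration, and the two booleans stand in for the filesystem probe.

```swift
enum RemovalKind { case trashed, permanent }

enum RestoreVerdict: Equatable {
    case restorable     // still in the Trash and the origin path is free
    case collision      // origin was re-created; restoring must not overwrite it
    case missing        // recorded as trashed but no longer in the Trash
    case unrecoverable  // removed permanently, nothing to put back
}

/// Pure verdict from a recorded removal plus probe results.
func verdict(kind: RemovalKind, inTrash: Bool, originExists: Bool) -> RestoreVerdict {
    guard kind == .trashed else { return .unrecoverable }
    guard inTrash else { return .missing }
    return originExists ? .collision : .restorable
}
```

Keeping the probe results as plain inputs is what lets the planning step be tested without touching the real Trash.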

feat(audit): agent-action audit record + encoding (B.5)

c8a5765

feat(ports): listening-port model + kill-safety rule (C.10)

a62356d

feat(tuneup): safe-set selection for one-click Tune-Up (#77)

592b457

feat(updates): brew-upgrade progress line parser (H)

87c7306

feat(hygiene): dev-ecosystem cache catalog (C.9)

9488536

test(forecast): fix snapshot fixture date format so rows decode

241f0d4

collected_at needs fractional seconds to match mo's ISO8601 decoder; without them the seeded rows dropped and the forecast saw an empty series. Also surface the tool JSON in the unwrap message for faster diagnosis.
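
The failure mode can be reproduced with Foundation's `ISO8601DateFormatter`, assuming the decoder behaves like a formatter configured with `.withFractionalSeconds` (an assumption; the commit does not say which decoder mo uses). Such a formatter rejects timestamps that lack the fractional part.

```swift
import Foundation

let strict = ISO8601DateFormatter()
strict.formatOptions = [.withInternetDateTime, .withFractionalSeconds]

// No fractional seconds: the strict formatter returns nil, so a row
// seeded with this value would fail to decode.
let withoutFraction = strict.date(from: "2026-06-15T16:35:43Z")

// With fractional seconds the same instant parses.
let withFraction = strict.date(from: "2026-06-15T16:35:43.000Z")
```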

caezium changed the title ~~feat(query): Prometheus exposition at /metrics?format=prometheus (B7)~~ roadmap A–I: logic cores + agent-surface tools (TDD, CI-green) Jun 15, 2026

caezium added 8 commits June 15, 2026 11:05

feat(watcher): new-persistence-item detection (D.12)

bafeafe

StartupWatcher.newlyAppeared diffs a previous inventory against the current StartupInventory scan (via InventoryDiff) → the 'new LaunchAgent/login item appeared' items. Baseline persistence + notification firing are integration.

refactor(report): shared ReportComposer.gather for GUI + MCP (A.4)

00d4c0d

One history→Input gather behind both burrow_report and the Home card, so the two can't drift. burrow_report now calls it. Tested by seeding snapshots.

feat(report): Report section on Home (A.4 GUI)

7b49813

New Home segment rendering the weekly digest from ReportComposer. zh-Hans/Hant 'Report' added. Compile-verified; needs hand-test against a real history DB.

feat(doctor): Diagnostics section on Home (roadmap I GUI)

4014ff6

Renders Doctor.report (same verdicts as burrow_doctor) from the latest snapshot + live FDA/engine checks, ok/warn/fail with colour. zh added. Compile-verified; needs hand-test.

feat(ports): burrow_ports MCP tool over native enumeration (C.10)

6b07fc3

Lists listening sockets + owning process via PortEnumerator/PortInspector. Read-only. Shape-tested through dispatch (specific ports are machine-dependent).

feat(ports): Ports inspector pane as a Tool (C.10 GUI)

e7a74f1

New .ports Tool (top-nav pill) rendering PortEnumerator.listening() with a confirm-gated Quit (SIGTERM) on the user's own processes only. zh added. Compile-verified; needs hand-test (native enum + real kill).

caezium changed the title ~~roadmap A–I: logic cores + agent-surface tools (TDD, CI-green)~~ roadmap A–I: logic cores + agent tools + initial GUI (TDD, CI-green) Jun 15, 2026

caezium added 12 commits June 15, 2026 11:49

feat(hygiene): stage-2 per-item Clear-to-Trash (C.9)

7e2a2e2

Confirm-gated Move-to-Trash on each Dev Hygiene cache row, then re-scan. Compile-verified; needs hand-test (real Trash move).

feat(tuneup): Tune-Up Tool pane (#77)

8e6ec65

New .tuneup Tool gathering safe (cache-clear) + review (startup) recommendations via TuneUp's tested selection logic, with a one-click Run-safe-set (trashes caches). Shared DevHygiene.directorySize. zh added. Compile-verified; hand-test.

caezium changed the title ~~roadmap A–I: logic cores + agent tools + initial GUI (TDD, CI-green)~~ roadmap A–I: logic + agent tools + GUI + firing (TDD, CI-green) Jun 16, 2026

caezium added 9 commits June 15, 2026 21:05

feat(forecast): disk-tile 'Full in ~N' annotation (A.3)

cbc8301

DiskCard takes an optional db and annotates the Status disk tile with the DiskForecast over 30d of free-space history (silent unless a date is honest). Exposed StatusModel.db. Additive line; zh added.

feat(diff): login-item churn in burrow_diff (B.8)

f205704

Maintenance persists a time-indexed login-item/LaunchAgent inventory (burrow.startup_inv); burrow_diff now reports login_items_added/removed since <time> via InventoryDiff. Tested through dispatch. Apps/ports still untracked.

fix(forecast): DiskCard arg order (db follows minHeight in memberwise init)

d48518f

feat(updates): live brew-upgrade progress (H)

6263d69

runBrewStreaming pipes brew output line-by-line; upgrade()/upgradeAll() feed each line through BrewProgress.phrase and publish brewPhrase, shown live in the row instead of a bare spinner. NOTE (hand-test): exercises a real brew upgrade.

feat(purge): git purge-safety badge on selection rows (C.11)

03783ae

MoItemRow runs GitSweep (repo-walk + git status, off-main) per row and shows a warning badge when the candidate's folder has uncommitted/unpushed work. Read-only — never changes the selection. zh added.

fix(sse): rename SSE → SSEFrame (avoid AppKit-scope symbol clash)

20637b2

'SSE' resolved as an out-of-scope symbol in Notifications.swift (imports AppKit); EventHub/QueryServer (no AppKit) were fine. Renamed the type to disambiguate.

caezium wants to merge 49 commits into main from feat/intelligence-agent-surface