Skip to content

roadmap A–I: logic + agent tools + GUI + firing (TDD, CI-green)#80

Draft
caezium wants to merge 49 commits into
mainfrom
feat/intelligence-agent-surface
Draft

roadmap A–I: logic + agent tools + GUI + firing (TDD, CI-green)#80
caezium wants to merge 49 commits into
mainfrom
feat/intelligence-agent-surface

Conversation

@caezium

@caezium caezium commented Jun 15, 2026

Copy link
Copy Markdown
Owner

Tracks A, B, C, D, F, H, I of feature-roadmap-2026-06-10, TDD'd, every commit CI-green. Logic + agent surface complete; most features have a working GUI pane and/or live firing; a small surgical/native/gated tail remains (listed last).

Done end-to-end (logic + agent + GUI/firing, all green)

  • A.2 anomaly — rules + AnomalyScan over history, surfaced as a Changes section in the report.
  • A.3 forecastDiskForecast + burrow_disk_forecast + in the report; growth-attribution diff (FolderGrowth).
  • A.4 report — composer + ReportComposer + burrow_report + Report Home section.
  • B.5 audit — mutating MCP tools write burrow.agent_audit rows.
  • B.7 prometheus/metrics?format=prometheus.
  • B.8 diffburrow_diff v1 (processes + disk).
  • C.9 dev hygiene — pane (stage-1 scan + stage-2 Clear-to-Trash).
  • C.10 ports — native PortEnumerator + burrow_ports + Ports Tool pane (kill).
  • C.11 git sweepGitRepoStatus parser + GitSweep repo-walk + bounded git runner.
  • D.12 alerts — new-LaunchAgent watcher fires; CPU/mem threshold alerts fire off the live snapshot stream.
  • D.13 restoreRestorePlan + deletion-log parse + Restore Tool pane (~/.Trash put-back).
  • D.14 backupBackupStatus/tmutil → Backups Doctor check.
  • F Tune-Up: persistent one-tap run-all dashboard (deferred from 0.7.2) #77 Tune-UpTune-Up Tool pane (safe set + run).
  • I doctorDoctor + burrow_doctor + Doctor Home section.

Remaining tail (surgery on shipped views / native / gated)

  • A.1 spike drag-select — Swift Charts gesture surgery on HistoryView (regression risk).
  • A.3 disk-tile annotation — needs db threaded into DiskCard (StatusView surgery).
  • B.6 SSE /events — gated on query-server auth (security).
  • B.8 full inventory diff — apps/login/port inventory persistence.
  • C.11 purge badges — MoInteractive checklist surgery.
  • D.14 SMART — IOKit NVMe (fragile native, real-hardware only).
  • H brew streaming — UpdatesView surgery; busy badge gated on a mo signal that doesn't exist.

Each remaining item has its tested seam already in (logic done); what's left is the view-surgery/native/gated shell.

caezium added 20 commits June 15, 2026 09:34
…s (B7)

Render the latest snapshot as Prometheus text exposition so a dev with
Grafana can scrape their own Mac in minutes. New pure
MetricsPrometheus.exposition: gauges for CPU/memory/health/uptime/battery,
per-disk and per-interface labeled series, GPU omitted when unavailable
(-1 on Apple Silicon), and Prometheus label-value escaping.

QueryServer.route now returns (body, contentType): every route stays JSON
except the new endpoint, which serves text/plain; version=0.0.4 (the
de-facto scrape content type). A missing or undecodable snapshot yields a
comment line rather than error JSON, so scrapers tolerate an empty target.

Tests: 4 formatter cases + 2 route cases; existing route tests migrated to
the (body, contentType) tuple.
Pure (ts, free-bytes) series → projection. Honest by construction: no date
under a week of history, none when free space is flat or growing, and none
past a 5-year horizon. Robust to single-sample cliffs (a temp file briefly
eating space) via a Theil–Sen median slope rather than least-squares, which
one outlier can drag arbitrarily far.

Tracer-bullet core for roadmap A.3; the MCP tool + Home/History tile wiring
follow as their own slices.
Pure stats: percentile/median primitives, a sustained-CPU-exceedance rule
(recent median clears baseline p95 + a minimum effect size so near-idle
noise can't trip it), and battery-drain regression (OLS %/hr per session,
flagged when a recent discharge runs >=factor x the baseline median). The
Maintenance pass writing burrow.findings + the Home card are integration.
Shared kernel for burrow_diff(since:) and the new-LaunchAgent watcher:
sorted, de-duplicated added/removed over two identifier lists (apps, login
items, LaunchAgents, ports, top-process membership).
Parses dirty/ahead/upstream/detached into a needs-attention verdict so the
purge checklist can badge repos with uncommitted or unpushed work.
Conservative: untracked files count, and a never-pushed branch is unpushed.
Running git (timeout + concurrency cap) and badging the UI are integration.
System facts -> ranked ok/warn/fail checks (engine, Full Disk Access, memory
pressure, disk headroom, recent errors). Level.rawValue orders worst-first.
Gathering the facts + the Help-menu sheet are integration.
Pure value-folding step(): fire on crossing high, re-arm only after recovery
below low, and a cooldown caps fires across episodes — one alert per episode,
not per sample. Shared by notifications (D.12), SSE (B.6), the report (A.4).
Rule thresholds, Sampler wiring, and UNUserNotification delivery are integration.
Pure facts -> markdown (cleanup freed, disk forecast phrased in weeks, top
energy, battery delta, new-startup-items security note). Same artifact for
the Home card and a burrow_report tool. Reuses Fmt.bytes; only names a fill
date when DiskForecast was willing to. DB gathering is integration.
Pure: latest-backup + local-snapshot date tokens from tmutil, token->UTC
Date. Backs 'Last backup: N days ago' in Clean/Purge sheets + the purgeable
note. SMART/IOKit disk-health half of D.14 lands separately; tmutil
invocation is integration.
Pure: recorded removals + a filesystem probe -> per-item restorable verdict.
Only Trash-based removals are recoverable; a re-created origin is a collision,
not an overwrite. Finder put-back + UI are integration.
…(A.3)

Adds MetricsStore.diskFreeSeries (total-used per snapshot for a mount, or the
largest volume) and a burrow_disk_forecast tool that runs DiskForecast over
it — agents get an honest 'fills in ~N days' (or null) without the GUI. Integration test seeds declining/flat disks through the real dispatch path.
collected_at needs fractional seconds to match mo's ISO8601 decoder; without
them the seeded rows dropped and the forecast saw an empty series. Also
surface the tool JSON in the unwrap message for faster diagnosis.
Diffs top-process membership (via InventoryDiff) and free-space delta between
the snapshot nearest `since` and the latest. v1 scope: app/login-item/port
inventories aren't persisted across time yet, so they're out — noted in the
reply. Integration-tested through the dispatch path.
Composes WeeklyReport from real snapshot data (disk forecast + top energy by
CPU-seconds); cleanup/battery/login sections are nil='unavailable' until
their sources land, never faked. spaceReclaimedBytes is now optional so
'unknown' is distinct from a genuine zero. Returns markdown as the tool text.
Composes Doctor.report from on-hand signals: mo presence (MoEngine.availability),
Full Disk Access (Privacy), and memory-pressure + disk headroom from the latest
snapshot, plus decode-drift count. Returns ok/warn/fail checks as JSON.
Integration-tested on the snapshot-derived (deterministic) verdicts.
@caezium caezium changed the title feat(query): Prometheus exposition at /metrics?format=prometheus (B7) roadmap A–I: logic cores + agent-surface tools (TDD, CI-green) Jun 15, 2026
caezium added 8 commits June 15, 2026 11:05
StartupWatcher.newlyAppeared diffs a previous inventory against the current
StartupInventory scan (via InventoryDiff) → the 'new LaunchAgent/login item
appeared' items. Baseline persistence + notification firing are integration.
One history→Input gather behind both burrow_report and the Home card, so the
two can't drift. burrow_report now calls it. Tested by seeding snapshots.
New Home segment rendering the weekly digest from ReportComposer. zh-Hans/Hant
'Report' added. Compile-verified; needs hand-test against a real history DB.
Renders Doctor.report (same verdicts as burrow_doctor) from the latest
snapshot + live FDA/engine checks, ok/warn/fail with colour. zh added.
Compile-verified; needs hand-test.
Lists each ecosystem's (DevHygiene.catalog) existing cache roots with size
(off-main recursive scan), biggest first, + reveal-in-Finder. Stage-2
per-item delete is a follow-up. zh added. Compile-verified; needs hand-test.
NOTE: Home now has 6 segments — nav may want restructuring (your call).
Per-pid fd scan → socket fdinfo → TCP-listen + bound-UDP ports, feeding
PortInspector (sort + kill-safety). No lsof, no elevation for own processes.
NOTE: native C-interop, runtime-unverifiable in CI; hand-test vs lsof -i -P.
Lists listening sockets + owning process via PortEnumerator/PortInspector.
Read-only. Shape-tested through dispatch (specific ports are machine-dependent).
New .ports Tool (top-nav pill) rendering PortEnumerator.listening() with a
confirm-gated Quit (SIGTERM) on the user's own processes only. zh added.
Compile-verified; needs hand-test (native enum + real kill).
@caezium caezium changed the title roadmap A–I: logic cores + agent-surface tools (TDD, CI-green) roadmap A–I: logic cores + agent tools + initial GUI (TDD, CI-green) Jun 15, 2026
caezium added 12 commits June 15, 2026 11:49
StartupWatcher.check folds a scanLive() against a persisted baseline
(Store.startupBaselineJSON), first-run-suppressing. BurrowNotifier.checkStartupWatcher
runs it on the existing reminder timer (default-on Store.watchStartupItems,
independent of smart reminders) and posts once per appearance. The fold is
unit-tested; firing is compile-verified (hand-test the notification).
MetricsStore.processCPUSamples (name → CPU%/snapshot) feeds AnomalyScan.cpuFindings,
which applies the Anomaly rule per process (recent vs baseline window) and
ranks regressions. Pure + tested; the Maintenance pass writing burrow.findings
+ the Home Changes card are integration.
ThresholdAlerts folds CPU/memory through AlertEngine (hysteresis + cooldown,
one fire per episode) given prior per-rule states. Pure + tested. Per-sample
evaluation (Sampler), state persistence, and posting are integration;
disk-low stays in ReminderRules.
ThresholdMonitor evaluates ThresholdAlerts on each LiveFeed.applySnapshot,
holding per-rule AlertState and posting via BurrowNotifier.thresholdAlert.
Off-by-default (Store.thresholdAlertsEnabled), inert under XCTest. Completes
D.12's threshold half (watcher half already fires).
call() now dispatches then records a burrow.agent_audit row for the mutating
tools (clean/optimize/uninstall/purge/installer): tool, dry-run flag, duration,
ok, summary, args. Read tools aren't audited. db.insert is serialized + busy-
timed, safe alongside the GUI writer. Tested through dispatch.
…eport (A.2)

ThresholdMonitor posts via Task { @mainactor } (thresholdAlert is main-actor
isolated). AnomalyScan.scan pulls recent-24h vs prior-14d per-process CPU;
ReportComposer folds findings into the report's new Changes section. Tested.
Confirm-gated Move-to-Trash on each Dev Hygiene cache row, then re-scan.
Compile-verified; needs hand-test (real Trash move).
New .tuneup Tool gathering safe (cache-clear) + review (startup) recommendations
via TuneUp's tested selection logic, with a one-click Run-safe-set (trashes
caches). Shared DevHygiene.directorySize. zh added. Compile-verified; hand-test.
findRangeSampled buckets by (ts-since)/stride; 60s-apart rows collapsed to one
sample, below the 5-sample minimum. Spread baseline 2h / recent 10min apart so
8 survive each window. Feature was correct; the fixture clustered.
RestorePlan.parseLog turns Mole's deletion log into restore candidates;
RestoreView builds the plan and offers a ~/.Trash put-back per Trash-based
item (caches shown locked, permanent). New .restore Tool. parseLog tested;
the Trash move is compile-verified, needs hand-test.
…(C.11)

FolderGrowth.diff ranks per-folder size growth between two scans. GitSweep
walks up to the containing repo and runs git status (bounded) → GitRepoStatus
verdict. Diff + walk-up tested; the mo-analyze scan, git subprocess, and purge-
checklist badges are integration.
BackupStatus runs tmutil latestbackup → days-ago via the tested TimeMachine
parser; Doctor gains a Backups check (ok / stale-warn / no-backup-warn),
surfaced in the Doctor pane + burrow_doctor. SMART/IOKit half is separate.
@caezium caezium changed the title roadmap A–I: logic cores + agent tools + initial GUI (TDD, CI-green) roadmap A–I: logic + agent tools + GUI + firing (TDD, CI-green) Jun 16, 2026
caezium added 9 commits June 15, 2026 21:05
Additive chartOverlay on the line chart maps a drag to a [t0,t1] window and
opens SpikeSheet, which ranks processes via the tested MetricsStore.processWindow.
Overlay-only — doesn't touch the chart's marks/axes. zh added. Compile-verified.
DiskCard takes an optional db and annotates the Status disk tile with the
DiskForecast over 30d of free-space history (silent unless a date is honest).
Exposed StatusModel.db. Additive line; zh added.
Maintenance persists a time-indexed login-item/LaunchAgent inventory
(burrow.startup_inv); burrow_diff now reports login_items_added/removed since
<time> via InventoryDiff. Tested through dispatch. Apps/ports still untracked.
DiskHealth reads the NVMe SMART verdict from system_profiler (no private API);
Doctor gains a Disk-health check (verified=ok / failing=fail / unreadable=ok).
Surfaced in the Doctor pane + burrow_doctor. wear%/temp still need IOKit.
runBrewStreaming pipes brew output line-by-line; upgrade()/upgradeAll() feed
each line through BrewProgress.phrase and publish brewPhrase, shown live in the
row instead of a bare spinner. NOTE (hand-test): exercises a real brew upgrade.
MoItemRow runs GitSweep (repo-walk + git status, off-main) per row and shows a
warning badge when the candidate's folder has uncommitted/unpushed work.
Read-only — never changes the selection. zh added.
EventHub holds open event-stream connections and fans out SSE frames (dead ones
self-evict). QueryServer serves GET /events?token=… (Store.queryAuthToken;
loopback-only server), exempting the stream from the idle-cancel timeout.
Threshold + new-startup-item alerts broadcast onto it. Token parse unit-tested.
'SSE' resolved as an out-of-scope symbol in Notifications.swift (imports AppKit);
EventHub/QueryServer (no AppKit) were fine. Renamed the type to disambiguate.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant